Data Engineering

How to Ingest Data From MongoDB with Node.js

Data ingestion from a NoSQL database often involves denormalising its schema-less data before loading it into a relational database. In this post, we will grab the restaurant data from MongoDB, denormalise the collection into a parent and a child table, and load them into Postgres. This exercise …

Data Engineering

REST API Data Ingestion with Node.js

The classic REST API data ingestion pattern is (1) to make an API call to the endpoint, (2) get the data, (3) transform it into a structured table and (4) load it into a database. Let’s have a go at it with Node.js. We are using JSONPlaceholder, which offers a …
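The four steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than the post's Node.js code, and it loads into an in-memory SQLite database as a stand-in for a real target database. The function names, the `posts` table layout and the `/posts` endpoint with its `id`, `userId` and `title` fields are assumptions made for the example.

```python
import json
import sqlite3
import urllib.request


def fetch_json(url):
    """Steps 1 and 2: call the endpoint and parse the JSON response body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def to_rows(posts):
    """Step 3: turn a list of JSON objects into flat tuples (table rows)."""
    return [(p["id"], p["userId"], p["title"]) for p in posts]


def load_rows(conn, rows):
    """Step 4: create the (hypothetical) target table and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts "
        "(id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
    )
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)
    conn.commit()


def ingest(url, conn):
    """Run the whole pattern end to end."""
    load_rows(conn, to_rows(fetch_json(url)))


# Example usage (needs network access, so it is left commented out):
# conn = sqlite3.connect(":memory:")
# ingest("https://jsonplaceholder.typicode.com/posts", conn)
```

Keeping the fetch, transform and load steps in separate functions makes it easy to try the transform and load parts with sample data before pointing the code at a live endpoint.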

Data Engineering

Converting JSON to CSV and Loading it to Postgres with Node.js

To convert JSON to CSV, I love using json2csv. It really does all the hard work of working out the JSON structure and converting it to a flat file. For nested JSON elements, you can simply specify them using dot notation (like transaction.itemId). When it contains an array element, …
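To show what the dot notation means in practice, here is a small sketch in plain Python. It is not json2csv and does not reproduce its API. The helper names `get_by_path` and `to_csv` and the sample records are made up for illustration, and array elements are not handled.

```python
import csv
import io


def get_by_path(record, dotted_path):
    """Follow a dotted path such as 'transaction.itemId' into nested dicts."""
    value = record
    for key in dotted_path.split("."):
        if not isinstance(value, dict):
            return None  # path goes deeper than the data does
        value = value.get(key)
    return value


def to_csv(records, fields):
    """Write one CSV row per record, one column per dotted field path."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)  # header row uses the dotted paths as column names
    for record in records:
        writer.writerow([get_by_path(record, field) for field in fields])
    return buffer.getvalue()


# Hypothetical sample data with a nested 'transaction' element.
records = [
    {"id": 1, "transaction": {"itemId": "A-100", "amount": 9.5}},
    {"id": 2, "transaction": {"itemId": "B-200", "amount": 3.0}},
]

csv_text = to_csv(records, ["id", "transaction.itemId", "transaction.amount"])
print(csv_text)
```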

Data Engineering

Converting CSV to JSON and Loading it to Postgres with Node.js

To convert CSV to JSON, Node has an awesome module, csvtojson. It takes a CSV file and converts it to JSON asynchronously. Once we convert CSV to JSON, let’s load it to a Postgres table with the jsonb data type. Postgres supports JSON data and you can query it (see the …

Data Engineering

Bulk Loading Postgres with Node.js

The fastest way to bulk load data into Postgres is to use COPY, a SQL command that loads data into a table from a flat file. To connect to Postgres with Node.js, we can use the node-postgres module (pg). To use the copy function, we can use the …

Data Engineering

A Comprehensive Guide for Reading and Writing JSON with Python

The json module enables you to read a JSON object from a file or HTTP response and write it to a file. It is worthwhile to spend a little bit of time to understand a few key functions that are often …
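As a quick taste, here is a minimal sketch of reading and writing JSON with the standard library's json module. The choice of `dumps`, `loads`, `dump` and `load` as the functions worth knowing is my assumption, since the excerpt is cut off before naming them, and the sample record and file name are made up.

```python
import json
import os
import tempfile

# Hypothetical sample record.
record = {"id": 1, "name": "Alice", "tags": ["a", "b"]}

# dumps / loads convert between Python objects and JSON strings.
text = json.dumps(record, indent=2)
parsed = json.loads(text)

# dump / load do the same against file objects.
path = os.path.join(tempfile.mkdtemp(), "record.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(record, f)
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(parsed == record, loaded == record)
```

The functions ending in "s" work on strings, while the ones without it work on file objects. The same `json.loads` call can parse the body of an HTTP response once it has been decoded to text.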

Data Engineering

Testing and Prototyping for REST API Data Ingestion with JSONPlaceholder

JSONPlaceholder is a web service that offers REST API endpoints for example JSON data. If you need to experiment with a REST API quickly, this is a really great tool. It supports all the HTTP verbs. As for testing and prototyping REST API data ingestion, you can …

Data Engineering

How to Ingest FullStory Data Export Extracts with Python

If you are interested in user tracking on your website, FullStory is a pretty good option. You can sign up for the free version here. The free version includes heaps of cool features. When you first sign up, you can try all the Pro Edition features for 2 weeks, too. …

Data Engineering

How to Bulk Load Data with JDBC and Python

Let’s bulk load data using JDBC and Python. The aim of this post is pretty much the same as the previous one with ODBC. We are going to export a table into a CSV file and import the exported file into a table by using JDBC drivers and …

Data Engineering

How to Bulk Load Data with ODBC and Python

I think the Hello World of data engineering is to make a one-to-one copy of a table from the source to the target database by bulk-loading data. The fastest way to achieve this is to export a table into a CSV file from the source database and import the CSV file into a …