Data Engineering

How to Ingest Data From MongoDB with Node.js

Data ingestion from a NoSQL database often involves denormalising its schema-less data before loading it into a relational database. In this post, we will grab the restaurant data from MongoDB, denormalise the collection into a parent and a child table, and load them into Postgres. This exercise …
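The parent/child split mentioned in the teaser above can be sketched roughly as follows. This is only an illustration in Python (the post itself uses Node.js), and the field names `_id`, `name`, `cuisine`, `grades`, `grade` and `score` are assumptions about the document shape, not taken from the post.

```python
from typing import Any

def split_restaurants(docs: list[dict[str, Any]]) -> tuple[list[dict], list[dict]]:
    """Split nested documents into parent rows and child rows.

    Assumed (hypothetical) document shape:
    {"_id": ..., "name": ..., "cuisine": ..., "grades": [{"grade": ..., "score": ...}]}
    """
    parents: list[dict] = []
    children: list[dict] = []
    for doc in docs:
        restaurant_id = str(doc["_id"])
        # One parent row per document, holding the top-level scalar fields.
        parents.append({
            "restaurant_id": restaurant_id,
            "name": doc.get("name"),
            "cuisine": doc.get("cuisine"),
        })
        # One child row per element of the nested array, keyed back to the parent.
        for grade in doc.get("grades", []):
            children.append({
                "restaurant_id": restaurant_id,
                "grade": grade.get("grade"),
                "score": grade.get("score"),
            })
    return parents, children

# Made-up sample documents, for illustration only.
sample = [
    {"_id": 1, "name": "Cafe A", "cuisine": "Bakery",
     "grades": [{"grade": "A", "score": 10}, {"grade": "B", "score": 14}]},
    {"_id": 2, "name": "Diner B", "grades": [{"grade": "A", "score": 7}]},
]
parents, children = split_restaurants(sample)
```

Each list could then be loaded into its own Postgres table, with `restaurant_id` acting as the join key between them.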

Data Engineering

A Comprehensive Guide for Reading and Writing JSON with Python

The json module enables you to read a JSON object from a file or HTTP response and write it to a file. It is worthwhile to spend a little bit of time to understand a few key functions that are often …
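As a quick illustration of the kind of functions the teaser refers to, here is a minimal sketch using the standard library `json` module. The record contents and the file name `example_record.json` are made up for the example.

```python
import json
import tempfile
from pathlib import Path

# Made-up record, for illustration only.
record = {"name": "example", "tags": ["a", "b"], "active": True}

# dumps/loads work on strings, e.g. the body of an HTTP response.
text = json.dumps(record, indent=2)
parsed = json.loads(text)

# dump/load work on file objects.
path = Path(tempfile.gettempdir()) / "example_record.json"
with path.open("w", encoding="utf-8") as fh:
    json.dump(record, fh)
with path.open(encoding="utf-8") as fh:
    loaded = json.load(fh)
```

The pairing is easy to remember: the functions ending in `s` take or return a string, while the ones without it take a file object.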

Data Engineering

How to Ingest FullStory Data Export Extracts with Python

If you are interested in user tracking on your website, FullStory is a pretty good option. You can sign up for the free version here. The free version includes heaps of cool features. When you first sign up, you can try all the Pro Edition features for 2 weeks, too. …

DBA

Index JSON In Postgres

One way to maximise query efficiency in a relational database is to index the columns that are often used in joins or conditions. The awesome thing about querying JSON in Postgres is that you can index it to further optimise query performance. In the previous post, we had a look at the …

DBA

How Postgres JSON Query Handles Missing Key

When we transform JSON into a structured format with a programming language during JSON data ingestion, we have to handle missing keys. This is because JSON is schema-less and doesn’t always have the same keys in all records, as opposed to relational database tables. In Python, the missing key …
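The missing-key problem described above can be illustrated with a small Python sketch. It is not taken from the post, and the records and the `email` key are hypothetical.

```python
# Two made-up records; the second one lacks the "email" key.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2},
]

def to_row(record: dict) -> tuple:
    # record["email"] would raise KeyError for the second record,
    # so .get() is used to fall back to None (loaded as NULL in a table).
    return (record["id"], record.get("email"))

rows = [to_row(r) for r in records]
```

Using `.get()` keeps every row the same width, which is what a relational table expects.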

Data Engineering

New JSON Data Ingestion Strategy by Using the Power of Postgres

Postgres had only limited JSON support before version 9.2 added native JSON support. The release of version 9.3 really took the JSON feature to the next level with additional constructor and extractor methods. The capability of querying and transforming the JSON data type …

ETL

Informatica Cloud: How To Use Hierarchy Parser To Transform JSON File

Hierarchy Parser in the Informatica Cloud mapping designer can transform JSON or XML files into a structured table (see the instructions here). In this post, we will transform the JSON file obtained from the Google Geocoding API. The Geocoding API turns addresses (1600 Amphitheatre Parkway, Mountain View, CA) into geographic coordinates (latitude: 37.422, longitude: -122.085, etc.) …

DataStage

DataStage: Hierarchical Data Stage Transforming Google Analytics Data

By using the Google Analytics Core Reporting API, we can export reports from Google Analytics. To export reports, you need to specify dimensions and metrics. To further explore GA reports, you can use Query Explorer. In this example, we exported the data using the following dimensions and metrics around geographical information …

DataStage

DataStage: Hierarchical Data Stage Transforming JSON Data

Hierarchical Data Stage can parse, compose and transform hierarchical data such as JSON and XML. In this example, we are using the JSON file obtained from the Google Geocoding API. The Geocoding API turns addresses (1600 Amphitheatre Parkway, Mountain View, CA) into geographic coordinates (latitude: 37.422, longitude: -122.085, etc.). The outcome …