Data Engineering

Salesforce API with Simple Salesforce For Python

Python has a plethora of modules that make programming fun and easy. If you need to use the Salesforce API with Python, the simple-salesforce module is your best friend. The module takes care of boring stuff like authentication and lets you use different APIs. You can check the documentation and source …

ETL

How To Configure ODBC Connector For Informatica Cloud Secure Agent

Informatica does not have a dedicated Postgres database connector. Therefore, we need to use the ODBC connector. In this post, I will discuss how to configure Postgres ODBC on both Linux and Windows servers for the Informatica Cloud ODBC connector. Linux Server (Red Hat) There are a few instructions, but …

Data Engineering

How To Get Data From Google Analytics With Java

In the previous post, we discussed a strategy to ingest Google Analytics data and presented a Python code example (How To Get Data From Google Analytics With Python). Generally speaking, I prefer using Python to ingest API data; however, every once in a while, we are asked to write …

Data Engineering

How To Get Data From Google Analytics With Python

When you ingest data from Google Analytics, you need to create a series of reports based on GA dimensions and metrics. The granularity is determined by the dimensions you add to the report. The most important thing is to understand the business requirements before you start ingesting data. Good requirement analysis will enable …

Data Engineering

How To Get Facebook Data With Python

By using the Facebook Graph API, we can get the feed of posts and links published by a specific page, or by others on that page, as well as likes and comments (feed API). I have written a Python script to scrape the feed info in JSON format and turn …

Data Engineering

How To Get Twitter Data With Python

In this post, we will discuss how to use Python to grab publicly available Twitter post data (from any user you specify) and convert it into a tabular format so that we can analyse the data in Excel or insert it into a relational database. Python has a package that …
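As a rough illustration of the tabular conversion described above, the sketch below flattens a list of post records into CSV text using only the standard library. The sample records, the field names (`id`, `created_at`, `text`, `user.screen_name`), and the helper names `posts_to_rows` and `posts_to_csv` are assumptions made for illustration; the package and fields used in the original post are not shown in this excerpt.

```python
import csv
import io

# Hypothetical sample records; real API responses carry many more fields.
SAMPLE_POSTS = [
    {"id": 1, "created_at": "2020-01-01", "text": "Hello, world",
     "user": {"screen_name": "alice"}},
    {"id": 2, "created_at": "2020-01-02", "text": "Second post",
     "user": {"screen_name": "alice"}},
]

# Column order of the resulting table (an assumption for this sketch).
COLUMNS = ["id", "created_at", "screen_name", "text"]


def posts_to_rows(posts):
    """Flatten nested post dictionaries into flat rows, one per post."""
    rows = []
    for post in posts:
        rows.append({
            "id": post.get("id"),
            "created_at": post.get("created_at"),
            # Pull the nested user name up to the top level.
            "screen_name": post.get("user", {}).get("screen_name"),
            "text": post.get("text"),
        })
    return rows


def posts_to_csv(posts):
    """Return the posts as CSV text that Excel or a database loader can read."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(posts_to_rows(posts))
    return buffer.getvalue()


if __name__ == "__main__":
    print(posts_to_csv(SAMPLE_POSTS))
```

Keeping the flattening step separate from the CSV step means the same rows could instead be passed to a database insert.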

DataStage

DataStage: Loop With Transformer

The Transformer stage has built-in looping functionality: you can use Stage Variables and Loop Conditions to construct looping logic. In this post, we will present three examples: ranking, aggregation, and vertical pivot. Before going into the examples, here are the useful variables for loop construction. @ITERATION – System …

DataStage

DataStage: Remove Leading & Trailing Lines in Flat File

When a flat file has leading and trailing lines that are not part of the table, we can use the filter in the flat file stage to remove them. As an example, the file below has leading and trailing lines. We want to remove them with the flat file stage. Output Steps …
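To make the goal concrete, here is a minimal Python sketch of the same idea: drop a fixed number of lines from the top and bottom of a file's contents. It is not the DataStage filter itself, only an illustration of the expected result. The sample lines and the helper name `strip_outer_lines` are hypothetical.

```python
def strip_outer_lines(lines, leading=1, trailing=1):
    """Drop `leading` lines from the top and `trailing` lines from the bottom."""
    end = len(lines) - trailing
    # Guard against files shorter than the number of lines to remove.
    return lines[leading:end] if end > leading else []


# Hypothetical flat file content with one leading and one trailing line.
RAW_LINES = [
    "Report generated 2020-01-01",
    "id,name",
    "1,Alice",
    "2,Bob",
    "End of report",
]

if __name__ == "__main__":
    for line in strip_outer_lines(RAW_LINES):
        print(line)
```

Computing the end index explicitly avoids the pitfall of slicing with `-0`, which would return an empty list when no trailing lines need to be removed.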

DataStage

DataStage: Join vs Lookup vs Merge

DataStage has three processing stages that can join tables based on the values of key columns: Lookup, Join and Merge. In this post, we discuss when to choose each stage, the differences between them, and development references for using them. Use the Lookup stage when: Having a …

ETL

Informatica Cloud: How To Use Hierarchy Parser To Transform JSON File

Hierarchy Parser in the Informatica Cloud mapping designer can transform JSON or XML files into a structured table (see the instructions here). In this post, we will transform the JSON file obtained from the Google Geocoding API. The Geocoding API turns addresses (1600 Amphitheatre Parkway Mountain View CA) into geographic coordinates (latitude: 37.422, longitude: -122.085 etc) …
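For readers who want to see what "JSON into a structured table" means in practice, the sketch below flattens a trimmed, Geocoding-style response into flat rows in Python. This is only an illustration of the target shape, not how Hierarchy Parser works internally (that is configured in the mapping designer). The sample response is heavily shortened and the helper name `flatten_geocoding` is hypothetical.

```python
import json

# Trimmed sample shaped like a Geocoding API response (most fields omitted).
SAMPLE_RESPONSE = json.loads("""
{
  "results": [
    {
      "formatted_address": "1600 Amphitheatre Parkway, Mountain View, CA",
      "geometry": {"location": {"lat": 37.422, "lng": -122.085}}
    }
  ],
  "status": "OK"
}
""")


def flatten_geocoding(response):
    """Turn the nested response into one flat row per result."""
    rows = []
    for result in response.get("results", []):
        location = result.get("geometry", {}).get("location", {})
        rows.append({
            "formatted_address": result.get("formatted_address"),
            "latitude": location.get("lat"),
            "longitude": location.get("lng"),
        })
    return rows


if __name__ == "__main__":
    for row in flatten_geocoding(SAMPLE_RESPONSE):
        print(row)
```

Each nested path (for example `geometry.location.lat`) becomes one column, which is the same kind of mapping the parser output needs to express.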