DataStage: How To Add Redshift JDBC Driver

In order to use a JDBC driver with DataStage, you need to download the driver and set up the configuration files (see here). In this post, we will discuss how to add the Redshift JDBC driver to a DataStage server and configure it.

Steps

(1) Download the Redshift JDBC driver from here. In my experience, the latest version (4.2) was not compatible; I had to use the older version (4.1).

(2) It is recommended to download the driver to the /tmp folder first and then move it to the 3rd-party-drivers folder. In this example, the installation path is /opt/IBM/InformationServer/Server/DSEngine/. You need to check your own installation path.

mv /tmp/RedshiftJDBC42-1.2.1.1001.jar /opt/IBM/InformationServer/Server/DSEngine/3rd-party-drivers/

(3) Once the driver is moved to the 3rd-party-drivers folder, you need to edit the isjdbc.config file in the DSEngine folder. The config file requires two parameters, CLASSPATH and CLASS_NAMES. Both can be obtained from the Redshift JDBC driver download page here.

CLASSPATH: Add the path to the JDBC jar file, e.g. '/opt/IBM/InformationServer/Server/DSEngine/3rd-party-drivers/RedshiftJDBC42-1.2.1.1001.jar'

CLASS_NAMES: Add the driver class name given in the documentation, e.g. 'com.amazon.redshift.jdbc41.Driver' for the 4.1 driver (the class name must match the driver version you installed)
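Putting the two parameters together, the isjdbc.config entries might look like the following. This is a sketch using the jar path from the example above; verify the exact key names against the isjdbc.config shipped with your DataStage version, and match the class name to your driver version (e.g. jdbc41 for the 4.1 driver):

```
CLASSPATH=/opt/IBM/InformationServer/Server/DSEngine/3rd-party-drivers/RedshiftJDBC42-1.2.1.1001.jar
CLASS_NAMES=com.amazon.redshift.jdbc42.Driver
```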

Notes

To insert or update records, make sure to tick the auto-commit option in the JDBC stage. If this option is not ticked, the driver locks the table and the job hangs.

Inserting or updating records in Redshift through the JDBC (or even ODBC) driver with DataStage is extremely slow. To achieve a fast load, you need to build jobs that follow Redshift best practices (Loading Data into Redshift), which is not an easy task. DataStage is not very compatible with the AWS ecosystem. On the other hand, Informatica provides an excellent Redshift connector that works like a charm. Check out the blog post on Informatica and Redshift here.
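For reference, the fast-load pattern Redshift recommends is to stage the data in S3 and run a single bulk COPY, rather than inserting row by row over JDBC. A minimal sketch of the COPY command, where the bucket, table, and IAM role names are hypothetical placeholders:

```sql
-- Bulk-load a staged CSV from S3 into a Redshift table in one operation.
-- Bucket, table, and IAM role below are illustrative, not real names.
COPY public.orders
FROM 's3://my-staging-bucket/orders.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
FORMAT AS CSV
IGNOREHEADER 1;
```

Replicating this staging-then-COPY flow inside DataStage jobs is what makes fast loading non-trivial there.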
