After setting up the VPC, Internet Gateway, subnets, and route tables (see here), we need to set up Network Access Control Lists (NACLs) for the subnets and Security Groups for EC2 and RDS. This is a step in How To Create Your Personal Data Science Computing Environment In AWS. NACLs are at …
Let’s launch a Postgres RDS instance in AWS. You get 750 hours of Amazon RDS Single-AZ db.t2.micro instance usage as part of the 12-month free tier. You also have several other engine options (MySQL, MariaDB, or SQL Server). I am choosing Postgres here. Amazon RDS makes it easy to set up …
Let’s launch a Linux EC2 instance from an AMI. In this example, I am launching CentOS 7 Linux from an Amazon Machine Image. You can choose whichever OS you want for the use case we’ve been working on here. As a reminder, this is the plan. We are going to launch EC2 …
Let’s create an Elastic Block Store (EBS) volume and attach it to Linux. An EC2 instance comes with storage, but this storage persists only as long as the instance does. If you terminate the instance and start a new one, you will lose the data. If you keep the data in …
To create a VPC, you first need to specify an IP range. Each subnet takes an IP range within the IP range of the VPC. Before you decide on the IP range, it is good practice to plan first. You need to use CIDR notation for the IP range. There is a cool tool to …
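As a rough illustration of planning CIDR ranges, here is a minimal sketch using Python's standard `ipaddress` module. The `10.0.0.0/16` VPC range and the `/24` subnet size are assumptions picked for the example, not values from the post.

```python
import ipaddress

# Hypothetical VPC range, chosen only for illustration.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the VPC range into /24 blocks and take the first two as example subnets.
subnets = list(vpc.subnets(new_prefix=24))[:2]

for subnet in subnets:
    # subnet_of() confirms the subnet range sits inside the VPC range.
    print(subnet, subnet.num_addresses, subnet.subnet_of(vpc))
```

Each printed line shows a candidate subnet in CIDR notation, how many addresses it spans, and whether it falls within the VPC range.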
After you create an AWS account, the best practice is to lock away the root account credentials (which you used to create the account) and never use them for daily tasks. Instead of using the root account, you should create an admin account with full access. To create …
Once you have created an awesome data science application, it is time to deploy it. There are many ways to productionise it. The focus here is deploying Spark applications by using the AWS big data infrastructure. From my experience with the AWS stack and Spark development, I will discuss …