How to Bulk Load Data into MySQL with Python
- By : Mydatahack
- Category : Data Engineering, Data Ingestion
- Tags: Bulk Load, MySQL, pymysql, Python
As in any other relational databases, the fastest way to load data into MySQL is to upload a flat file into a table. To do this, MySQL has a LOAD DATA INFILE function. We can use Python to execute this command. To connect to MySQL and execute SQL statements with Python, we will use the pymysql module.
As in the previous post with PostgresSQL, we will first export a table into a csv file and then look at how we can load a csv file to a table.
First of all, let’s import all the modules required for this exercise. The code here works for both Python 2.7 and 3.
1 2 3 | import pymysql import pandas as pd import sys |
Exporting MySQL table into a CSV file
First of all, let’s export a table into CSV file. We are using the Pandas module to convert SQL results into Pandas data frame and write it to a csv file.
The function takes a select query, output file path and connection details. To export an entire table, you can use select * on the target table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | def mysql_to_csv(sql, file_path, host, user, password): ''' The function creates a csv file from the result of SQL in MySQL database. ''' try: con = pymysql.connect(host=host, user=user, password=password) print('Connected to DB: {}'.format(host)) # Read table with pandas and write to csv df = pd.read_sql(sql, con) df.to_csv(file_path, encoding='utf-8', header = True,\ doublequote = True, sep=',', index=False) print('File, {}, has been created successfully'.format(file_path)) con.close() except Exception as e: print('Error: {}'.format(str(e))) sys.exit(1) # Execution Example sql = 'Select * From world.city' file_path = '/tmp/city.csv' host = 'host url' user = 'username' password = 'password' mysql_to_csv(sql, file_path, host, user, password) |
Importing CSV file to MySQL table
Make sure to have autocommit turned on. To upload files, you need to set the local_infile parameter to 1. The function takes a load data infile statement and connection details.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | def csv_to_mysql(load_sql, host, user, password): ''' This function load a csv file to MySQL table according to the load_sql statement. ''' try: con = pymysql.connect(host=host, user=user, password=password, autocommit=True, local_infile=1) print('Connected to DB: {}'.format(host)) # Create cursor and execute Load SQL cursor = con.cursor() cursor.execute(load_sql) print('Succuessfully loaded the table from csv.') con.close() except Exception as e: print('Error: {}'.format(str(e))) sys.exit(1) # Execution Example load_sql = "LOAD DATA LOCAL INFILE '/tmp/city.csv' INTO TABLE usermanaged.city\ FIELDS TERMINATED BY ',' ENCLOSED BY '"' IGNORE 1 LINES;" host = 'host url' user = 'username' password = 'password' csv_to_mysql(load_sql, host, user, password) |