Salesforce API with Simple Salesforce For Python
- By : Mydatahack
- Category : Data Engineering, Tools and Utilities
- Tags: Python, Salesforce API
Python has a plethora of modules that make programming fun and easy. If you need to use the Salesforce API with Python, the simple-salesforce module is your best friend. The module takes care of boring stuff like authentication and lets you use the different APIs. You can check the documentation and source code here. If you want to know what methods are available, take a look at the api.py file.
How it works
All you need to do is create a Salesforce object with a username, password, security token and the API version you are using. If the instance is not production, you also need to add sandbox=True. With this object you can call the different APIs (REST API, Apex REST, Bulk API and SOQL queries) through specialised methods that take a resource URL or a SOQL query as their argument. In summary, calling the Salesforce API only requires 2 steps.
API steps
(1) Create Salesforce Object
sf = Salesforce(username=<user>, password=<pw>,
                security_token=<token>, version='41.0', sandbox=True)
(2) Make API calls with various methods
# (1) Make a REST API call to the limits endpoint
result = sf.restful('limits/', params=None)
print(result)
# (2) Make a SOQL query to get the record count of the Lead object
result = sf.query('Select count() From Lead')
print('No of records: {}'.format(result['totalSize']))
# (3) Make an HTTP request to an Apex REST endpoint
result = sf.apexecute('Account/accountId')
print(result)
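One more thing worth knowing: SOQL results come back in batches (2,000 records per call by default), so on larger objects you would otherwise have to follow nextRecordsUrl yourself. The module's query_all method does that paging for you. A minimal sketch, assuming the sf object created above:

# query_all keeps following nextRecordsUrl until every record is returned
result = sf.query_all('Select Id, Name From Lead')
print('No of records: {}'.format(result['totalSize']))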
Simple Salesforce Application
Now that we understand how awesome Simple Salesforce is, here is a small application that reads the metadata of a Salesforce object and generates a Create Table statement for a Postgres database. It takes an object name as an argument (not case sensitive). You can call the script as ‘python script.py account’ to generate a text file containing a create table statement for the Account object and a CSV file containing its metadata.
Here comes the code.
Enjoy!
Code
from simple_salesforce import Salesforce
import json
import sys

object_name = sys.argv[1]
print("Target Object: {}".format(object_name))

# (1) Set data type mapping from Salesforce field types to Postgres column types
mapping = {
    'id': 'varchar',
    'boolean': 'bool',
    'reference': 'varchar',
    'string': 'varchar',
    'picklist': 'varchar',
    'textarea': 'varchar',
    'double': 'decimal',
    'phone': 'varchar',
    'url': 'varchar',
    'currency': 'double',  # note: Postgres has no 'double' type; consider 'decimal' or 'double precision'
    'int': 'int',
    'datetime': 'timestamp',
    'date': 'timestamp',
    'email': 'varchar',
    'multipicklist': 'varchar',
    'percent': 'decimal',
    'decimal': 'decimal',
    'long': 'bigint',
    'address': 'varchar',
    'masterrecord': 'varchar',
    'location': 'varchar',
    'encryptedstring': 'varchar'}

# (2) Create Salesforce object
sf = Salesforce(username=<user>, password=<pw>, security_token=<token>,
                version='41.0', sandbox=True)
# data = sf.query("Select Id From Account Limit 10")

# (3) Call the REST API for the object's metadata
md = sf.restful("sobjects/{}/describe/".format(object_name), params=None)

# (4) Create a CSV file with the column list
flat_table = []
for record in md['fields']:
    tmp = []
    tmp.append(record['name'])
    tmp.append(str(record['length']))
    tmp.append(record['label'])
    tmp.append((record['type']).lower())
    tmp.append(str(record['precision']))
    tmp.append(str(record['scale']))
    flat_table.append(tmp)

headers = ['name', 'length', 'label', 'type', 'precision', 'scale']
csv_file_name = "{}_metadata.csv".format(object_name)
file = open(csv_file_name, 'w')
file.write(','.join(headers) + '\n')
for row in flat_table:
    file.write(','.join(tuple(row)) + '\n')
file.close()
print("Created Metadata CSV file")

# (5) Size parameter
def set_parameter(column_type, record):
    '''Takes column type and json record to determine column length/precision'''
    param = ''
    if column_type == 'varchar' or column_type == 'Unknown':
        param = "(" + str(record['length']) + ")"
    elif column_type == 'decimal':
        param = "(" + str(record['precision']) + "," + str(record['scale']) + ")"
    else:
        param = ''
    return param

# (6) Set primary key
def set_primary_key(column_name):
    '''If id, then add primary key'''
    param = ''
    if column_name == 'id':
        param = 'Primary Key'
    else:
        param = ''
    return param

# (7) Mapping function
def map_columns(json_data):
    '''Takes json data from the REST API and converts it to a Postgres Create Table statement'''
    field_list = []
    counter = 1
    for record in json_data['fields']:
        tmp = []
        column_name = record['name'].lower()
        try:
            column_type = mapping[record['type'].lower()]
        except KeyError:
            column_type = 'Unknown'
        column_param = set_parameter(column_type, record)
        primary_key_param = set_primary_key(column_name)
        tmp.append(column_name)
        tmp.append(column_type)
        tmp.append(column_param)
        tmp.append(primary_key_param)
        counter += 1
        # add a trailing comma for every column except the last one
        if counter <= len(json_data['fields']):
            tmp.append(",")
        field_list.append(tmp)
    return field_list

field_list = map_columns(md)
ddl_file_name = "{}_ddl.txt".format(object_name)
file = open(ddl_file_name, 'w')
file.write('Create Table z_salesforce.{} (\n'.format(object_name))
for row in field_list:
    file.write(' '.join(tuple(row)) + '\n')
file.write(')')
file.close()
print("Generated Create Table Statement")
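For illustration, the DDL file the script writes looks roughly like this (a trimmed, hypothetical sample for the Account object; the actual field names and lengths come from whatever the describe call returns for your org):

Create Table z_salesforce.account (
id varchar (18) Primary Key ,
isdeleted bool ,
name varchar (255) ,
createddate timestamp
)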
Hi,
Awesome post! I’m fairly new to Python and Salesforce. I’m using Data Loader with batch files to download queries into CSV. Then, with a Python pandas script, I manipulate those CSV files. Once I have the desired CSV format, I use a batch file via Data Loader to update the records.
Using Data Loader takes a long time to do the data extract and update (by default, it loops through the objects like a list until it gets to the object I want, e.g. before it gets to ‘Cases’ it goes through Accounts, B…, C…, etc.). All in all, a small routine takes 5 minutes (10-20 records).
That’s why I’m exploring Python simple-salesforce. Thing is, I’ve been looking for instructions on how to save query results into CSV files and can’t seem to find a clear enough guide.
Hopefully you can shed some light on to this with your expert opinion? Thank you!
Hello,
Thanks for your comment. Sorry for my late reply!
The query result comes back as JSON (a SOQL query is just another form of API call), and in Python it is treated as a dictionary. The best way to understand the data structure of the SOQL query result is to print it in a pretty JSON format.
print(json.dumps(query_results, indent=4))
Now you can see the actual records are in the records node. All you have to do is put them into lists, create a dictionary and whack it into a pandas DataFrame. Once you finish processing with pandas, converting it back to JSON and uploading it can work. However, writing back to Salesforce is slow. I did a data migration using DataStage to move legacy data into Salesforce, and it took days to complete because of the slow write speed.
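For the write-back side, newer versions of simple-salesforce also expose the Bulk API (check the api.py file mentioned above to confirm your version supports it). A rough sketch, assuming your DataFrame df holds an Id column plus the fields you changed:

# convert the DataFrame into a list of dicts and update via the Bulk API
records_to_update = df.to_dict('records')
sf.bulk.Lead.update(records_to_update)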
If you are getting data from Salesforce, transforming it and writing it back, it may be worthwhile to explore an ETL tool. You don’t really need expensive infrastructure; for example, Talend Open Studio is free and you can run jobs on your desktop. Or your company probably already has an ETL tool that you may be able to use.
# assuming query_results = sf.query(...) against the Lead object
import pandas as pd

# build one list per column from the records node
id = []
firstname = []
lastname = []
isdeleted = []
convertedaccountid = []
createddate = []
for record in query_results['records']:
    id.append(record['Id'])
    firstname.append(record['FirstName'] or '')
    lastname.append(record['LastName'] or '')
    isdeleted.append(str(record['IsDeleted']))
    convertedaccountid.append(record['ConvertedAccountId'] or '')
    createddate.append(record['CreatedDate'])

# put the lists into a dictionary and load it into a DataFrame
record_dict = {}
record_dict['Id'] = id
record_dict['FirstName'] = firstname
record_dict['LastName'] = lastname
record_dict['IsDeleted'] = isdeleted
record_dict['ConvertedAccountId'] = convertedaccountid
record_dict['CreatedDate'] = createddate

df = pd.DataFrame(record_dict)
print(df.head())
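From there, saving the query result to a CSV file (the original question) is one line; the file name here is just a placeholder:

# write the DataFrame out as CSV without the pandas index column
df.to_csv('lead_extract.csv', index=False)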
Cheers!