Salesforce API with Simple Salesforce For Python

Python has a plethora of modules that make programming fun and easy. If you need to use the Salesforce API with Python, the simple-salesforce module is your best friend. The module takes care of boring stuff like authentication and lets you use the different APIs. You can install it with pip install simple-salesforce, and you can check the documentation and source code here. If you want to know what methods are available, take a look at the api.py file.

How it works

All you need to do is create a Salesforce object with a user name, password, security token and the API version you are using. If the instance is not production, you also need to add sandbox=True. With this object you can call the different APIs (REST API, Apex REST, Bulk API and SOQL queries) through specialised methods that take a resource URL or SOQL query as their argument. In summary, calling the Salesforce API only takes two steps.

API steps

(1) Create Salesforce Object

sf = Salesforce(username=<uname>, password=<pw>,\
 security_token=<token>, version='41.0', sandbox=True)

(2) Make API calls with various methods

# (1) Make a direct REST API call to retrieve the org limits
result = sf.restful('limits/', params=None)
print(result)

# (2) Make a SOQL query to get the record count of lead object
result = sf.query('Select count() From Lead')
print('No of records: {}'.format(result['totalSize']))

# (3) Make an HTTP request to an Apex REST endpoint
result = sf.apexecute('Account/accountId')
print(result)
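
When a SOQL query returns rows rather than an aggregate, the records come back as a list of dictionaries under the records key. Here is a quick sketch (the Account fields are just an example, not part of the script below); query_all() is convenient because it follows the pagination links for you when the result set spans multiple response pages:

# (4) Iterate over SOQL query results (query_all handles pagination)
result = sf.query_all('Select Id, Name From Account')
for record in result['records']:
    print(record['Id'], record['Name'])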

Simple Salesforce Application

Now that we understand how awesome Simple Salesforce is, here is a small application that reads metadata from a Salesforce object and creates a Create Table statement for a Postgres database. It takes an object name as an argument (not case sensitive). You can call the script as ‘python script.py account’ to generate a text file that contains a create table statement for the Account object and a CSV file containing its metadata.

Here comes the code.

Enjoy!

Code

from simple_salesforce import Salesforce
import json
import sys

object_name = sys.argv[1]
print ("Target Object: {}".format(object_name))

# (1) Set data type mapping (Salesforce field type -> Postgres column type)
mapping = {
    'id': 'varchar',
    'boolean': 'bool',
    'reference': 'varchar',
    'string': 'varchar',
    'picklist': 'varchar',
    'textarea': 'varchar',
    'double': 'decimal',
    'phone': 'varchar',
    'url': 'varchar',
    'currency': 'double',
    'int': 'int',
    'datetime': 'timestamp',
    'date': 'timestamp',
    'email': 'varchar',
    'multipicklist': 'varchar',
    'percent': 'decimal',
    'decimal': 'decimal',
    'long': 'bigint',
    'address': 'varchar',
    'masterrecord': 'varchar',
    'location': 'varchar',
    'encryptedstring': 'varchar',
}

# (2) Create Salesforce Object
sf = Salesforce(username=<user>, password=<pw>, security_token=<token>,\
                version='41.0', sandbox=True)

# (3) Call the REST API for the object metadata
md = sf.restful("sobjects/{}/describe/".format(object_name), params=None)

# (4) Create a CSV file with the column list
flat_table = []
for record in md['fields']:
    tmp = []
    tmp.append(record['name'])
    tmp.append(str(record['length']))
    tmp.append(record['label'])
    tmp.append((record['type']).lower())
    tmp.append(str(record['precision']))
    tmp.append(str(record['scale']))
    flat_table.append(tmp)

headers = ['name', 'length', 'label', 'type', 'precision', 'scale']

csv_file_name = "{}_metadata.csv".format(object_name)
file = open(csv_file_name, 'w')
file.write(','.join(headers)+ '\n')

for row in flat_table:
    file.write(','.join(tuple(row)) +'\n')
file.close()
print("Created Medatada CSV file")

# (5) Set size parameter
def set_parameter(column_type, record):
    '''Takes a column type and JSON record to determine the column length/precision'''
    param = ''
    if column_type == 'varchar' or column_type == 'Unknown':
        param = "(" + str(record['length']) + ")"
    elif column_type == 'decimal':
        param = "(" + str(record['precision']) + "," + str(record['scale']) + ")"
    else:
        param = ''
    return param
 
# (6) Set Primary Key
def set_primary_key(column_name):
    '''If id, then add primary key'''
    param = ''
    if column_name == 'id':
        param = 'Primary Key'
    else:
        param = ''
    return param

# (7) Mapping function
def map_columns(json_data):
    '''Takes JSON data from the REST API and converts it to a Postgres Create Table statement'''
    field_list = []
    counter = 1
    for record in json_data['fields']:
        tmp = []
        column_name = record['name'].lower()
        try:
            column_type = mapping[record['type'].lower()]
        except KeyError:
            column_type = 'Unknown'
        column_param = set_parameter(column_type, record)
        primary_key_param = set_primary_key(column_name)

        tmp.append(column_name)
        tmp.append(column_type)
        tmp.append(column_param)
        tmp.append(primary_key_param)
        counter += 1
        if counter <= len(json_data['fields']):
            tmp.append(",")
        field_list.append(tmp)
    return field_list

field_list = map_columns(md)
ddl_file_name = "{}_ddl.txt".format(object_name)
file = open(ddl_file_name, 'w')

file.write('Create Table z_salesforce.{} (\n'.format(object_name))

for row in field_list:
    file.write(' '.join(tuple(row)) +'\n')

file.write(')')
file.close()

print("Generated Create Table Statement")

Hi,
Awesome post! I’m fairly new to python and Salesforce. I’m using data loader with batch files to download queries into csv. Then, with python pandas script, I manipulate those csv files. Once I have the desired csv format, I use batch file via Data Loader to update the records.
Using Data Loader takes a long time to do the data extract and update (by default, it loops through the objects like a list until it gets to the object I want, e.g. before it gets to ‘cases’ it goes through Accounts, B…, C…, etc.). All in all, a small routine takes 5 minutes (10-20 records).
That’s why I’m exploring python simple-salesforce. Thing is, I’ve been looking for instructions on how to save query results into csv files and can’t seem to find a clear enough guide.
Hopefully you can shed some light on to this with your expert opinion? Thank you!

Hello,

Thanks for your comment. Sorry for my late reply!

The query result comes back as JSON; running a SOQL query is just another API call. In Python, the result is treated as a dictionary. The best way to understand the structure of the SOQL query result is to print it as pretty JSON.

import json

query_results = sf.query('Select id, firstname, lastname, isdeleted, convertedaccountid, createddate From Lead limit 5')
print(json.dumps(query_results, indent=4))
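
The pretty-printed output has roughly this shape (the values below are placeholders, not real data):

# Illustrative shape only (placeholder values):
# {
#     "totalSize": 5,
#     "done": true,
#     "records": [
#         {
#             "attributes": {"type": "Lead", "url": "/services/data/v41.0/sobjects/Lead/<id>"},
#             "Id": "<id>",
#             "FirstName": "<first name>",
#             ...
#         },
#         ...
#     ]
# }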

Now you can see that the actual records are in the records node. All you have to do is put them into a list, create a dictionary and whack it into a pandas DataFrame. Once you finish processing with pandas, converting it to JSON and uploading it can work. However, writing back to Salesforce is slow. I did a data migration using DataStage to move legacy data into Salesforce, and it took days to complete because of the slow write speed.

If you are getting data from Salesforce, transforming it and writing it back, it may be worthwhile to explore an ETL tool. You don’t really need expensive infrastructure. For example, Talend Open Studio is free and you can run the jobs on your desktop. Or your company probably already has an ETL tool that you may be able to use.

import pandas as pd

query_results = sf.query('Select id, firstname, lastname, isdeleted, convertedaccountid, createddate From Lead')

ids = []
firstnames = []
lastnames = []
isdeleted = []
convertedaccountids = []
createddates = []
for record in query_results['records']:
    ids.append(record['Id'])
    firstnames.append(record['FirstName'] or '')
    lastnames.append(record['LastName'] or '')
    isdeleted.append(str(record['IsDeleted']))
    convertedaccountids.append(record['ConvertedAccountId'] or '')
    createddates.append(record['CreatedDate'])

record_dict = {}
record_dict['Id'] = ids
record_dict['FirstName'] = firstnames
record_dict['LastName'] = lastnames
record_dict['IsDeleted'] = isdeleted
record_dict['ConvertedAccountId'] = convertedaccountids
record_dict['CreatedDate'] = createddates

df = pd.DataFrame(record_dict)
print(df.head())
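
From there, saving the query results to a CSV file is a one-liner with pandas. And if you later write small batches back with simple-salesforce instead of an ETL tool, a minimal sketch looks like this (the file name and the Lead/FirstName fields are only examples, and it assumes your simple-salesforce version exposes the Bulk API handler):

# Save the query results to a CSV file (the file name is just an example)
df.to_csv('lead_extract.csv', index=False)

# Minimal write-back sketch: sf.bulk.<Object>.update() takes a list of
# dictionaries keyed by field name, with Id identifying the record to update.
updates = df[['Id', 'FirstName']].to_dict('records')
result = sf.bulk.Lead.update(updates)
print(result)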

Cheers!
