How To Get Data From SharePoint With Python

It’s sometimes convenient to have a script to get data from SharePoint. We can automate the user managed data ingesting from SharePoint. For example, business users can upload or update the user managed file and a scheduled ETL task fetch and bring it to the datalake.

Using SharePoint API is easy thanks to this amazing module called sharepy. It helps you to authenticate and make REST calls. Microsoft provides comprehensive documentation on the SharePoint REST API. To understand how it works, start off with the article: How the SharePoint Rest service works. The key to any API programming is to construct REST endpoint correctly. Working with folders and files with REST provides you with information for fetching user managed files from SharePoint.

Gocha!

(1) You need to type username and password in the console with sharepy. To pass authentication parameters in the program, I am using the customised module sharepy_custom (see How To Customise Python Module).

(2) We can get the list of files in a folder. However, adding the parameter for GetFolderByServerRelativeUrl() can be tricky. When you are working with only server url, the parameter can contain ‘/’ (GetFolderByServerRelativeUrl(‘/Shared Documents’))in front of shared document as described here.

r = s.get("https://example.sharepoint.com/_api/web/\
GetFolderByServerRelativeUrl('Shared Documents')/Files"
)

If you are using the REST endpoint with sever and site like http://server/site/, you cannot have ‘/’ in front of Shared Document. Otherwise you will get REST Api error “Server relative urls must start with SPWeb.ServerRelativeUrl.

r = s.get("https://example.sharepoint.com/GroupSite/HR/_api/web/\
GetFolderByServerRelativeUrl('Shared Documents')/Files"
)

(3) Constructing relative url for GetFileByServerRelativeUrl can be tricky. The url includes site name and folder name as below.

r = s.getfile("https://example.sharepoint.com/_api/web/\
GetFileByServerRelativeUrl('/Shared Documents/Form.docx')/$value"
,\
 filename = 'document.doc')
r = s.getfile("https:// example.sharepoint.com/GroupSites/HR/_api/web/\
GetFileByServerRelativeUrl('/GroupSites/HR/Shared Documents/Team.xlsx')/$value"
,\
 filename = 'team.xlsx')

(4) Sharepy only works for Python 3. If you use Python 2.7, you will get TypeError: super() takes at least 1 artument (0 given).

Code

We use the customised sharepy. It get the list of a file in the specified folder and write it into a json file and also download a file specified in the API endpoint url.

Enjoy!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
import sharepy_custom as sharepy
import json

# (1) Authenticate
s = sharepy.connect("example.sharepoint.com",\
username=<username>, password=<password>)

# (2) Get file information
r = s.get("https://example.sharepoint.com/GroupSites/HR/_api/web\
/GetFolderByServerRelativeUrl('Shared Documents')/Files"
)
data = r.json()
file = open("sharepoint.json", "w")
file.write(json.dumps(data, indent=4))
print("json file has been generated")

# (3) Download file
r = s.getfile("https:// example.sharepoint.com/GroupSites/HR/_api/web\
/GetFileByServerRelativeUrl('/GroupSites/HR/Shared Documents/Team.xlsx')/$value"
\
, filename = 'team.xlsx')
print("File Downloaded")
Data Engineering
Sending XML Payload and Converting XML Response to JSON with Python

If you need to interact with a REST endpoint that takes a XML string as a payload and returns another XML string as a response, this is the quick guide if you want to use Python. If you want to do it with Node.js, you can check out the post …

Data Engineering
Sending XML Payload and Converting XML Response to JSON with Node.js

Here is the quick Node.js example of interacting with a rest API endpoint that takes XML string as a payload and return with XML string as response. Once we get the response, we will convert it to a JSON object. For this example, we will use the old-school QAS (Quick …

Data Engineering
Downloading All Public GitHub Gist Files

I used to use plug-ins to render code blocks for this blog. Yesterday, I decided to move all the code into GitHub Gist and inject them from there. Using a WordPress plugin to render code blocks can be problematic when update happens. Plugins might not be up to date. It …