A Comprehensive Guide for Reading and Writing JSON with Python

The json module lets you read a JSON object from a file or an HTTP response and write it back to a file. It is worth spending a little time on the few key functions that are most often used for ingesting JSON data.

In fact, if you understand the three functions below, you are pretty much set for any JSON data ingestion.

  • load()
  • loads()
  • dumps()

Before getting into actual code, let’s see what each function does.

json.load()

The load() function reads a JSON file and returns a Python object. Technically speaking, it deserialises the JSON document into a Python object by using the conversion table: a JSON object becomes a dictionary and a JSON array becomes a list.

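You first need to open the JSON file and pass the file object to load(). Note that if you convert the loaded object to a string with str() and write it to a file, the file contains the string version of the Python dictionary, which is not JSON.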

import json
# Loading File
f = open('/tmp/json_pretty_test.json', 'r')
json_obj = json.load(f)
f.close()
# or
with open('/tmp/json_pretty_test.json', 'r') as handle:
    json_obj = json.load(handle)

print(type(json_obj))

for line in json_obj:
    print(line)

As you can see in the output, the result is a Python object, not JSON: here it is a list, and each item in it is a dictionary.
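To make the type mapping concrete, here is a minimal sketch that does not depend on the /tmp file. It assumes io.StringIO as a stand-in for an opened file (json.load() only needs an object with a read() method), and the sample records are made up for illustration.

```python
import io
import json

# io.StringIO stands in for a file opened with open(); the contents are
# hypothetical sample data, not the real JSONPlaceholder response.
array_file = io.StringIO('[{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]')
object_file = io.StringIO('{"id": 1, "title": "first"}')

records = json.load(array_file)   # top-level JSON array  -> Python list
record = json.load(object_file)   # top-level JSON object -> Python dict

print(type(records))      # <class 'list'>
print(type(records[0]))   # <class 'dict'>
print(type(record))       # <class 'dict'>
```

The type you get back depends on the top-level value in the file, so it is worth checking it before looping over the result.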

json.loads()

The ‘s’ stands for string. loads() takes a string as input instead of a file object. Just like the load() function, it deserialises the JSON into a Python object; for the JSON object below, the type comes out as a dictionary.

string = '{"id":1123,"name":"John"}'
json_obj = json.loads(string)
print(json_obj)
print(type(json_obj))
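The following sketch extends the example above to show a few more entries of the conversion table, and what happens when the input is not valid JSON. The sample string and the variable names are made up for illustration.

```python
import json

text = '{"id": 1123, "name": "John", "active": true, "tags": ["a", "b"], "manager": null}'
parsed = json.loads(text)

print(parsed["active"])   # True   (JSON true  -> Python True)
print(parsed["tags"])     # ['a', 'b']  (JSON array -> Python list)
print(parsed["manager"])  # None   (JSON null  -> Python None)

# Python-style single quotes are not valid JSON, so loads() rejects them.
rejected = False
try:
    json.loads("{'id': 1123}")
except json.JSONDecodeError as err:
    rejected = True
    print("Invalid JSON:", err)
```

Catching json.JSONDecodeError is one reasonable way to handle malformed input; whether to skip, log or abort is a design choice that depends on the pipeline.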

json.dumps()

The dumps() function returns a string representing a JSON object from a Python dictionary. Once you load a JSON file with json.load(), you can pass the resulting object to json.dumps() to get its JSON string representation, which can be written to a JSON file.

This is the opposite of load() or loads(). It serialises the dictionary into a JSON string by using the conversion table, which maps Python objects to JSON values. For example, None becomes null, True becomes true, single-quoted strings become double-quoted and so on.
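A small sketch of that conversion, comparing str() with json.dumps() on the same dictionary. The record is a made-up example.

```python
import json

record = {'id': 1, 'name': 'John', 'active': True, 'manager': None}

as_python = str(record)        # Python representation: single quotes, True, None
as_json = json.dumps(record)   # JSON text: double quotes, true, null

print(as_python)  # {'id': 1, 'name': 'John', 'active': True, 'manager': None}
print(as_json)    # {"id": 1, "name": "John", "active": true, "manager": null}

# Only the dumps() output can be parsed back as JSON.
assert json.loads(as_json) == record
```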

Note that json.dumps() on the entire json_obj returns the whole list as a single JSON array. To get one JSON string per record, apply it to each dictionary in the for loop.
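For comparison, here is a sketch of both ways of calling dumps(): once on a whole list and once per dictionary. The two-record list is made up and stands in for json_obj.

```python
import json

# Hypothetical stand-in for json_obj: a list of dictionaries.
records = [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]

# Whole list at once: one JSON array in a single string.
whole = json.dumps(records)
print(whole)  # [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]

# One record per line, a layout often called JSON Lines.
lines = "\n".join(json.dumps(r) for r in records)
print(lines)
```

The per-record form is convenient when a downstream tool reads the file line by line; the single-array form is what json.load() expects to read back in one call.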

with open('/tmp/json_pretty_test.json', 'r') as handle:
    json_obj = json.load(handle)

for line in json_obj:
    dumped = json.dumps(line)
    print(type(dumped))
    print(dumped)

As you can see in the output, the data type is string. The Python dictionary is converted to its JSON representation (double quotes instead of single quotes, etc.).

You can pretty-print JSON by adding the indent argument to json.dumps().

json.dumps(line, indent=4)
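As a quick illustration of what indent does, here is a sketch on a made-up two-key record. sort_keys is an additional dumps() argument, not used elsewhere in this post, that orders the keys alphabetically.

```python
import json

record = {"title": "hello", "id": 1}

# indent=4 puts each key on its own line; sort_keys=True orders the keys.
pretty = json.dumps(record, indent=4, sort_keys=True)
print(pretty)
# {
#     "id": 1,
#     "title": "hello"
# }
```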

Code Examples

Now that we have covered the basics, let’s look at how these functions are used in the context of REST API data ingestion. We are using JSONPlaceholder, which provides free API endpoints that return example JSON data. The site is great for testing or experimenting with JSON over a REST API. To make API calls, we are using the requests module.

(1) Using json.loads() and json.dumps()

The Response object returned by the requests module has a json() method, which deserialises the JSON in the response into a Python object. json_ph_api() returns the JSON data in that form; for the posts resource, it is a list of dictionaries.

write_json() uses json.dumps() to serialise each dictionary object returned from json_ph_api() into a JSON string and writes it to a file, one record per line.

write_pretty_json() creates a pretty-printed JSON file by using the indent argument in json.dumps().

json_file_check() prints the first 3 records of the output from json_ph_api().

import json
import requests

def json_ph_api(resource):
    '''Get example json data from jsonplaceholder.typicode.com.
    Returns Python dictionary object'''

    json_data = None
    endpoint = 'https://jsonplaceholder.typicode.com/{}'.format(resource)
    print('Target endpoint is {}'.format(endpoint))
    r = requests.get(endpoint)
    if r.status_code == 200:
        print('API Call Successful')
        # r.json() deserialises the response body into Python objects
        # (for the posts resource, a list of dictionaries)
        json_data = r.json()
    else:
        print('API call failed with status code: {}'.format(r.status_code))
    return json_data

def json_file_check(json_object):
    '''This function prints first 3 records
    of the input json object'''

    counter = 0
    for i in json_object:
        if counter < 3:
            print(i)
            counter += 1
        else:
            break

def write_json(json_obj, file_path):
    '''Write Json object to a file, one record per line'''
    with open(file_path, 'w', encoding='utf-8') as f:
        for line in json_obj:
            # Each record in json_obj is a dictionary.
            # json.dumps converts the dictionary to a json string.
            f.write(json.dumps(line))
            f.write('\n')
    print('Json file created as {}'.format(file_path))

def write_pretty_json(json_obj, file_path):
    '''Write Json object in a pretty format'''
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write(json.dumps(json_obj, indent=4))
    print('Json file with pretty format created as {}'.format(file_path))


j_obj = json_ph_api('posts')

write_json(j_obj, '/tmp/json_test.json')
write_pretty_json(j_obj, '/tmp/json_pretty_test.json')
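As an aside, the json module also has json.dump() (without the ‘s’), which writes straight to a file object instead of returning a string. It is not one of the three functions covered in this post, so treat the following as an optional sketch. The temporary path and the sample record are made up for illustration.

```python
import json
import os
import tempfile

# Hypothetical sample data and a throwaway output path.
records = [{"id": 1, "title": "first"}]
path = os.path.join(tempfile.mkdtemp(), "dump_example.json")

# json.dump(obj, fp) serialises obj and writes it to the open file.
with open(path, "w", encoding="utf-8") as handle:
    json.dump(records, handle, indent=4)

# Reading it back with json.load() gives the original object.
with open(path, "r", encoding="utf-8") as handle:
    restored = json.load(handle)

print(restored == records)  # True
```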

(2) Reading and writing JSON file with json.load()

We can read the json file created above and write it to another json file with write_json(). The file object needs to be passed into json.load() before writing.

with open('/tmp/json_pretty_test.json', 'r') as handle:
    j = json.load(handle)

json_file_check(j)
write_json(j, '/tmp/json_test2.json')
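One thing to watch for: a file written by write_json() holds one JSON document per line, so it is not a single JSON document that json.load() can read in one call. A minimal sketch of reading such a file back line by line with json.loads() follows. It writes its own small temporary file with made-up records rather than relying on the files in /tmp.

```python
import json
import os
import tempfile

# Hypothetical sample data and a throwaway path for illustration.
records = [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]
path = os.path.join(tempfile.mkdtemp(), "json_lines_example.json")

# Write one JSON document per line, the same layout write_json() produces.
with open(path, "w", encoding="utf-8") as handle:
    for record in records:
        handle.write(json.dumps(record) + "\n")

# Parse each non-empty line on its own with json.loads().
loaded = []
with open(path, "r", encoding="utf-8") as handle:
    for line in handle:
        line = line.strip()
        if line:
            loaded.append(json.loads(line))

print(loaded == records)  # True
```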

(3) Example of json.loads()

The response is converted to a string with .text in this example. The string needs to be converted into a Python object with json.loads(), and it is then printed in the console with json_file_check().

def json_ph_api_str(resource):
    '''Get example json data from jsonplaceholder.typicode.com.
    Returns response as String'''

    str_data = None
    endpoint = 'https://jsonplaceholder.typicode.com/{}'.format(resource)
    print('Target endpoint is {}'.format(endpoint))
    r = requests.get(endpoint)
    if r.status_code == 200:
        print('API Call Successful')
        # r.text is the raw response body as a string (r.json() would return Python objects).
        str_data = r.text
    else:
        print('API call failed with status code: {}'.format(r.status_code))
    return str_data

json_string = json_ph_api_str('posts')
# json.load() expects a file object, so it does not work on a string. Use json.loads().
j2 = json.loads(json_string)
json_file_check(j2)
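To see why json.load() fails on a string, here is a small sketch that needs no API call. The json_string value is a made-up stand-in for r.text, and io.StringIO is used only to show that load() works once the string is wrapped in a file-like object.

```python
import io
import json

json_string = '[{"id": 1, "title": "first"}]'  # hypothetical stand-in for r.text

load_failed = False
try:
    json.load(json_string)  # load() calls .read(), which a str does not have
except AttributeError as err:
    load_failed = True
    print("json.load() failed:", err)

from_string = json.loads(json_string)                  # parse the string directly
from_file_like = json.load(io.StringIO(json_string))   # or wrap it in a file-like object

print(from_string == from_file_like)  # True
```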