Working with JSON data in Python

  • JSON (JavaScript Object Notation) is a lightweight data-interchange format.
  • JSON is built on two structures:
    • A collection of name/value pairs (e.g., object, record, struct, dictionary, hash table, keyed list, or associative array)
    • An ordered list of values (e.g., array, vector, list, or sequence)
  • An object is an unordered set of name/value pairs. An object begins with { (left brace) and ends with } (right brace). Each name is followed by : (colon), and the name/value pairs are separated by , (comma).
  • An array is an ordered collection of values. An array begins with [ (left bracket) and ends with ] (right bracket). Values are separated by , (comma).
  • A value can be a string in double quotes, or a number, or true or false or null, or an object or an array.
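To make these structures concrete, the sketch below parses a small JSON document with Python's standard json module and prints the Python type each value maps to. The sample text and the variable names are illustrative only.

```python
import json

# A small JSON document: an object holding a string, a number,
# a boolean, a null, an array, and a nested object.
sample_text = """
{
    "name": "Python",
    "duration": 35,
    "started": true,
    "certificate": null,
    "topics": ["json", "files"],
    "trainer": {"first_name": "Girish"}
}
"""

parsed = json.loads(sample_text)

# Each JSON value maps to a built-in Python type.
for key, value in parsed.items():
    print(key, "->", type(value).__name__)
```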

Converting Python Data to JSON Format (Serialization)

Python ships with a built-in json module for working with JSON data.

To convert Python data into JSON, use json.dumps, which returns a JSON string, or json.dump, which writes the JSON to a file object.
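The difference between the two functions can be sketched as follows. Here io.StringIO stands in for a real file, and the course dictionary is made-up sample data.

```python
import io
import json

course = {"Name": "Python", "Duration": 35}

# dumps returns the JSON text as a Python string.
as_text = json.dumps(course)

# dump writes the same JSON text to any object with a write() method.
# io.StringIO is used here in place of a file opened with open().
buffer = io.StringIO()
json.dump(course, buffer)

print(as_text)
print(buffer.getvalue() == as_text)
```

Both calls produce identical text; the only difference is where that text ends up.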

#Dictionary Data
>>> myCourseDict = { "Name": "Python", "Duration": 35, "Trainer": "Girish", "Started":True, "Start Date": "2019-10-24", "End Date": "2019-11-15", "Recomemded Courses": ["Data Science","Big Data Hadoop with Spark"]}
>>> print(myCourseDict)
{'Name': 'Python', 'Duration': 35, 'Trainer': 'Girish', 'Started': True, 'Start Date': '2019-10-24', 'End Date': '2019-11-15', 'Recomemded Courses': ['Data Science', 'Big Data Hadoop with Spark']}

#Converting Dictionary Data to JSON Format
>>> import json
>>> myJson = json.dumps(myCourseDict)
>>> print(myJson)
{"Name": "Python", "Duration": 35, "Trainer": "Girish", "Started": true, "Start Date": "2019-10-24", "End Date": "2019-11-15", "Recomemded Courses": ["Data Science", "Big Data Hadoop with Spark"]}

#List Data
>>> recomendedCourses = ["Data Science","Big Data Hadoop with Spark"]

#Converting List Data to JSON Format
>>> myListJson = json.dumps(recomendedCourses)
>>> print(myListJson)
["Data Science", "Big Data Hadoop with Spark"]

Python and JSON equivalents

>>> myDict = {"Name": "Python", "Duration": 45, "Trainer":"Girish"}
>>> json.dumps(myDict)
'{"Name": "Python", "Duration": 45, "Trainer": "Girish"}'

>>> myList = ["Sun","Mon","Tue","Wed","Thu","Fri","Sat"]
>>> json.dumps(myList)
'["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]'
 
>>> myTuple = ("Sun","Mon","Tue","Wed","Thu","Fri","Sat")
>>> json.dumps(myTuple)
'["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]'

>>> name = "Girish"
>>> json.dumps(name)
'"Girish"'

>>> intr = 25
>>> json.dumps(intr)
'25'

>>> flt = 28.5
>>> json.dumps(flt)
'28.5'

>>> bln = True
>>> json.dumps(bln)
'true'

>>> bln = False
>>> json.dumps(bln)
'false'

>>> mydata = None
>>> json.dumps(mydata)
'null'
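The mapping is not perfectly symmetric. As a rough illustration with made-up data, the sketch below round-trips a dictionary through dumps and loads: the tuple comes back as a list, the integer key comes back as a string, and a set cannot be serialized by the default encoder at all.

```python
import json

original = {
    "days": ("Sat", "Sun"),   # tuple, becomes a JSON array
    1: "integer key",         # non-string key, becomes the string "1"
    "rating": 4.5,
    "archived": None,
}

round_tripped = json.loads(json.dumps(original))
print(round_tripped)

# Types without a JSON equivalent, such as set, raise TypeError.
set_error = None
try:
    json.dumps({"tags": {"a", "b"}})
except TypeError as error:
    set_error = str(error)
print(set_error)
```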

Formatting JSON data Output (Pretty printing)

The indent parameter sets the number of spaces used for each nesting level when formatting the JSON output.

>>> myJson = json.dumps(myCourseDict, indent=5)
>>> print(myJson)
{
     "Name": "Python",
     "Duration": 35,
     "Trainer": "Girish",
     "Started": true,
     "Start Date": "2019-10-24",
     "End Date": "2019-11-15",
     "Recomemded Courses": [
          "Data Science",
          "Big Data Hadoop with Spark"
     ]
}
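Pretty printing makes the output larger. When size matters more than readability, the separators parameter of json.dumps can remove the default spaces after commas and colons. A minimal sketch with made-up data, assuming Python 3.4 or later:

```python
import json

data = {"a": 1, "b": [1, 2]}

# indent adds newlines and leading spaces for readability.
pretty = json.dumps(data, indent=2)

# separators=(",", ":") drops the spaces for the most compact form.
compact = json.dumps(data, separators=(",", ":"))

print(pretty)
print(compact)
```

Both strings parse back to the same Python data; only the whitespace differs.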

Sorting the JSON data using the sort_keys parameter

>>> myJson = json.dumps(myCourseDict, indent=5, sort_keys=True)
>>> print(myJson)
{
     "Duration": 35,
     "End Date": "2019-11-15",
     "Name": "Python",
     "Recomemded Courses": [
          "Data Science",
          "Big Data Hadoop with Spark"
     ],
     "Start Date": "2019-10-24",
     "Started": true,
     "Trainer": "Girish"
}
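One practical use of sort_keys is producing the same text for dictionaries that hold the same data in a different insertion order, which helps when comparing or diffing JSON output. A small sketch with made-up dictionaries:

```python
import json

first = {"Name": "Python", "Duration": 35}
second = {"Duration": 35, "Name": "Python"}

# Without sort_keys the output follows each dictionary's insertion order.
print(json.dumps(first) == json.dumps(second))

# With sort_keys=True both dictionaries serialize to the same string.
sorted_first = json.dumps(first, sort_keys=True)
sorted_second = json.dumps(second, sort_keys=True)
print(sorted_first == sorted_second)
```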

Using dump to write Python data to a JSON file

>>> myDict = {
...      "Name": "Python",
...      "Duration": 35,
...      "Trainer": "Girish",
...      "Started": True,
...      "Start Date": "2019-10-24",
...      "End Date": "2019-11-15",
...      "Recomemded Courses": [
...           "Data Science",
...           "Big Data Hadoop with Spark"
...      ]
... }

>>> import json
>>> with open("myDict_file.json","w") as write_file:
...     json.dump(myDict, write_file)
...

myDict_file.json

{"Name": "Python", "Duration": 35, "Trainer": "Girish", "Started": true, "Start Date": "2019-10-24", "End Date": "2019-11-15", "Recomemded Courses": ["Data Science", "Big Data Hadoop with Spark"]}
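json.dump accepts the same formatting parameters as json.dumps, so the file can be written in a readable layout. The sketch below is one way to do that; the temporary directory, the file name course.json, and the explicit UTF-8 encoding are choices made for this example, not requirements.

```python
import json
import os
import tempfile

course = {"Name": "Python", "Trainer": "Girish", "Started": True}

# A temporary directory keeps the sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "course.json")

    # indent=4 writes one key per line instead of a single long line.
    with open(path, "w", encoding="utf-8") as write_file:
        json.dump(course, write_file, indent=4)

    # Read the raw text back to see what was written.
    with open(path, "r", encoding="utf-8") as read_file:
        file_text = read_file.read()

print(file_text)
```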

Parsing JSON Data (De-Serialization)

To deserialize JSON data into Python objects, use json.loads for a JSON string or json.load for a file object.

#JSON Data
{
     "Duration": 35,
     "End Date": "2019-11-15",
     "Name": "Python",
     "Recomemded Courses": [
          "Data Science",
          "Big Data Hadoop with Spark"
     ],
     "Start Date": "2019-10-24",
     "Started": true,
     "Trainer": "Girish"
}

#JSON to Python Data using loads() method

>>> myPythonData = json.loads(myJson)
>>> myPythonData
{'Duration': 35, 'End Date': '2019-11-15', 'Name': 'Python', 'Recomemded Courses': ['Data Science', 'Big Data Hadoop with Spark'], 'Start Date': '2019-10-24', 'Started': True, 'Trainer': 'Girish'}

>>> myJsonList = '["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]'

>>> myList = json.loads(myJsonList)
>>> myList
['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']

#JSON has no tuple type, so a serialized tuple loads back as a list
>>> myJsonTuple = json.dumps(("Sun","Mon","Tue","Wed","Thu","Fri","Sat"))

>>> myTuple = json.loads(myJsonTuple)
>>> myTuple
['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
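loads raises json.JSONDecodeError (a subclass of ValueError) when the text is not valid JSON, for example when single quotes are used instead of double quotes. The helper name parse_or_none below is hypothetical, a minimal sketch of one way to handle that case:

```python
import json

def parse_or_none(text):
    """Hypothetical helper: return the parsed value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        # lineno, colno and msg describe where and why parsing failed.
        print(f"Invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}")
        return None

valid = parse_or_none('{"Name": "Python"}')
invalid = parse_or_none("{'Name': 'Python'}")  # single quotes are not valid JSON

print(valid)
print(invalid)
```

Note that this helper also returns None for the valid JSON text null, so it only suits callers that do not need to tell the two cases apart.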

Using the load method to convert JSON file data into Python data

myDict_file.json

{"Name": "Python", "Duration": 35, "Trainer": "Girish", "Started": true, "Start Date": "2019-10-24", "End Date": "2019-11-15", "Recomemded Courses": ["Data Science", "Big Data Hadoop with Spark"]}

>>> with open("myDict_file.json","r") as read_file:
...     myPythonData = json.load(read_file)
...
>>> myPythonData
{'Name': 'Python', 'Duration': 35, 'Trainer': 'Girish', 'Started': True, 'Start Date': '2019-10-24', 'End Date': '2019-11-15', 'Recomemded Courses': ['Data Science', 'Big Data Hadoop with Spark']}
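In practice the file may not exist yet. The sketch below wraps json.load in a hypothetical helper, load_json_or_default, that falls back to a default value when the file is missing; the helper name, file names, and temporary directory are assumptions made for illustration.

```python
import json
import os
import tempfile

def load_json_or_default(path, default):
    """Hypothetical helper: parse the JSON file at path, or return default if it is missing."""
    try:
        with open(path, "r", encoding="utf-8") as read_file:
            return json.load(read_file)
    except FileNotFoundError:
        return default

with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "course.json")
    with open(existing, "w", encoding="utf-8") as write_file:
        json.dump({"Name": "Python"}, write_file)

    loaded = load_json_or_default(existing, default={})
    fallback = load_json_or_default(os.path.join(tmp_dir, "missing.json"), default={})

print(loaded)
print(fallback)
```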

References

  • https://json.org/
  • https://docs.python.org/2/library/json.html
  • https://docs.python.org/3/library/json.html

Learn more about Python Features in the upcoming blog articles.

Happy Learning!