How to parse JSON data with Python


A tutorial on reading and parsing JSON data with Python

JSON is a standard format commonly used by websites and APIs, and some mainstream databases (such as PostgreSQL) now support it as well. In this article, we will show you how to handle JSON data with Python. You can also visit our official Chinese website, Oxylabs.cn, for more information. First, let's look at the definition of JSON.

What is JSON?

JSON, or JavaScript Object Notation, is a format that uses text to store data objects. In other words, it is a data structure that represents objects in text form. Although it originated in JavaScript, it has become the de facto standard for transmitting objects.

Most popular programming languages support the JSON format, including Python. JSON files are often used to transfer data objects through APIs. Here is an example of a JSON string:

{
    "name": "United States",
    "population": 331002651,
    "capital": "Washington D.C.",
    "languages": [
        "English",
        "Spanish"
    ]
}

In this case, the JSON data looks like a Python dictionary. Like a dictionary, JSON passes data as key-value pairs. However, JSON data can also be a string, a number, a Boolean, or a list.

Before JSON became popular, XML was the common choice for representing data objects in text form. Here is an example of the same information in XML format:

<?xml version="1.0" encoding="UTF-8"?>
<country>
    <name>United States</name>
    <population>331002651</population>
    <capital>Washington D.C.</capital>
    <languages>
        <language>English</language>
        <language>Spanish</language>
    </languages>
</country>

Clearly, the JSON version needs less markup. This is one of the main reasons JSON is so popular. If you want to learn more about the JSON standard, please visit the official JSON website.

JSON in Python

Python supports JSON natively. The json module is part of the Python standard library. It can convert JSON data into equivalent Python objects, such as dictionaries and lists, and it can also convert Python objects into JSON.

Python's json module also lets you write custom encoders and decoders, and it requires no separate installation. The official documentation of the json module covers it in full.

Next, we will explore this module. We will convert JSON to dictionaries and lists, and we will also work with custom classes.

Converting a JSON string to a Python object

JSON data is often stored in strings. This is a common scenario when working with APIs: the JSON data is usually held in a string variable before it is parsed. As a result, the most common JSON-related task is parsing a JSON string into a Python dictionary, and the json module handles this easily.

The first step is to import Python's json module. This module contains two important functions: loads and load.

Note that loads looks like a plural form, but it is not. It is read as "load-s", where the letter "s" stands for "string".

loads parses a JSON string into a Python object. load is used when the data comes from a file object rather than from a string. This part is described in detail later.

Let's start with a simple example. The JSON data looks like this:

{
    "name": "United States",
    "population": 331002651
}

JSON data can be stored as a JSON string before it is parsed. We can use Python's triple quotes to store a multi-line string, or we can remove the line breaks to improve readability.

# JSON string
country = '{"name": "United States", "population": 331002651}'
print(type(country))

The output of this code snippet confirms that this is indeed a string:

<class 'str'>
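The triple-quoted form mentioned above is not shown in the article, so here is a minimal sketch of it. The variable names are made up for illustration. It parses the multi-line string and the single-line string with json.loads() and checks that both give the same result.

```python
import json

# A multi-line JSON document held in a triple-quoted Python string.
country_multiline = """
{
    "name": "United States",
    "population": 331002651
}
"""

# The same document with the line breaks removed.
country_compact = '{"name": "United States", "population": 331002651}'

# Whitespace between JSON tokens is insignificant, so both parse identically.
parsed_multiline = json.loads(country_multiline)
parsed_compact = json.loads(country_compact)

print(type(country_multiline))             # the raw text is an ordinary str
print(parsed_multiline == parsed_compact)  # True
```

Which form to use is mostly a matter of readability: triple quotes suit longer documents, while a single line is convenient for short ones.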

We can call json.loads() and pass this string as an argument.

import json
country = '{"name": "United States", "population": 331002651}'
country_dict = json.loads(country)
print(type(country))
print(type(country_dict))

The output of this code snippet confirms that the JSON data, which started as a string, is now a Python dictionary.

<class 'str'>
<class 'dict'>

This dictionary can be accessed as usual:

print(country_dict['name'])
# OUTPUT: United States

Note that json.loads() does not always return a dictionary. The returned data type depends on the input string. For example, the following JSON string returns a list, not a dictionary.

countries = '["United States", "Canada"]'
countries_list = json.loads(countries)
print(type(countries_list))
# OUTPUT: <class 'list'>

Similarly, if the JSON string contains true, it is converted to the equivalent Python Boolean value, True.

import json
bool_string = 'true'
bool_type = json.loads(bool_string)
print(bool_type)
# OUTPUT: True
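The same idea extends to the other JSON value types. The sketch below uses a made-up document (its key names are purely illustrative) that contains one value of each kind, and prints the Python type each value is decoded into.

```python
import json

# One JSON document that contains every JSON value type.
# The keys are hypothetical and chosen only for this illustration.
document = """
{
    "text": "hello",
    "whole": 42,
    "fraction": 3.14,
    "yes": true,
    "no": false,
    "nothing": null,
    "items": [1, 2, 3],
    "nested": {"key": "value"}
}
"""

parsed = json.loads(document)

# Print the Python type that each JSON value was decoded into.
for key, value in parsed.items():
    print(f"{key:10} -> {type(value).__name__}")
```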

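The following table shows the JSON types and the Python data types they are converted to.

JSON               Python
object             dict
array              list
string             str
number (integer)   int
number (real)      float
true               True
false              False
null               None
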

Next, we will move on to the next topic: parsing a JSON file into a Python object.

Converting a JSON file to a Python object

Reading a JSON file and parsing its data into Python objects is very similar to parsing JSON data stored in a string. Besides the json module, we also need Python's built-in open() function.

In general, loads() is used to read a JSON string, while load() is used to read JSON data from a file.

The load() method receives a file object and returns the JSON data parsed into a Python object.

To get a file object from a file path, you can use Python's open() function.

Save the following JSON data as a new file and name it united_states.json:

{
    "name": "United States",
    "population": 331002651,
    "capital": "Washington D.C.",
    "languages": [
        "English",
        "Spanish"
    ]
}

Enter this Python script in a new file:

import json
with open('united_states.json') as f:
    data = json.load(f)
print(type(data))

Running this Python file outputs the following:

<class 'dict'>

In this example, the open() function returns a file handle, which is passed to load().

The variable data holds the JSON data as a Python dictionary. This means that the dictionary keys can be checked as follows:

print(data.keys())
# OUTPUT: dict_keys(['name', 'population', 'capital', 'languages'])

Using this information, the value of name can be printed as follows:

print(data['name'])
# OUTPUT: United States
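The file-reading steps above depend on united_states.json already being on disk. As a self-contained sketch, the version below first writes a shortened sample into a temporary directory and then reads it back with json.load(). The temporary directory, the shortened data, and the explicit utf-8 encoding are assumptions made for this illustration, not part of the article's script.

```python
import json
import tempfile
from pathlib import Path

# Shortened sample data, used only for this sketch.
sample = {"name": "United States", "population": 331002651}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "united_states.json"
    # Create the file first so the sketch does not depend on an existing one.
    path.write_text(json.dumps(sample), encoding="utf-8")

    # json.load() reads from the open file object, not from the path.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

print(type(data))
print(list(data.keys()))
print(data["name"])
```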

In the previous two sections, we looked at how to convert JSON to Python objects. Now, let's see how to convert Python objects to JSON.

Converting a Python object to a JSON string

Converting a Python object to JSON is also called serialization or JSON encoding. It can be done with the dumps() function. It is read as "dump-s", where the letter "s" stands for "string".

Here is a simple example. Save this code as a Python script in a new file:

import json
languages = ["English", "French"]
country = {
    "name": "Canada",
    "population": 37742154,
    "languages": languages,
    "president": None,
}
country_string = json.dumps(country)
print(country_string)

Running this file with Python outputs the following:

{"name": "Canada", "population": 37742154, "languages": ["English", "French"], "president": null}

The Python object is now a JSON string. This simple example shows that converting a Python object to JSON is not complicated. Here the Python object is a dictionary, which is why it is converted to the JSON object type. Lists can be converted to JSON in the same way. Here is the corresponding Python script and its output:

import json
languages = ["English", "French"]
languages_string = json.dumps(languages)
print(languages_string)
# OUTPUT: ["English", "French"]

Conversion is not limited to dictionaries and lists. Values of type str, int, float, bool, and even None can be converted to JSON.
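As a quick check of that claim, the sketch below passes a few standalone values to json.dumps() and prints the JSON text produced for each. The sample values are arbitrary and chosen only for illustration.

```python
import json

# Arbitrary sample values, one for each simple Python type.
values = ["Canada", 37742154, 9.98, True, False, None]

# json.dumps() accepts each value on its own, not only inside a dict or list.
encoded = [json.dumps(value) for value in values]

for value, text in zip(values, encoded):
    print(f"{value!r:12} -> {text}")
```

Note that Python's True, False, and None come out in their lowercase JSON spellings: true, false, and null.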

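For more information, refer to the conversion table below. As you can see, only a dictionary is converted to the JSON object type. See the official documentation of the json module for details.

Python                                    JSON
dict                                      object
list, tuple                               array
str                                       string
int, float, int- and float-derived Enums  number
True                                      true
False                                     false
None                                      null
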
Writing a Python object to a JSON file

The method used to write a JSON file is dump(). It is very similar to dumps(). The only difference is that dumps() returns a string, whereas dump() writes to a file.

Here is a simple demonstration. The file is opened in write mode and the data is written in JSON format. Save this Python script and run it.

import json
# Tuple is encoded to JSON array.
languages = ("English", "French")
# Dictionary is encoded to JSON object.
country = {
    "name": "Canada",
    "population": 37742154,
    "languages": languages,
    "president": None,
}
with open('countries_exported.json', 'w') as f:
    json.dump(country, f)

When this code is executed with Python, the file countries_exported.json is created (or overwritten), and its contents are the JSON data.

However, you will find that the whole JSON document is on a single line. To make it more readable, we can pass another parameter to the dump() function, as shown below:

json.dump(country, f, indent=4)

This time, when you run the code, the output is nicely formatted and indented by 4 spaces:

{
    "name": "Canada",
    "population": 37742154,
    "languages": [
        "English",
        "French"
    ],
    "president": null
}

Note that the indent parameter can also be used with json.dumps(). The only difference between json.dump() and json.dumps() is that dump() needs a file object.
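To illustrate the indent parameter with json.dumps(), here is a small sketch that reuses the Canada dictionary from the earlier examples. It builds both a compact and an indented string and checks that they decode to the same data. The variable names are illustrative.

```python
import json

country = {
    "name": "Canada",
    "population": 37742154,
    "languages": ["English", "French"],
    "president": None,
}

compact = json.dumps(country)           # everything on one line
pretty = json.dumps(country, indent=4)  # nested levels indented by 4 spaces

print(pretty)

# Indentation changes only the layout; both strings decode to the same data.
same_data = json.loads(compact) == json.loads(pretty)
print(same_data)  # True
```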

Converting custom Python objects to JSON objects

Let's examine the signature of the dump() method:

dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

Focus on the cls parameter.

If no class is supplied when calling dump(), both dump() and dumps() default to the JSONEncoder class. This class supports the standard Python types: dict, list, tuple, str, int, float, True, False, and None.

If we call json.dumps() on any other type, the method raises a TypeError with a message of the form "Object of type X is not JSON serializable", where X is the name of the unsupported type.

Save the following code as a Python script and run it:

import json
class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages
canada = Country("Canada", 37742154, ["English", "French"])
print(json.dumps(canada))
# OUTPUT: TypeError: Object of type Country is not JSON serializable

To convert a custom object to JSON, we need to write a new class that extends JSONEncoder. In this class, the default() method must be implemented. This method contains the custom code that returns a JSON-serializable value.

Here is an example encoder for the Country class. This class helps convert a Python object to a JSON object:

import json
class CountryEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, Country):
            # JSON object would be a dictionary.
            return {
                "name": o.name,
                "population": o.population,
                "languages": o.languages
            }
        # Base class will raise the TypeError.
        return super().default(o)

This code checks whether the provided object is an instance of the Country class. If it is, a dictionary is returned; otherwise the parent class is called to handle the rest.

This class can be supplied to both json.dump() and json.dumps().

print(json.dumps(canada, cls=CountryEncoder))
# OUTPUT: {"name": "Canada", "population": 37742154, "languages": ["English", "French"]}
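The dump() signature also lists a default parameter, which the article does not explore. As a hedged alternative sketch, and not the approach the article uses, a plain function can be passed as default instead of an encoder class. The helper name encode_country is hypothetical.

```python
import json

class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages

def encode_country(o):
    """Hypothetical helper: return a JSON-serializable form of a Country."""
    if isinstance(o, Country):
        return {
            "name": o.name,
            "population": o.population,
            "languages": o.languages,
        }
    # Mirror the base encoder's behaviour for unsupported types.
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")

canada = Country("Canada", 37742154, ["English", "French"])

# json.dumps() calls encode_country only for objects it cannot serialize itself.
country_string = json.dumps(canada, default=encode_country)
print(country_string)
```

A function is shorter for one-off cases, while an encoder class is easier to reuse and extend.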

Creating Python class objects from JSON objects

So far, we have discussed how to use json.load() and json.loads() to create dictionaries, lists, and so on. What if we want to read a JSON object and create a custom class object?

In this section, we will create a custom JSON decoder that helps us create custom objects. This custom decoder lets us use json.load() and json.loads() and get a custom class object back.

We will use the same Country class. With the custom encoder, we can write the following code:

# Create an object of class Country
canada = Country("Canada", 37742154, ["English", "French"])
# Use json.dump() to create a JSON file in writing mode
with open('canada.json','w') as f:
    json.dump(canada,f, cls=CountryEncoder)

If we try to parse this JSON file with json.load(), we get a dictionary:

with open('canada.json','r') as f:
    country_object = json.load(f)
print(type(country_object))
# OUTPUT: <class 'dict'>

If we want a Country object instead of a dictionary, we need to create a custom decoder. This decoder class extends JSONDecoder. In this class, we write an object_hook method, which reads values from the dictionary to create a Country object.

In addition to writing this method, we also need to call the base class __init__ and set the value of its object_hook parameter to this method. For simplicity, we can use the same name.

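import json
class CountryDecoder(json.JSONDecoder):
    def __init__(self, object_hook=None,
                 *args, **kwargs):
        # Always route decoding through this class's own object_hook method.
        super().__init__(object_hook=self.object_hook, *args, **kwargs)
    def object_hook(self, o):
        decoded_country = Country(
            o.get('name'),
            o.get('population'),
            o.get('languages'),
        )
        return decoded_country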

Note that we use the .get() method to read the dictionary keys. This ensures that no error is raised when a key is missing from the dictionary.

Finally, we can call json.load() with the cls parameter set to the CountryDecoder class.

with open('canada.json','r') as f:
    country_object = json.load(f, cls=CountryDecoder)
print(type(country_object))
# OUTPUT: <class '__main__.Country'>

Done! We have now created a custom object directly from JSON.
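The encoder and the decoder can also be tried together without touching the file system. The sketch below repeats the Country, CountryEncoder, and CountryDecoder classes so that it runs on its own, and uses json.dumps() and json.loads() in place of the file-based calls. The Iceland input at the end is made up to show what .get() does with missing keys.

```python
import json

class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages

class CountryEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, Country):
            return {
                "name": o.name,
                "population": o.population,
                "languages": o.languages,
            }
        return super().default(o)

class CountryDecoder(json.JSONDecoder):
    def __init__(self, object_hook=None, *args, **kwargs):
        super().__init__(object_hook=self.object_hook, *args, **kwargs)

    def object_hook(self, o):
        # .get() returns None for a missing key instead of raising KeyError.
        return Country(o.get("name"), o.get("population"), o.get("languages"))

canada = Country("Canada", 37742154, ["English", "French"])

# Encode to a JSON string, then decode straight back into a Country object.
text = json.dumps(canada, cls=CountryEncoder)
restored = json.loads(text, cls=CountryDecoder)
print(type(restored).__name__)
print(restored.name, restored.population, restored.languages)

# A JSON object with missing keys still decodes; the gaps become None.
partial = json.loads('{"name": "Iceland"}', cls=CountryDecoder)
print(partial.name, partial.population)
```

One caveat of this design is that object_hook runs for every JSON object in the input, including nested ones, so a decoder like this suits documents whose objects all describe countries.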

Loading vs. dumping

Python's json module has four main functions: dump(), dumps(), load(), and loads(). These functions are often confused. The most important thing to remember is that the letter "s" stands for "string". In addition, the "s" in loads() and dumps() is pronounced separately: loads() is read as "load-s" and dumps() as "dump-s".
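The four functions can be compared side by side in a short sketch. To keep it self-contained, io.StringIO stands in for a real file opened with open(); that substitution and the sample data are assumptions made for this illustration.

```python
import io
import json

data = {"name": "Canada", "languages": ["English", "French"]}

# dumps() returns a string; loads() parses a string.
text = json.dumps(data)
from_string = json.loads(text)

# dump() writes to a file object; load() reads from a file object.
# io.StringIO is an in-memory stand-in for a file on disk.
buffer = io.StringIO()
json.dump(data, buffer)
buffer.seek(0)  # rewind so load() starts reading from the beginning
from_file = json.load(buffer)

print(from_string == data)        # True
print(from_file == data)          # True
print(buffer.getvalue() == text)  # True
```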

Here is a quick table to help you remember these functions:

         File      String
Read     load()    loads()
Write    dump()    dumps()

In this tutorial, we learned how to read and write JSON data with Python. Knowing how to handle JSON data is essential, especially when working with websites. JSON is used to transmit and store data in many places, including APIs, web crawlers, and modern databases (such as PostgreSQL).

If you are working on a web scraping project that involves dynamic websites, understanding JSON is crucial. You can read our article on a practical application of JSON in infinite-scroll pages.

