Gracefully dumping non-standard types to JSON in Python

Converting between Python data types and JSON data types is something you do very often in Python.

But there is an obvious problem: JSON, as a data exchange format, has a fixed set of data types, whereas Python, as a programming language, lets you write custom data types in addition to the built-in ones.

For example, you have probably run into a problem like this:

>>> import json
>>> import decimal
>>>
>>> data = {'key1': 'string', 'key2': 10, 'key3': decimal.Decimal('1.45')}
>>> json.dumps(data)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    json.dumps(data)
  File "/usr/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'Decimal' is not JSON serializable

So the question is how to turn all kinds of Python data types into JSON data types. One obvious approach is to first convert every value into something that maps directly onto a JSON type, and then dump it. This is direct and brute-force, but it breaks down quickly when you face a wide variety of custom data types.
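The "convert first, then dump" approach described above can be sketched roughly as follows. This is only an illustration: the helper name to_jsonable is hypothetical (it is not part of the json library), and it handles nothing beyond Decimal, dict, list, and tuple.

```python
import decimal
import json


def to_jsonable(value):
    """Hypothetical helper: recursively replace Decimal values with floats."""
    if isinstance(value, decimal.Decimal):
        return float(value)
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value  # assumed to be JSON-compatible already


data = {'key1': 'string', 'key2': 10, 'key3': decimal.Decimal('1.45')}
converted = json.dumps(to_jsonable(data))
print(converted)  # {"key1": "string", "key2": 10, "key3": 1.45}
```

Every additional type needs another branch in the helper, and the whole structure is walked once before json.dumps walks it again, which is why this approach scales poorly.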

Searching is one of the important ways to solve problems. After a quick search, you will find that the data can actually be converted during the encode stage of dumps.

So you have probably done something like the following, and it solved the problem perfectly.

>>> class DecimalEncoder(json.JSONEncoder):
...     def default(self, obj):
...         if isinstance(obj, decimal.Decimal):
...             return float(obj)
...         return super(DecimalEncoder, self).default(obj)
...
>>>
>>> json.dumps(data, cls=DecimalEncoder)
'{"key1": "string", "key2": 10, "key3": 1.45}'

The JSON encode process

The code in this section is excerpted from github.com/python/cpyt… with almost all docstrings removed. Because the code is long, only the important fragments are shown. The link at the top of each fragment points to the complete code.

Anyone familiar with the json library knows that it has only four commonly used APIs: dump, dumps, load, and loads.

The source code is located in cpython/Lib/json.

# https://github.com/python/cpython/blob/master/Lib/json/__init__.py#L183-L238
def dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True,
        allow_nan=True, cls=None, indent=None, separators=None,
        default=None, sort_keys=False, **kw):
    # cached encoder
    if (not skipkeys and ensure_ascii and
        check_circular and allow_nan and
        cls is None and indent is None and separators is None and
        default is None and not sort_keys and not kw):
        return _default_encoder.encode(obj)
    if cls is None:
        cls = JSONEncoder
    # the key part
    return cls(
        skipkeys=skipkeys, ensure_ascii=ensure_ascii,
        check_circular=check_circular, allow_nan=allow_nan, indent=indent,
        separators=separators, default=default, sort_keys=sort_keys,
        **kw).encode(obj)

Look directly at the final return. If no cls is passed, JSONEncoder is used by default, and then the encode instance method of that class is called.
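Because dumps forwards its default argument to the encoder it constructs, a plain function can stand in for a subclass when the conversion logic is small. A minimal sketch, in which the function name decimal_default is an illustrative assumption:

```python
import decimal
import json


def decimal_default(obj):
    """Illustrative fallback: called only for objects json cannot encode itself."""
    if isinstance(obj, decimal.Decimal):
        return float(obj)
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')


data = {'key1': 'string', 'key2': 10, 'key3': decimal.Decimal('1.45')}

# default is not None here, so dumps skips the cached encoder
# and builds a new JSONEncoder that uses decimal_default.
result = json.dumps(data, default=decimal_default)
print(result)  # {"key1": "string", "key2": 10, "key3": 1.45}
```

A subclass passed through cls remains the better fit when the same conversions are reused in many places.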

The encode method is also very simple:

# https://github.com/python/cpython/blob/191e993365ac3206f46132dcf46236471ec54bfa/Lib/json/encoder.py#L182-L202
def encode(self, o):
    # a str is encoded directly and returned
    if isinstance(o, str):
        if self.ensure_ascii:
            return encode_basestring_ascii(o)
        else:
            return encode_basestring(o)
    # chunks are the encoded pieces of the data
    chunks = self.iterencode(o, _one_shot=True)
    if not isinstance(chunks, (list, tuple)):
        chunks = list(chunks)
    return ''.join(chunks)

As we can see, the final JSON is simply the chunks joined together, and chunks is produced by calling the self.iterencode method.
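The relationship between chunks and the final string can be observed directly, since iterencode is also a public method of JSONEncoder. The sample data below is made up for illustration, and the exact way the output is split into chunks is an implementation detail that should not be relied on.

```python
import json

encoder = json.JSONEncoder()
data = {'name': 'json', 'values': [1, 2.5, None, True]}

# iterencode yields the encoded pieces one at a time.
chunks = list(encoder.iterencode(data))

# Joining the chunks gives the same string that encode returns.
joined = ''.join(chunks)
print(chunks)
print(joined == encoder.encode(data))
```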

# https://github.com/python/cpython/blob/191e993365ac3206f46132dcf46236471ec54bfa/Lib/json/encoder.py#L204-257
# (last lines of JSONEncoder.iterencode; the earlier part is omitted)
    if (_one_shot and c_make_encoder is not None
            and self.indent is None):
        _iterencode = c_make_encoder(
            markers, self.default, _encoder, self.indent,
            self.key_separator, self.item_separator, self.sort_keys,
            self.skipkeys, self.allow_nan)
    else:
        _iterencode = _make_iterencode(
            markers, self.default, _encoder, self.indent, floatstr,
            self.key_separator, self.item_separator, self.sort_keys,
            self.skipkeys, _one_shot)
    return _iterencode(o, 0)

The iterencode method is fairly long; we only care about the last few lines.

The _iterencode function it calls is the return value of one of two higher-order functions, c_make_encoder or _make_iterencode.

c_make_encoder comes from the _json module, which is a C module. We do not care how that module is implemented, and instead study the equivalent _make_iterencode function.

# https://github.com/python/cpython/blob/191e993365ac3206f46132dcf46236471ec54bfa/Lib/json/encoder.py#L259-441
# (excerpt from the body of _make_iterencode; other nested helpers are omitted)
    def _iterencode(o, _current_indent_level):
        if isinstance(o, str):
            yield _encoder(o)
        elif o is None:
            yield 'null'
        elif o is True:
            yield 'true'
        elif o is False:
            yield 'false'
        elif isinstance(o, int):
            # see comment for int/float in _make_iterencode
            yield _intstr(o)
        elif isinstance(o, float):
            # see comment for int/float in _make_iterencode
            yield _floatstr(o)
        elif isinstance(o, (list, tuple)):
            yield from _iterencode_list(o, _current_indent_level)
        elif isinstance(o, dict):
            yield from _iterencode_dict(o, _current_indent_level)
        else:
            if markers is not None:
                markerid = id(o)
                if markerid in markers:
                    raise ValueError("Circular reference detected")
                markers[markerid] = o
            o = _default(o)
            yield from _iterencode(o, _current_indent_level)
            if markers is not None:
                del markers[markerid]
    return _iterencode

The only thing to care about is the returned function. The if-elif-else chain in the code converts the built-in types to JSON types one by one. When a type is not recognized, _default() is called, and its result is then passed back through _iterencode recursively.
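The recursive step means that default() does not have to return a fully JSON-ready value: whatever it returns is encoded again, and default() is called once more for any unsupported object nested inside. A small sketch, where the Price class and the RecursiveEncoder name are hypothetical and exist only for this illustration:

```python
import decimal
import json


class Price:
    """Hypothetical custom type used only for this illustration."""

    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency


class RecursiveEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Price):
            # The returned dict still contains a Decimal; the encoder
            # processes it again and calls default() a second time.
            return {'amount': obj.amount, 'currency': obj.currency}
        if isinstance(obj, decimal.Decimal):
            return float(obj)
        return super().default(obj)


encoded = json.dumps({'price': Price(decimal.Decimal('9.99'), 'USD')},
                     cls=RecursiveEncoder)
print(encoded)  # {"price": {"amount": 9.99, "currency": "USD"}}
```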

_default is the default method that was overridden earlier.

At this point you can fully understand how Python encodes data into JSON.

To summarize the process: json.dumps() calls the encode() instance method of JSONEncoder, which uses iterencode() to convert the various types recursively, and finally joins the chunks into a string and returns it.

An elegant solution

After the analysis above, we know why subclassing JSONEncoder and overriding the default method is enough to handle custom types.

Later you may also need to handle datetime values, and you will probably do it like this:

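import datetime
import decimal
import json

DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'  # assumed value; not defined in the original

class ExtendJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, decimal.Decimal):
            return float(obj)  # float rather than int, so the fractional part is kept
        if isinstance(obj, datetime.datetime):
            return obj.strftime(DATETIME_FORMAT)
        return super(ExtendJSONEncoder, self).default(obj)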

The final call to the parent class's default() method is there purely to raise the exception for unsupported types.
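That behaviour is easy to confirm with an encoder that adds no handling of its own. In this sketch the class name PassThroughEncoder is made up for illustration, and the exact wording of the error message may differ between Python versions.

```python
import json


class PassThroughEncoder(json.JSONEncoder):
    def default(self, obj):
        # No custom handling: delegate straight to the base class,
        # which raises TypeError for any object it receives.
        return super().default(obj)


message = None
try:
    json.dumps({'tags': {1, 2}}, cls=PassThroughEncoder)
except TypeError as exc:
    message = str(exc)

print(message)
```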

Python can use functools.singledispatch to solve this kind of single-dispatch generic function problem.

import json
from datetime import datetime
from decimal import Decimal
from functools import singledispatch

class MyClass:
    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

# Create three instances of non-built-in types
mc = MyClass('i am class MyClass ')
dm = Decimal('11.11')
dt = datetime.now()

@singledispatch
def convert(o):
    raise TypeError('can not convert type')

@convert.register(datetime)
def _(o):
    return o.strftime('%b %d %Y %H:%M:%S')

@convert.register(Decimal)
def _(o):
    return float(o)

@convert.register(MyClass)
def _(o):
    return o.get_value()

class ExtendJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        try:
            return convert(obj)
        except TypeError:
            return super(ExtendJSONEncoder, self).default(obj)

data = {
    'mc': mc,
    'dm': dm,
    'dt': dt
}
json.dumps(data, cls=ExtendJSONEncoder)
# {"mc": "i am class MyClass ", "dm": 11.11, "dt": "Nov 10 2017 17:31:25"}

This style follows design-pattern principles more closely. When a new type comes along later, there is no need to modify the ExtendJSONEncoder class; just register the appropriate singledispatch function, which is more pythonic.
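As a rough sketch of that extension step, the snippet below repeats a trimmed version of the setup and then adds support for uuid.UUID afterwards. The choice of uuid.UUID and its string representation are assumptions made for the example, not something the original code covers.

```python
import json
import uuid
from decimal import Decimal
from functools import singledispatch


@singledispatch
def convert(o):
    raise TypeError('can not convert type')


@convert.register(Decimal)
def _(o):
    return float(o)


class ExtendJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        try:
            return convert(obj)
        except TypeError:
            return super().default(obj)


# Later: support a new type by registering one more function.
# ExtendJSONEncoder itself is not modified.
@convert.register(uuid.UUID)
def _(o):
    return str(o)


payload = {'id': uuid.UUID(int=1), 'total': Decimal('2.5')}
encoded = json.dumps(payload, cls=ExtendJSONEncoder)
print(encoded)  # {"id": "00000000-0000-0000-0000-000000000001", "total": 2.5}
```

Since convert already raises TypeError for unregistered types, it can also be passed directly as json.dumps(payload, default=convert) when a dedicated encoder class is not needed.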