The Python standard library is full of underappreciated gems. One of them allows for simple and elegant function dispatching based on argument types. This makes it perfect for serialization of arbitrary objects – for example to JSON in web APIs and structured logs.
We’ve all seen it:
TypeError: datetime.datetime(...) is not JSON serializable
To avoid that error, the json module offers two ways to serialize arbitrary objects that it inherited from simplejson:
- Implement a default() function that takes an object and returns something that JSONEncoder understands.
- Implement or subclass a JSONEncoder yourself and pass it as cls to the dump methods. You can implement it on your own, or subclass JSONEncoder and override the JSONEncoder.default() method only.
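As a minimal sketch of those two hooks, using only the standard library: the names fallback and DateTimeEncoder are made up for illustration, and the plain isoformat() output is just one possible representation.

```python
import json
from datetime import datetime


def fallback(obj):
    """Called by json only for objects it cannot serialize itself."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


class DateTimeEncoder(json.JSONEncoder):
    """The same logic, packaged as an encoder subclass (hypothetical name)."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        # Let the base class raise the usual TypeError.
        return super().default(obj)


payload = {"ts": datetime(2016, 8, 20, 13, 8, 59)}

# Way 1: pass a function as default=
via_function = json.dumps(payload, default=fallback)
# Way 2: pass an encoder class as cls=
via_class = json.dumps(payload, cls=DateTimeEncoder)
```

Both calls should produce the same string; either way, all type knowledge ends up in one function or one method.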
And since alternative implementations want to be drop-in, they imitate the json module’s API to various degrees [1].
Expandability
What both approaches have in common is that they’re not expandable: there is no provision for adding support for new types. Your single default() fallback has to know about all custom types you want to serialize, which means you end up writing functions like:
import enum
from datetime import datetime

import attr


def to_serializable(val):
    if isinstance(val, datetime):
        return val.isoformat() + "Z"
    elif isinstance(val, enum.Enum):
        return val.value
    elif attr.has(val.__class__):
        return attr.asdict(val)
    elif isinstance(val, Exception):
        return {
            "error": val.__class__.__name__,
            "args": val.args,
        }
    return str(val)
This is painful, since you have to add serialization for all objects in one place [2].
Alternatively, you can try to come up with a general solution on your own, like Pyramid’s JSON renderer did with JSON.add_adapter, which uses the underappreciated zope.interface’s adapter registry.
Django, on the other hand, contents itself with a DjangoJSONEncoder that is a subclass of json.JSONEncoder and knows how to encode dates, times, timedeltas, Decimals, UUIDs, and promises. But beyond that, you’re on your own again. If you want to go further with Django and web APIs, you’re probably already using the Django REST framework anyway. They came up with a whole serialization system that does a lot more than just making data json.dumps()-ready.
Finally, for the sake of completeness, I have to mention my own solution in structlog that I fiercely hated from day one: adding a __structlog__ method to your classes that returns a serializable representation, in the tradition of __str__. It mingles concerns that should know nothing about each other. Please don’t repeat my mistake.
Given how prevalent JSON is, it’s surprising that we have only siloed solutions so far. What I would like to have is a way to register serializers in a central place, but in a decentralized fashion that doesn’t require any changes to my (or worse: third-party) classes.
Enter PEP 443: Single-dispatch generic functions
Turns out, Python 3.4 came with a nice solution to this problem in the form of PEP 443: functools.singledispatch.
It allows you to define a default function and then register additional versions of that function depending on the type of the first argument:
from datetime import datetime
from functools import singledispatch


@singledispatch
def to_serializable(val):
    """Used by default."""
    return str(val)


@to_serializable.register(datetime)
def ts_datetime(val):
    """Used if *val* is an instance of datetime."""
    return val.isoformat() + "Z"
Now you can call to_serializable() on datetime instances too and single dispatch will pick the correct function:
>>> json.dumps({"msg": "hi", "ts": datetime.now()},
...            default=to_serializable)
'{"ts": "2016-08-20T13:08:59.153864Z", "msg": "hi"}'
This gives you the power to put your serializers wherever you want: along with the classes, in a separate module, or along with JSON-related code. You choose! Your classes stay clean, and you don’t have a huge if-elif-else branch that you copy-paste between your projects.
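To illustrate how registrations accumulate, here is a small sketch that assumes a few extra serializers the post does not define (ts_date, ts_enum, and the Color enum are hypothetical). It also shows that single dispatch follows the class hierarchy: the most specific registered type wins, and unregistered subclasses fall back to their registered base class.

```python
import enum
import json
from datetime import date, datetime
from functools import singledispatch


@singledispatch
def to_serializable(val):
    """Fallback for types without a registered serializer."""
    return str(val)


@to_serializable.register(date)
def ts_date(val):
    return val.isoformat()


@to_serializable.register(datetime)
def ts_datetime(val):
    return val.isoformat() + "Z"


@to_serializable.register(enum.Enum)
def ts_enum(val):
    return val.value


class Color(enum.Enum):  # hypothetical example type
    RED = "red"


# datetime subclasses date; the most specific registration wins.
assert to_serializable.dispatch(datetime) is ts_datetime
# Color was never registered, but its base class enum.Enum was.
assert to_serializable.dispatch(Color) is ts_enum

result = json.dumps(
    {
        "day": date(2016, 8, 20),
        "ts": datetime(2016, 8, 20, 13, 8, 59),
        "color": Color.RED,
    },
    default=to_serializable,
    sort_keys=True,
)
```

Each register() call could just as well live in a different module; the only requirement is that the module gets imported before serialization happens.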
…and PEP 484 For Good Measure
Telling @singledispatch about types is good. But if you’re using type annotations, it gets repetitive:
@to_serializable.register(datetime)
def ts_datetime(val: datetime):
    """Used if *val* is an instance of datetime."""
    return val.isoformat() + "Z"
Fortunately, as of Python 3.7, you don’t have to repeat yourself, because
@to_serializable.register
def ts_datetime(val: datetime):
    """Used if *val* is an instance of datetime."""
    return val.isoformat() + "Z"
will work as expected! This is a great use for type annotations and I hope the Python community will come up with more over time.
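A self-contained sketch of the annotation-based form, assuming Python 3.7 or later. The serializers for set and UUID are illustrative choices that do not come from the post.

```python
import json
from functools import singledispatch
from uuid import UUID


@singledispatch
def to_serializable(val):
    """Fallback for types without a registered serializer."""
    return str(val)


@to_serializable.register
def ts_set(val: set):
    # The type is read from the annotation of the first argument.
    return sorted(val)


@to_serializable.register
def ts_uuid(val: UUID):
    return val.hex


result = json.dumps(
    {"ids": {3, 1, 2}, "uid": UUID(int=1)},
    default=to_serializable,
    sort_keys=True,
)
```

The registry attribute of the generic function maps each registered type to its implementation, which is handy for checking what the annotations were resolved to.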
Going Further
Obviously, the utility of @singledispatch goes far beyond JSON. Binding different behaviors to different types in general and object serialization in particular are universally useful [3]! Some of my proofreaders mentioned they tried an approximation using dicts of classes to callables and other similar atrocities.
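To make the difference concrete, here is a sketch outside of JSON. It assumes the hand-rolled approximation is an exact-type lookup in a dict, which is only a guess at what such code looks like; all names are hypothetical.

```python
from functools import singledispatch

# Hand-rolled approximation: exact-type lookup in a dict of classes to callables.
HANDLERS = {
    int: lambda val: "number",
    str: lambda val: "text",
}


def describe_with_dict(val):
    handler = HANDLERS.get(type(val))
    return handler(val) if handler else "unknown"


@singledispatch
def describe(val):
    return "unknown"


@describe.register(int)
def describe_int(val):
    return "number"


@describe.register(str)
def describe_str(val):
    return "text"


# bool is a subclass of int: the dict lookup misses it,
# while single dispatch walks the method resolution order.
dict_result = describe_with_dict(True)
dispatch_result = describe(True)
```

The dict version can be extended to walk the class hierarchy too, but at that point it reimplements what the standard library already provides.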
In other words, @singledispatch may be the function that you’ve been missing, although it was there all along.
[1] However, from the popular ones: UltraJSON doesn’t support custom object serialization at all and python-rapidjson and orjson only support the default() function.
[2] Although as you can see it’s manageable with attrs; maybe you should use attrs!
[3] I’ve been told the original incentive for adding single dispatch to the standard library was a more elegant reimplementation of pprint (that never happened).