Sunteți pe pagina 1din 7

21/10/2014

Why JSON in PostgreSQL is Awesome

HOME

SUBSCRIBE

Why JSON in
PostgreSQL is
Awesome
10 NOVEMBER 2013 on PostgreSQL, JSON

Recently a commenter on Hacker News questioned the


usefulness of the JSON data type in PostgreSQL. At the
time I posted a brief response, but I have decided to take
the time to write up a more detailed explanation here as
I believe the JSON data type provides a significant
benefit to developers.
Too long; didnt read: The JSON data type is useful for
storing multi-level object graphs. It provides better
performance and the code itself is easier (and therefore
cheaper) to write and maintain. The developer will be
happier and the development will be cheaper.

Multi-Level Object Graph: Names


To demonstrate the difficulties of handling multi-level
object graphs in relational databases, consider names as an
example.
Handling names in computer systems correctly is a
https://functionwhatwhat.com/json-in-postgresql/

1/7

21/10/2014

Why JSON in PostgreSQL is Awesome

difficult task. Programmers tend to believe falsehoods


about names. They often assume that everybody has a first
name and a last name (and in that order), but, realistically,
one cannot make such brosumptions about peoples
names. Different cultures have different naming
conventions. Over the years I have come to the conclusion
that the best way to handle names while preserving
practicality is to:
1. store the full name,
2. keep an ordered list of name typename pairs.
For example:

{
"fullName": "Charles John Huffam Dickens",
"names":
[
{ "type": "firstName", "value": "Charles" },
{ "type": "middleName", "value": "John" },
{ "type": "middleName", "value": "Huffam" },
{ "type": "lastName", "value": "Dickens" }
]
}

Book Store
Imagine an application that manages books, such as that
used by a library or a book store. Books have authors,
authors have names and these names have name parts
(e.g. first name, last name, etc.). The application must
https://functionwhatwhat.com/json-in-postgresql/

2/7

21/10/2014

Why JSON in PostgreSQL is Awesome

keep track all of this data. If one follows the requirements


of the third normal form, this will result in database tables
such as the ones below.
Note: for the sake of simplicity, well ignore the fact that the
example below wouldnt meet the requirements of the third
normal form if one person could author multiple titles.
Book Table
id

title

A Christmas Carol
Author Table

id

book_id

full_name

101

Charles John Huffam Dickens


Author Name Part Table

id

author_id

type

value

201

101

first_name

Charles

202

101

middle_name

John

203

101

middle_name

Huffam

204

101

last_name

Dickens

To fetch a book from the database there are 3 options. An


ORM will probably attempt option 2.
1. Everything is joined in a single query. This will yield a
row for every name part within the name of each
author. The columns derived from the book table will
https://functionwhatwhat.com/json-in-postgresql/

3/7

21/10/2014

Why JSON in PostgreSQL is Awesome

have the same data repeated in every row. The book


data structure must be assembled from multiple rows.
Things will get very convoluted if one attempts to
retrieve multiple books at the same time as every row
will need to be carefully matched to the correct author
and book.
2. A separate query is run for each table. First the book is
returned, then the authors, then the name parts. To
retrieve multiple books, either
the author and name part queries must be repeated
for each book (and the name part query for each
author in each book) OR
the author and name part queries must be
constructed in a manner that retrieves all of the
authors of all the books (and all the name parts of
all the authors of the books) in a single result at
which point the matching exercise described in the
previous section must be executed.
The first variant results in a separate query for each
piece of data, which will have a significantly
detrimental impact on performance. While the second
variant will be much faster, it is still not a very elegant
solution as it requires the database to repeat the
(possibly expensive) filter that was used to select the
books 3 times: once for the books, once for the authors
and once for the name parts themselves.
3. There is no third option! Everybody loses. Always. This
is why objectrelational mapping is hard.
https://functionwhatwhat.com/json-in-postgresql/

4/7

21/10/2014

Why JSON in PostgreSQL is Awesome

Serialisation to JSON
Fortunately, it is possible to serialise the entirety of this
madness into a single JSON string (or XML or something
similar but this post is about JSON). The serialised object
can then be stored in a text column. When reading from
the database, the text can be deserialised back to an object
graph. While this was possible with earlier versions of the
software, PostgreSQL 9.2 and 9.3 have introduced a
number of features that make working with JSON data a
very pleasant experience:
The json data type is basically the same as text in
terms of behaviour and storage, but the database
checks that the value is valid JSON. This was introduced
in PostgreSQL 9.2.
With JSON functions and operators introduced in
PostgreSQL 9.3 one can select authors->1->'fullName'
from book to get the full name of the first author of every

book. It is even possible to create indexes that traverse


JSON values with indexes on expressions.
Without JSON columns, functions and operators, such
queries become complex and possibly inefficient. With
JSON, only a single table is required:
Book Table
i title
d
1

A Christmas
Carol

authors

[
{
"fullName": "Charles John Huffam

https://functionwhatwhat.com/json-in-postgresql/

5/7

21/10/2014

Why JSON in PostgreSQL is Awesome

Dickens",
"names":
[
{ "type":
"Charles" },
{ "type":
"John" },
{ "type":
"Huffam" },
{ "type":
"Dickens" }
]
}
]

"firstName", "value":
"middleName", "value":
"middleName", "value":
"lastName", "value":

Less code = faster queries. Its hard not to be excited about


this.

The Downsides
No loops in the object graph. Circular references are not
permitted in JSON. An author cannot belong to two books
and authors cannot reference each other. The first problem
can be tackled by repeating an authors data for each book
he/she has authored. This means, however, that the
database will be larger and updating (renaming) authors a
very expensive operation. It is up to the programmer to
decide if this is compatible with the requirements of the
application.
No referential integrity. While it is possible to generate
pseudo unique keys by creating a unique index on a JSON
expression, foreign keys cannot be created on (or
reference) JSON expressions. For a field to reference
another column or be referenced by another column, it
https://functionwhatwhat.com/json-in-postgresql/

6/7

21/10/2014

Why JSON in PostgreSQL is Awesome

needs to be promoted into an actual column or perhaps


even a table.
Not portable. This is a non-standard extension to SQL
specific to PostgreSQL only. It cannot be used with other
databases even if they have some JSON support as the
actual implementation is probably very different. This
means that the application will be locked in to PostgreSQL.
In my view, there are worse things than being locked in to
an open source technology. However again, it is up to the
judgement of the stakeholders to determine whether this
meets the needs of the business.
Follow me on Twitter: @tamasczinege

Tamas Czinege

Share this post

Founder of 44/55 Software Limited, London.

London, UK
https://twitter.com/tamasczinege

Function What What 2014

https://functionwhatwhat.com/json-in-postgresql/

Proudly published with Ghost

7/7

S-ar putea să vă placă și