A fast BSON to MongoDB Extended JSON converter for Python - This Repository is NOT a supported MongoDB product

python-bsonjs

Info: See GitHub for the latest source.
Author: Shane Harvey <[email protected]>

About

A fast BSON to MongoDB Extended JSON converter for Python that uses libbson.

Installation

python-bsonjs can be installed with pip:

$ python -m pip install python-bsonjs

Examples

>>> import bsonjs
>>> bson_bytes = bsonjs.loads('{"hello": "world"}')
>>> bson_bytes
b'\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00'
>>> bsonjs.dumps(bson_bytes)
'{ "hello" : "world" }'
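To make the bytes in this example easier to read, the following is a minimal sketch that builds the same document by hand with the standard library, following the BSON layout (int32 total size, element type, NUL-terminated key, length-prefixed string, terminator). The helper name encode_string_document is hypothetical and handles only a single string field; it illustrates the wire format and is not how python-bsonjs or libbson is implemented.

```python
import struct


def encode_string_document(key: str, value: str) -> bytes:
    """Encode the one-field document {key: value} as BSON (string values only)."""
    key_bytes = key.encode("utf-8") + b"\x00"      # element name as a C string
    value_bytes = value.encode("utf-8") + b"\x00"  # string payload with trailing NUL
    element = (
        b"\x02"                                    # element type 0x02: UTF-8 string
        + key_bytes
        + struct.pack("<i", len(value_bytes))      # int32 string length, NUL included
        + value_bytes
    )
    body = element + b"\x00"                       # document terminator
    # The leading little-endian int32 is the total size, counting its own 4 bytes.
    return struct.pack("<i", 4 + len(body)) + body


hello = encode_string_document("hello", "world")
print(hello)
```

The leading \x16 is the total document length (22 bytes), and \x06 is the length of "world" plus its trailing NUL.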

Using bsonjs with pymongo to insert a RawBSONDocument.

>>> import bsonjs
>>> from pymongo import MongoClient
>>> from bson.raw_bson import RawBSONDocument
>>> client = MongoClient("localhost", 27017, document_class=RawBSONDocument)
>>> db = client.test
>>> bson_bytes = bsonjs.loads('{"_id": 1, "x": 2}')
>>> bson_bytes
b'\x15\x00\x00\x00\x10_id\x00\x01\x00\x00\x00\x10x\x00\x02\x00\x00\x00\x00'
>>> result = db.test.insert_one(RawBSONDocument(bson_bytes))
>>> result.inserted_id  # NOTE: inserted_id is None
>>> result.acknowledged
True
>>> raw_doc = db.test.find_one({'x': 2})
>>> raw_doc.raw == bson_bytes
True
>>> bsonjs.dumps(raw_doc.raw)
'{ "_id" : 1, "x" : 2 }'

Speed

bsonjs is roughly 10-15x faster than PyMongo's json_util at decoding BSON to JSON and encoding JSON to BSON. See benchmark.py:

$ python benchmark.py
Timing: bsonjs.dumps(b)
10000 loops, best of 3: 0.110911846161
Timing: json_util.dumps(bson.BSON(b).decode())
10000 loops, best of 3: 1.46571397781
bsonjs is 13.22x faster than json_util

Timing: bsonjs.loads(j)
10000 loops, best of 3: 0.0628039836884
Timing: bson.BSON().encode(json_util.loads(j))
10000 loops, best of 3: 0.683200120926
bsonjs is 11.72x faster than json_util
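For readers who want to time a comparison of their own, here is a minimal sketch of a "best of N" measurement built on the standard timeit module. The function name speedup and its parameters are assumptions made for illustration; benchmark.py may be structured differently. The demonstration at the bottom uses standard-library stand-ins so that it runs without bsonjs or PyMongo installed.

```python
import timeit


def speedup(fast_stmt, slow_stmt, setup="pass", number=10000, repeat=3):
    """Return how many times faster fast_stmt is than slow_stmt.

    Each statement is run `number` times per trial, `repeat` trials are
    taken, and the best (minimum) trial is used for each statement.
    """
    fast = min(timeit.repeat(fast_stmt, setup=setup, number=number, repeat=repeat))
    slow = min(timeit.repeat(slow_stmt, setup=setup, number=number, repeat=repeat))
    return slow / fast


# Stand-in statements; with bsonjs and PyMongo installed one could instead
# pass "bsonjs.dumps(b)" and "json_util.dumps(bson.BSON(b).decode())" along
# with a setup string that imports the modules and defines b.
ratio = speedup(
    "sorted(data)",
    "sorted(data, key=str)",
    setup="data = list(range(200))",
    number=1000,
)
print("%.2fx" % ratio)
```

Taking the minimum over several trials reduces the influence of background activity on the machine, which is why timings are usually reported as "best of 3".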

Limitations

Top Level Arrays

Because libbson does not distinguish between top-level arrays and top-level documents, neither does python-bsonjs. This means that if you give dumps or dump a top-level array, it will give you back a dictionary. Below are two examples of this behavior, in which each list comes back as a dictionary keyed by the string form of the list index:

>>> import bson
>>> from bson import json_util
>>> import bsonjs
>>> bson.decode(bsonjs.loads(json_util.dumps(["a", "b", "c"])))
{'0': 'a', '1': 'b', '2': 'c'}
>>> bson.decode(bsonjs.loads(json_util.dumps([])))
{}
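If the data has already been converted, a dictionary of this shape can be turned back into a list in plain Python. The helper below is a hypothetical sketch (index_dict_to_list is not part of bsonjs or PyMongo), and it assumes the caller knows the value was originally an array, since an empty result cannot be told apart from an empty document.

```python
def index_dict_to_list(doc):
    """Rebuild a list from a dict keyed "0", "1", ..., "n-1".

    Raises ValueError if the keys are not exactly the consecutive
    string indices expected for an array of len(doc) items.
    """
    items = []
    for index in range(len(doc)):
        key = str(index)
        if key not in doc:
            raise ValueError("missing array index %r" % key)
        items.append(doc[key])
    return items


print(index_dict_to_list({"0": "a", "1": "b", "2": "c"}))
```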

One potential solution to this problem is to wrap your list in a dictionary, like so:

>>> items = ["a", "b", "c"]
>>> doc = {"data": items}
>>> wrapped = bson.decode(bsonjs.loads(json_util.dumps(doc)))
>>> wrapped
{'data': ['a', 'b', 'c']}
>>> wrapped["data"]
['a', 'b', 'c']
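When the input is already a JSON string, the same wrapping can be done on the text itself, without parsing it first. The following is a minimal sketch; the helper name wrap_top_level_array and the default key "data" are assumptions chosen for illustration, and the result would then be passed to bsonjs.loads.

```python
import json


def wrap_top_level_array(json_text, key="data"):
    """Wrap JSON text in {key: ...} if it is a top-level array.

    Documents are returned unchanged. The key is escaped with json.dumps
    so that quotes or backslashes in it stay valid JSON.
    """
    if json_text.lstrip().startswith("["):
        return "{" + json.dumps(key) + ": " + json_text + "}"
    return json_text


wrapped_text = wrap_top_level_array('["a", "b", "c"]')
print(wrapped_text)
```

Checking only the first non-whitespace character keeps the helper cheap, at the cost of not validating that the text is well-formed JSON.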

Installing From Source

python-bsonjs supports CPython 3.8+.

Compiler

You must build python-bsonjs separately for each version of Python. On Windows this means you must use the same C compiler your Python version was built with.

  • Windows build requires Microsoft Visual Studio 2015

Source

You can download the source using git:

$ git clone https://github.com/mongodb-labs/python-bsonjs.git

Install

Once you have the source properly downloaded, build and install the package:

$ pip install -v .

Test

To run the test suite:

$ python -m pytest
