Make QuerySet.explain() return parsable JSON #340

Merged
merged 1 commit into from
Jul 21, 2025

Conversation

Jibola
Contributor

@Jibola Jibola commented Jul 16, 2025

Context

Calling explain() is extremely useful when debugging a MongoDB query. However, parsing its output is a nightmare, because we format each top-level value with pprint to accommodate Django's native explain functionality, which joins all the information line by line.

Solution

Rather than splitting key/value pairs across multiple lines, we should dump the JSON as a single string in the list. That way the output of .explain() can be passed directly to json.loads or json_util.loads.

  • Confirm the fix
  • Create a test_explain test case
  • Update the changelog

Changes in this PR

  • import PyMongo library json_util
  • call json_util.dumps(..., indent=4) and return that in a list of length 1.
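
As a rough illustration of the change, the sketch below contrasts the two formatting strategies. It uses the standard-library json module as a stand-in for PyMongo's json_util so that it runs without a database, and the function names and the sample document are hypothetical rather than the backend's actual code.

```python
import json
from pprint import pformat

# Hypothetical, trimmed-down explain document for illustration only.
sample_explain = {
    "explainVersion": "1",
    "queryPlanner": {"winningPlan": {"stage": "COLLSCAN", "isCached": False}},
}


def format_explain_old(explain):
    """Old style: one pprint-formatted 'key: value' entry per top-level key."""
    return [f"{key}: {pformat(value, indent=4)}" for key, value in explain.items()]


def format_explain_new(explain):
    """New style: a single JSON string in a list of length 1.

    The PR calls json_util.dumps here; json.dumps is only a stand-in.
    """
    return [json.dumps(explain, indent=4)]


# explain() joins the returned list on newlines.
old_output = "\n".join(format_explain_old(sample_explain))
new_output = "\n".join(format_explain_new(sample_explain))

# The old output is a mix of bare keys and Python reprs, so it is not JSON.
try:
    json.loads(old_output)
    old_parses = True
except json.JSONDecodeError:
    old_parses = False

# The new output round-trips back to the original dictionary.
parsed = json.loads(new_output)
```

Because the list has a single element, joining on newlines leaves the JSON string untouched, which is what makes the result parsable.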

Change

Before

>>> exp = Author.objects.filter().explain()
>>> json.loads(exp)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
...
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

After

(image: output after the change)

Collaborator

@aclark4life aclark4life left a comment

LGTM, less pprint too!

@timgraham
Collaborator

To me, it's not an improvement in readability.

Before:

>>> print(Question.objects.explain())
explainVersion: '1'
queryPlanner: {   'indexFilterSet': False,
    'maxIndexedAndSolutionsReached': False,
    'maxIndexedOrSolutionsReached': False,
    'maxScansToExplodeReached': False,
    'namespace': 'mysite.polls_question',
    'optimizationTimeMillis': 0,
    'optimizedPipeline': True,
    'parsedQuery': {},
    'planCacheKey': '7DF350EE',
    'planCacheShapeHash': '8F2383EE',
    'prunedSimilarIndexes': False,
    'queryHash': '8F2383EE',
    'rejectedPlans': [],
    'winningPlan': {   'direction': 'forward',
                       'isCached': False,
                       'stage': 'COLLSCAN'}}
executionStats: {   'allPlansExecution': [],
    'executionStages': {   'advanced': 0,
                           'direction': 'forward',
                           'docsExamined': 0,
                           'executionTimeMillisEstimate': 0,
                           'isCached': False,
                           'isEOF': 1,
                           'nReturned': 0,
                           'needTime': 0,
                           'needYield': 0,
                           'restoreState': 0,
                           'saveState': 0,
                           'stage': 'COLLSCAN',
                           'works': 1},
    'executionSuccess': True,
    'executionTimeMillis': 0,
    'nReturned': 0,
    'totalDocsExamined': 0,
    'totalKeysExamined': 0}
queryShapeHash: '7229101CA7C854EFFD9939CFFED9E674B0B07394314E0D9379C20096DE409F8A'
command: {   '$db': 'mysite',
    'aggregate': 'polls_question',
    'cursor': {},
    'pipeline': [{'$match': {'$expr': {}}}]}
serverInfo: {   'gitVersion': 'bed99f699da6cb2b74262aa6d473446c41476643',
    'host': 'barkley',
    'port': 27017,
    'version': '8.0.11'}
serverParameters: {   'internalDocumentSourceGroupMaxMemoryBytes': 104857600,
    'internalDocumentSourceSetWindowFieldsMaxMemoryBytes': 104857600,
    'internalLookupStageIntermediateDocumentMaxSizeBytes': 104857600,
    'internalQueryFacetBufferSizeBytes': 104857600,
    'internalQueryFacetMaxOutputDocSizeBytes': 104857600,
    'internalQueryFrameworkControl': 'trySbeRestricted',
    'internalQueryMaxAddToSetBytes': 104857600,
    'internalQueryMaxBlockingSortMemoryUsageBytes': 104857600,
    'internalQueryPlannerIgnoreIndexWithCollationForRegex': 1,
    'internalQueryProhibitBlockingMergeOnMongoS': 0}

After:

>>> print(Question.objects.explain())
{"explainVersion": "1", "queryPlanner": {"namespace": "mysite.polls_question", "parsedQuery": {}, "indexFilterSet": false, "queryHash": "8F2383EE", "planCacheShapeHash": "8F2383EE", "planCacheKey": "7DF350EE", "optimizationTimeMillis": 0, "optimizedPipeline": true, "maxIndexedOrSolutionsReached": false, "maxIndexedAndSolutionsReached": false, "maxScansToExplodeReached": false, "prunedSimilarIndexes": false, "winningPlan": {"isCached": false, "stage": "COLLSCAN", "direction": "forward"}, "rejectedPlans": []}, "executionStats": {"executionSuccess": true, "nReturned": 0, "executionTimeMillis": 1, "totalKeysExamined": 0, "totalDocsExamined": 0, "executionStages": {"isCached": false, "stage": "COLLSCAN", "nReturned": 0, "executionTimeMillisEstimate": 0, "works": 1, "advanced": 0, "needTime": 0, "needYield": 0, "saveState": 0, "restoreState": 0, "isEOF": 1, "direction": "forward", "docsExamined": 0}, "allPlansExecution": []}, "queryShapeHash": "7229101CA7C854EFFD9939CFFED9E674B0B07394314E0D9379C20096DE409F8A", "command": {"aggregate": "polls_question", "pipeline": [{"$match": {"$expr": {}}}], "cursor": {}, "$db": "mysite"}, "serverInfo": {"host": "barkley", "port": 27017, "version": "8.0.11", "gitVersion": "bed99f699da6cb2b74262aa6d473446c41476643"}, "serverParameters": {"internalQueryFacetBufferSizeBytes": 104857600, "internalQueryFacetMaxOutputDocSizeBytes": 104857600, "internalLookupStageIntermediateDocumentMaxSizeBytes": 104857600, "internalDocumentSourceGroupMaxMemoryBytes": 104857600, "internalQueryMaxBlockingSortMemoryUsageBytes": 104857600, "internalQueryProhibitBlockingMergeOnMongoS": 0, "internalQueryMaxAddToSetBytes": 104857600, "internalDocumentSourceSetWindowFieldsMaxMemoryBytes": 104857600, "internalQueryFrameworkControl": "trySbeRestricted", "internalQueryPlannerIgnoreIndexWithCollationForRegex": 1}, "ok": 1.0}

- result.append(f"{key}: {formatted_value}")
- return result
+ # explain() expects a list and joins on a newline. Concatenate no lines
+ return [json_util.dumps(explain)]
Collaborator

@WaVEV WaVEV Jul 17, 2025

I think you can use

Suggested change
- return [json_util.dumps(explain)]
+ return [json_util.dumps(explain, indent=4, ensure_ascii=False)]

Contributor Author

Since json_util.dumps() is a PyMongo-specific JSON serialization function, I don't think we'll need the ensure_ascii=False override.

Thoughts?

Collaborator

I'm not sure about ensure_ascii, but I'm wondering if you expect some difference in the output by using json_util.dumps() instead of json.dumps()?

We didn't have tests in this repo when I originally implemented this, but it would be useful to now have at least one test of the output in tests/queries_/test_explain.py (new file).

I'm fine with indent=4 but just to be clear, that introduces the same "nightmare to parse" issue of newlines. Basically, your original usage mistake was not using print(Model.objects.explain()), so the newlines weren't rendered nicely. I think it's fine to make this change anyway. Incidentally, is there a use case for calling json.loads() on the result of explain(), or was that just an attempt at making the output more readable?

Contributor Author

@Jibola Jibola Jul 17, 2025

> I'm not sure about ensure_ascii, but I'm wondering if you expect some difference in the output by using json_util.dumps() instead of json.dumps()?

json_util properly handles non-JSON-serializable BSON types (e.g. ObjectId()). We know we're getting a dictionary from a MongoDB query and we want it to be parseable. This allows everything to be viewed, and if it ever fails, that's a bug against PyMongo rather than this library.
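
A small sketch of the difference, written without PyMongo so it runs anywhere: FakeObjectId and extended_json_default are hypothetical stand-ins for bson.ObjectId and for the conversion json_util performs, and the {"$oid": ...} shape follows MongoDB Extended JSON.

```python
import json


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId so the sketch needs no PyMongo."""

    def __init__(self, hex_value):
        self.hex_value = hex_value


def extended_json_default(obj):
    """Encode the stand-in the way MongoDB Extended JSON writes an ObjectId."""
    if isinstance(obj, FakeObjectId):
        return {"$oid": obj.hex_value}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


doc = {"_id": FakeObjectId("507f1f77bcf86cd799439011"), "stage": "COLLSCAN"}

# Plain json.dumps does not know the custom type and raises TypeError.
try:
    json.dumps(doc)
    plain_dumps_ok = True
except TypeError:
    plain_dumps_ok = False

# With a conversion hook the document serializes and parses back cleanly.
encoded = json.dumps(doc, default=extended_json_default)
decoded = json.loads(encoded)
```

With PyMongo installed, json_util.dumps(doc) and json_util.loads(encoded) would play the roles of the last two calls, with no custom hook needed.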

> We didn't have tests in this repo when I originally implemented this, but it would be useful to now have at least one test of the output in tests/queries_/test_explain.py (new file).

Sure. I can add that.

> I'm fine with indent=4 but just to be clear, that introduces the same "nightmare to parse" issue of newlines. ...

It actually doesn't. pprint still shows the \n escapes, but in a much more readable way.
For me it's also a QOL issue. For exceptionally large queries, reading the entire rendered print output gets nauseating, so my standard workflow is to load it into a dictionary and then iterate through the query piece by piece. So in this new world all three paths become viable:

  • print continues to work as designed
  • loads can now give a manageable dictionary
  • pprint gives an arguably easier-to-parse JSON blob.
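
The three paths can be sketched roughly as follows. The sample document is hypothetical, json.dumps stands in for json_util.dumps, and applying pprint to the parsed dictionary is an assumption about the workflow rather than something the PR prescribes.

```python
import json
from pprint import pformat

# Hypothetical single-element list, shaped like the new explain() result.
explain_lines = [
    json.dumps({"queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}}}, indent=4)
]
output = "\n".join(explain_lines)

# 1. print(output) would show the indented JSON exactly as stored.

# 2. json.loads treats the newlines and indentation as insignificant
#    whitespace, so indent=4 does not get in the way of parsing.
plan = json.loads(output)
stage = plan["queryPlanner"]["winningPlan"]["stage"]

# 3. The parsed dictionary can still be pretty-printed Python-style.
pretty = pformat(plan)
```

The key point for the newline concern is step 2: whitespace between JSON tokens is ignored by the parser, so the indented string parses to the same dictionary as a compact one.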

Collaborator

The ensure_ascii option is for handling non-ASCII letters, such as the Spanish ñ or ó. If the flag is True, those letters come out as \uXXXX escapes instead of readable characters.
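
A quick check with the standard-library json module shows the effect. The sample value is made up, and the sketch assumes json_util.dumps forwards extra keyword arguments such as ensure_ascii to json.dumps.

```python
import json

doc = {"name": "Núñez"}  # hypothetical value containing non-ASCII letters

# ensure_ascii defaults to True: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(doc)

# ensure_ascii=False keeps the characters readable in the output string.
readable = json.dumps(doc, ensure_ascii=False)

# Both forms are valid JSON and parse back to the same dictionary.
same_after_parse = json.loads(escaped) == json.loads(readable) == doc
```

So the flag only affects how the string reads to a human; either form stays parsable.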

Collaborator

@WaVEV WaVEV left a comment

json dump can be prettified

@Jibola
Contributor Author

Jibola commented Jul 17, 2025

> To me, it's not an improvement in readability.

Ah, to @WaVEV's point, that should be mitigated with indent=4. I'll update the before & after to reflect that.

@timgraham timgraham changed the title Make explain() yield a mongodb-compatible dumped json Make QuerySet.explain() return JSON Jul 18, 2025
@Jibola Jibola requested a review from WaVEV July 18, 2025 16:53
Collaborator

@WaVEV WaVEV left a comment

LGTM. The only thing: what if we have a character like ñ?

@timgraham timgraham changed the title Make QuerySet.explain() return JSON Make QuerySet.explain() return parsable JSON Jul 19, 2025
@timgraham timgraham force-pushed the simplify-explain branch 2 times, most recently from 6706112 to abc335f Compare July 19, 2025 22:59
@Jibola Jibola merged commit 6b5d00c into main Jul 21, 2025
17 checks passed