Skip to content

Slow joins for ManyToMany fields #309

Open
@skonstant

Description

@skonstant

I am running into a tricky issue, the joins are slow for my ManyToMany fields.

observation.subscopes.all()

Generates this:

[{ '$lookup': { 'from': 'orm_observation_subscopes', 'let': { 'parent__field__0': '$_id' }, 'pipeline': [{ '$match': { '$expr': { '$and': [{ '$eq': ['$$parent__field__0', '$subscope_id'] }] } } }], 'as': 'orm_observation_subscopes' } }, { '$unwind': '$orm_observation_subscopes' }, { '$match': { '$expr': { '$eq': ['$orm_observation_subscopes.observation_id', ObjectId('6836e1278c5bb0a20fa1aed1')] } } }, { '$facet': { 'group': [{ '$group': { '__count': { '$sum': { '$cond': { 'if': { '$in': [{ '$type': { '$literal': True } }, ['missing', 'null'] ] }, 'then': None, 'else': 1 } } }, '_id': None } }] } }, { '$addFields': { '__count': { '$getField': { 'input': { '$arrayElemAt': ['$group', 0] }, 'field': '__count' } }, '_id': { '$getField': { 'input': { '$arrayElemAt': ['$group', 0] }, 'field': '_id' } } } }, { '$project': { '__count': { '$ifNull': ['$__count', { '$literal': 0 }] } } }]

And this takes 3 seconds because the "pivot table" is very large, even though the join is on a specific "observation id", a field that is indexed.

If I move the match inside the lookup, the same qery takes milliseconds.

[{ '$lookup': { 'from': 'orm_observation_subscopes', 'let': { 'parent__field__0': '$_id' }, 'pipeline': [{ '$match': { '$expr': { '$and': [{ '$eq': ['$$parent__field__0', '$subscope_id'] },{ '$eq': ['$observation_id', ObjectId('6836e1718c5bb0a20fab6467')] }] } } }], 'as': 'orm_observation_subscopes' } }, { '$unwind': '$orm_observation_subscopes' }, { '$limit': 21 }]

I took a quick look at the code that generates the lookup and at first sight I was not clear where to make the change, I'll dig a bit deeper but perhaps you already know where to look to implement the fix.

This is particularly penalizing, as we use a few foreign keys and many to manys on fairly large collections, and would like to keep it this way.

It looks like the fix in the query is simple, it is probably much more complex in the generator.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions