Auto-prune documents on search #413

@ellnix

Currently, the Meilisearch index and the ORM commonly get out of sync when records are deleted while Meilisearch is disabled or offline for whatever reason. Since Model#reindex! does not delete stale Meilisearch documents, a user has to be diligent enough to manually delete the index and reindex from scratch.

Because a hit whose record no longer exists yields nothing, this makes it really hard to guarantee that every page of search results has the same number of records (or that it even has any real records at all).

If we decided to solve this by looking ahead to the next page to get the correct number of documents, we would end up repeating the same documents on multiple pages.

I propose a system like this:

  • When we search for page n,
  • we keep track of a prune_list: the IDs of documents for which there is no record.
  • If we find fewer records than hits_per_page specifies,
  • we run another search on page n + 1 and repeat the process until we have enough records.
  • We delete the documents in prune_list.
    This ensures that the results we return comprise all of page n, so that when a request is made for page n + 1, its first document follows right after the results of page n.
  • We return the correct number of results.
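
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the gem's implementation: `FakeIndex` is a hypothetical in-memory stand-in for a Meilisearch index, `records_by_id` is a plain Hash standing in for the ORM lookup, and `search_with_prune` is a made-up name. It also assumes that the pages before page n have already been served (and therefore pruned).

```ruby
# Hypothetical in-memory stand-in for a Meilisearch index (illustration only).
class FakeIndex
  def initialize(ids)
    @ids = ids.dup
  end

  # Returns the document IDs on the given 1-based page, or [] past the end.
  def search_page(page, hits_per_page)
    @ids.each_slice(hits_per_page).to_a[page - 1] || []
  end

  def delete_documents(ids)
    @ids -= ids
  end

  def ids
    @ids.dup
  end
end

# Collects up to hits_per_page records starting at `page`, looking ahead to
# later pages when stale documents leave the page short, then prunes the
# stale documents so the next page starts right after the returned results.
def search_with_prune(index, records_by_id, page:, hits_per_page:)
  found = []
  prune_list = []
  current = page

  while found.size < hits_per_page
    hit_ids = index.search_page(current, hits_per_page)
    break if hit_ids.empty? # no more documents in the index

    hit_ids.each do |id|
      record = records_by_id[id]
      if record.nil?
        prune_list << id # document with no matching record
      elsif found.size < hits_per_page
        found << record
      end
    end
    current += 1
  end

  index.delete_documents(prune_list) unless prune_list.empty?
  found
end

index = FakeIndex.new([1, 2, 3, 4, 5, 6, 7, 8])
records = { 1 => "a", 3 => "c", 4 => "d", 6 => "f", 7 => "g", 8 => "h" } # 2 and 5 were deleted

p search_with_prune(index, records, page: 1, hits_per_page: 3) # => ["a", "c", "d"]
p index.ids                                                    # => [1, 3, 4, 6, 7, 8]
p search_with_prune(index, records, page: 2, hits_per_page: 3) # => ["f", "g", "h"]
```

Because the stale documents are deleted before returning, page 2 begins at the document right after the last result of page 1, so nothing is repeated or skipped in this scenario.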

Here's a rough diagram that doesn't follow any particular diagramming language:
[image: rough diagram of the proposed search-and-prune flow]

Edit: Of course, with this I would also propose that we add an auto_prune option to the meilisearch function, so that users can disable this behavior.

Labels: enhancement
