Description
For some subgraphs, reverting blocks is still slow. The best way to speed this up might be to restrict the queries we run to revert the block by the vid of the entity versions that are actually affected by the revert. To facilitate that, graph-node should keep a list of the vids of entities by block in memory as it moves forward and processes blocks. That list can then be used to speed up reverts.
Even if we only keep this data for a small number of blocks (say 5), it should already help speed up reverts. It's ok if we do not have that data for a revert (e.g., after a cold start); we can just fall back to the current behavior. The amount of data to keep should be relatively small, as mappings typically only alter a small number of entities for each block, but we might want to cap it by only keeping the data for a block if there are fewer than N vids to keep for that block.
Before implementing this, we should analyze the performance of the current queries and compare it to the performance of queries that include the vid of the entities affected by the rollback. That should also inform the shape of the data we keep in memory, which will likely look like this for the different operations in a specific block:
- create: remember the vid of the new entity
- update: remember the vid of the old and the new version of the entity
- delete: remember the vid of the deleted entity version
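To make the bookkeeping above concrete, here is a minimal sketch of what the in-memory data could look like. All names (`BlockVids`, `RecentVids`, `record`, `take`) are hypothetical and do not exist in graph-node; the use of `i64` for vids and `u64` for block numbers is also an assumption. It keeps data for at most `max_blocks` recent blocks and skips any block that touched more than `max_vids_per_block` vids, so a lookup that returns `None` simply means "fall back to the current behavior".

```rust
use std::collections::VecDeque;

/// Vids touched while processing one block (hypothetical type).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct BlockVids {
    /// New versions written by `create` and `update`.
    pub added: Vec<i64>,
    /// Old versions closed by `update` and `delete`.
    pub closed: Vec<i64>,
}

impl BlockVids {
    pub fn create(&mut self, new_vid: i64) {
        self.added.push(new_vid);
    }

    pub fn update(&mut self, old_vid: i64, new_vid: i64) {
        self.closed.push(old_vid);
        self.added.push(new_vid);
    }

    pub fn delete(&mut self, old_vid: i64) {
        self.closed.push(old_vid);
    }

    pub fn len(&self) -> usize {
        self.added.len() + self.closed.len()
    }
}

/// Bounded record of the vids touched by the most recent blocks.
pub struct RecentVids {
    max_blocks: usize,
    max_vids_per_block: usize,
    blocks: VecDeque<(u64, BlockVids)>,
}

impl RecentVids {
    pub fn new(max_blocks: usize, max_vids_per_block: usize) -> Self {
        RecentVids {
            max_blocks,
            max_vids_per_block,
            blocks: VecDeque::new(),
        }
    }

    /// Remember the vids for `block`, unless the block touched too many.
    pub fn record(&mut self, block: u64, vids: BlockVids) {
        if vids.len() > self.max_vids_per_block {
            return; // too large: a revert of this block uses the fallback
        }
        self.blocks.push_back((block, vids));
        while self.blocks.len() > self.max_blocks {
            self.blocks.pop_front(); // evict the oldest block
        }
    }

    /// Remove and return the vids for `block`. `None` means nothing was
    /// recorded (cold start, eviction, or an oversized block).
    pub fn take(&mut self, block: u64) -> Option<BlockVids> {
        let idx = self.blocks.iter().position(|(b, _)| *b == block)?;
        self.blocks.remove(idx).map(|(_, vids)| vids)
    }
}
```

A revert would call `take` for the block being reverted and use the returned vids if there are any. Whether the per-block cap should count `added` and `closed` together, as it does here, is one of the things the performance analysis could settle.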
During a revert, we'd then use this information to narrow down which rows in a table to change. For example, the query to delete entity versions that are now in the future would become

    delete from things where vid in ($vids)

where vids contains what we recorded as new versions for a create or update.
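As a rough illustration of how a revert could choose between the narrowed query and the current behavior, the sketch below builds the delete statement from the recorded vids. `DeletePlan` and `plan_delete` are hypothetical names, not graph-node APIs, and expanding `$vids` into one numbered placeholder per vid is an assumption about how the list would be bound. Note that an empty list has to be handled separately, since `in ()` is not valid SQL in Postgres.

```rust
/// What to do for one table when reverting a block (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum DeletePlan {
    /// No vids were recorded for the block: run the existing revert query.
    Fallback,
    /// Vids were recorded, but the block added no versions to this table.
    Nothing,
    /// Delete exactly these rows; `sql` has one placeholder per bind.
    ByVid { sql: String, binds: Vec<i64> },
}

/// Build the delete for versions that a `create` or `update` added.
/// `table` is interpolated directly, so it must be a trusted identifier.
pub fn plan_delete(table: &str, added_vids: Option<&[i64]>) -> DeletePlan {
    match added_vids {
        None => DeletePlan::Fallback,
        Some(vids) if vids.is_empty() => DeletePlan::Nothing,
        Some(vids) => {
            // Produces "$1, $2, ..." with one entry per vid.
            let placeholders: Vec<String> =
                (1..=vids.len()).map(|i| format!("${}", i)).collect();
            let sql = format!(
                "delete from {} where vid in ({})",
                table,
                placeholders.join(", ")
            );
            DeletePlan::ByVid {
                sql,
                binds: vids.to_vec(),
            }
        }
    }
}
```

For the vids `[11, 12]` on table `things`, this yields `delete from things where vid in ($1, $2)` with the two vids as bind values.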