-
Notifications
You must be signed in to change notification settings - Fork 290
fix(sqlite): Fix a UNIQUE
constraint violation with Update::RemoveItem
#5001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(sqlite): Fix a UNIQUE
constraint violation with Update::RemoveItem
#5001
Conversation
…Item`. Imagine we have the following events: | event_id | room_id | chunk_id | position | |----------|---------|----------|----------| | $ev0 | !r0 | 42 | 0 | | $ev1 | !r0 | 42 | 1 | | $ev2 | !r0 | 42 | 2 | | $ev3 | !r0 | 42 | 3 | | $ev4 | !r0 | 42 | 4 | `$ev2` has been removed, then we end up in this state: | event_id | room_id | chunk_id | position | |----------|---------|----------|----------| | $ev0 | !r0 | 42 | 0 | | $ev1 | !r0 | 42 | 1 | | | | | | <- no more `$ev2` | $ev3 | !r0 | 42 | 3 | | $ev4 | !r0 | 42 | 4 | We need to shift the `position` of `$ev3` and `$ev4` to `position - 1`, like so: | event_id | room_id | chunk_id | position | |----------|---------|----------|----------| | $ev0 | !r0 | 42 | 0 | | $ev1 | !r0 | 42 | 1 | | $ev3 | !r0 | 42 | 2 | | $ev4 | !r0 | 42 | 3 | Usually, it boils down to run the following query: ```sql UPDATE event_chunks SET position = position - 1 WHERE position > 2 AND … ``` Okay. But `UPDATE` runs on rows in no particular order. It means that it can update `$ev4` before `$ev3` for example. What happens in this particular case? The `position` of `$ev4` becomes `3`, however `$ev3` already has `position = 3`. Because there is a `UNIQUE` constraint on `(room_id, chunk_id, position)`, it will result in a constraint violation. There is **no way** to control the execution order of `UPDATE` in SQLite. To persuade yourself, try: ```sql UPDATE event_chunks SET position = position - 1 FROM ( SELECT event_id FROM event_chunks WHERE position > 2 AND … ORDER BY position ASC ) as ordered WHERE event_chunks.event_id = ordered.event_id ``` It will fail the same way. Thus, we have 2 solutions: 1. Remove the `UNIQUE` constraint, 2. Be creative. The `UNIQUE` constraint is a safe belt. Normally, we have `event_cache::Deduplicator` that is responsible to ensure there is no duplicated event. However, relying on this is “fragile” in the sense it can contain bugs. Relying on the `UNIQUE` constraint from SQLite is more robust. It's “braces and belt” as we say here. So. We need to be creative. Many solutions exist. Amongst the most popular, we see _dropping and re-creating the index_, which is no-go for us, it's too expensive. I (@Hywan) have adopted the following one: - Do `position = position - 1` but in the negative space, so `position = -(position - 1)`. A position cannot be negative; we are sure it is unique! - Once all candidate rows are updated, do `position = -position` to move back to the positive space. 'told you it's gonna be creative. This solution is a hack, **but** it is a small number of operations, and we can keep the `UNIQUE` constraint in place. This patch updates the `test_linked_chunk_remove_item` to handle 6 events. On _my_ system, with _my_ SQLite version, it triggers the `UNIQUE` constraint violation without the bug fix.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #5001 +/- ##
=======================================
Coverage 85.86% 85.87%
=======================================
Files 325 325
Lines 35851 35853 +2
=======================================
+ Hits 30783 30787 +4
+ Misses 5068 5066 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done! 👏 Looks good to me and works great, it fixes my TWIM room and #4995.
Probably only for the record, BUT sqlite can adhere to ORDER BY on updates, if it has been built to do so. I refer to https://sqlite.org/lang_update.html section 2.3. |
Yes but :
|
Imagine we have the following events:
$ev2
has been removed, then we end up in this state:We need to shift the
position
of$ev3
and$ev4
toposition - 1
, like so:Usually, it boils down to run the following query:
Okay. But
UPDATE
runs on rows in no particular order. It means that it can update$ev4
before$ev3
for example. What happens in this particular case? Theposition
of$ev4
becomes3
, however$ev3
already hasposition = 3
. Because there is aUNIQUE
constraint on(room_id, chunk_id, position)
, it will result in a constraint violation.There is no way to control the execution order of
UPDATE
in SQLite. To persuade yourself, try:It will fail the same way.
Thus, we have 2 solutions:
UNIQUE
constraint,The
UNIQUE
constraint is a safe belt. Normally, we haveevent_cache::Deduplicator
that is responsible to ensure there is no duplicated event. However, relying on this is “fragile” in the sense it can contain bugs. Relying on theUNIQUE
constraint from SQLite is more robust. It's “braces and belt” as we say here.So. We need to be creative.
Many solutions exist. Amongst the most popular, we see dropping and re-creating the index, which is no-go for us, it's too expensive. I (@Hywan) have adopted the following one:
position = position - 1
but in the negative space, soposition = -(position - 1)
. A position cannot be negative; we are sure it is unique!position = -position
to move back to the positive space.'told you it's gonna be creative.
This solution is a hack, but it is a small number of operations, and we can keep the
UNIQUE
constraint in place.This patch updates the
test_linked_chunk_remove_item
to handle 6 events. On my system, with my SQLite version, it triggers theUNIQUE
constraint violation without the bug fix.