Skip to content

fix(sqlite): Fix a UNIQUE constraint violation with Update::RemoveItem #5001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Hywan
Copy link
Member

@Hywan Hywan commented May 5, 2025

Imagine we have the following events:

event_id room_id chunk_id position
$ev0 !r0 42 0
$ev1 !r0 42 1
$ev2 !r0 42 2
$ev3 !r0 42 3
$ev4 !r0 42 4

$ev2 has been removed, then we end up in this state:

event_id room_id chunk_id position
$ev0 !r0 42 0
$ev1 !r0 42 1
$ev3 !r0 42 3
$ev4 !r0 42 4

We need to shift the position of $ev3 and $ev4 to position - 1, like so:

event_id room_id chunk_id position
$ev0 !r0 42 0
$ev1 !r0 42 1
$ev3 !r0 42 2
$ev4 !r0 42 3

Usually, it boils down to run the following query:

UPDATE event_chunks
SET position = position - 1
WHERE position > 2 AND

Okay. But UPDATE runs on rows in no particular order. It means that it can update $ev4 before $ev3 for example. What happens in this particular case? The position of $ev4 becomes 3, however $ev3 already has position = 3. Because there is a UNIQUE constraint on (room_id, chunk_id, position), it will result in a constraint violation.

There is no way to control the execution order of UPDATE in SQLite. To persuade yourself, try:

UPDATE event_chunks
SET position = position - 1
FROM (
    SELECT event_id
    FROM event_chunks
    WHERE position > 2 ANDORDER BY position ASC
) as ordered
WHERE event_chunks.event_id = ordered.event_id

It will fail the same way.

Thus, we have 2 solutions:

  1. Remove the UNIQUE constraint,
  2. Be creative.

The UNIQUE constraint is a safe belt. Normally, we have event_cache::Deduplicator that is responsible to ensure there is no duplicated event. However, relying on this is “fragile” in the sense it can contain bugs. Relying on the UNIQUE constraint from SQLite is more robust. It's “braces and belt” as we say here.

So. We need to be creative.

Many solutions exist. Amongst the most popular, we see dropping and re-creating the index, which is no-go for us, it's too expensive. I (@Hywan) have adopted the following one:

  • Do position = position - 1 but in the negative space, so position = -(position - 1). A position cannot be negative; we are sure it is unique!
  • Once all candidate rows are updated, do position = -position to move back to the positive space.

'told you it's gonna be creative.

This solution is a hack, but it is a small number of operations, and we can keep the UNIQUE constraint in place.

This patch updates the test_linked_chunk_remove_item to handle 6 events. On my system, with my SQLite version, it triggers the UNIQUE constraint violation without the bug fix.


Hywan added 2 commits May 5, 2025 16:52
…Item`.

Imagine we have the following events:

| event_id | room_id | chunk_id | position |
|----------|---------|----------|----------|
| $ev0     | !r0     | 42       | 0        |
| $ev1     | !r0     | 42       | 1        |
| $ev2     | !r0     | 42       | 2        |
| $ev3     | !r0     | 42       | 3        |
| $ev4     | !r0     | 42       | 4        |

`$ev2` has been removed, then we end up in this state:

| event_id | room_id | chunk_id | position |
|----------|---------|----------|----------|
| $ev0     | !r0     | 42       | 0        |
| $ev1     | !r0     | 42       | 1        |
|          |         |          |          | <- no more `$ev2`
| $ev3     | !r0     | 42       | 3        |
| $ev4     | !r0     | 42       | 4        |

We need to shift the `position` of `$ev3` and `$ev4` to `position - 1`,
like so:

| event_id | room_id | chunk_id | position |
|----------|---------|----------|----------|
| $ev0     | !r0     | 42       | 0        |
| $ev1     | !r0     | 42       | 1        |
| $ev3     | !r0     | 42       | 2        |
| $ev4     | !r0     | 42       | 3        |

Usually, it boils down to run the following query:

```sql
UPDATE event_chunks
SET position = position - 1
WHERE position > 2 AND …
```

Okay. But `UPDATE` runs on rows in no particular order. It means that
it can update `$ev4` before `$ev3` for example. What happens in this
particular case? The `position` of `$ev4` becomes `3`, however `$ev3`
already has `position = 3`. Because there is a `UNIQUE` constraint
on `(room_id, chunk_id, position)`, it will result in a constraint
violation.

There is **no way** to control the execution order of `UPDATE` in
SQLite. To persuade yourself, try:

```sql
UPDATE event_chunks
SET position = position - 1
FROM (
    SELECT event_id
    FROM event_chunks
    WHERE position > 2 AND …
    ORDER BY position ASC
) as ordered
WHERE event_chunks.event_id = ordered.event_id
```

It will fail the same way.

Thus, we have 2 solutions:

1. Remove the `UNIQUE` constraint,
2. Be creative.

The `UNIQUE` constraint is a safe belt. Normally, we have
`event_cache::Deduplicator` that is responsible to ensure there is no
duplicated event. However, relying on this is “fragile” in the sense it
can contain bugs. Relying on the `UNIQUE` constraint from SQLite is more
robust. It's “braces and belt” as we say here.

So. We need to be creative.

Many solutions exist. Amongst the most popular, we see _dropping and
re-creating the index_, which is no-go for us, it's too expensive. I
(@Hywan) have adopted the following one:

- Do `position = position - 1` but in the negative space, so
 `position = -(position - 1)`. A position cannot be negative; we are
  sure it is unique!
- Once all candidate rows are updated, do `position = -position` to move
  back to the positive space.

'told you it's gonna be creative.

This solution is a hack, **but** it is a small number of operations, and
we can keep the `UNIQUE` constraint in place.

This patch updates the `test_linked_chunk_remove_item` to handle
6 events. On _my_ system, with _my_ SQLite version, it triggers the
`UNIQUE` constraint violation without the bug fix.
@Hywan Hywan marked this pull request as ready for review May 5, 2025 14:59
@Hywan Hywan requested a review from a team as a code owner May 5, 2025 14:59
@Hywan Hywan requested review from poljar and removed request for a team May 5, 2025 14:59
Copy link

codecov bot commented May 5, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.87%. Comparing base (8be0a7d) to head (21aef94).
Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5001   +/-   ##
=======================================
  Coverage   85.86%   85.87%           
=======================================
  Files         325      325           
  Lines       35851    35853    +2     
=======================================
+ Hits        30783    30787    +4     
+ Misses       5068     5066    -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@stefanceriu stefanceriu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done! 👏 Looks good to me and works great, it fixes my TWIM room and #4995.

@Hywan Hywan merged commit 3461b13 into matrix-org:main May 6, 2025
43 of 44 checks passed
@spaetz
Copy link

spaetz commented May 6, 2025

Probably only for the record, BUT sqlite can adhere to ORDER BY on updates, if it has been built to do so. I refer to https://sqlite.org/lang_update.html section 2.3.

@ganfra
Copy link
Contributor

ganfra commented May 7, 2025

Probably only for the record, BUT sqlite can adhere to ORDER BY on updates, if it has been built to do so. I refer to https://sqlite.org/lang_update.html section 2.3.

Yes but :

The ORDER BY clause on an UPDATE statement is used only to determine which rows fall within the LIMIT. The order in which rows are modified is arbitrary and is not influenced by the ORDER BY clause.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

event cache: unique constraint failures
4 participants