PS-10231 [9.x]: Fix DBLWR recovery tests for compressed+encrypted pages by inikep · Pull Request #5844 · percona/percona-server

inikep · 2026-02-26T14:19:17Z

The innodb.dblwr_lz4_encrypt_recv and innodb.dblwr_zlib_encrypt_recv tests were failing because the DBLWR copy of the test table's root page was being overwritten by background flush activity (undo purge, system tablespace) between FLUSH TABLES FOR EXPORT and SIGKILL. This race became more likely after Bug#37684656 reduced the DBLWR buffer size.

Additionally, without pending redo records for the test tablespace after checkpoint, crash recovery never opened it, so the per-space DBLWR recovery path never executed.
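To illustrate why a smaller DBLWR buffer makes the race more likely, here is a deliberately simplified toy model. It assumes the doublewrite area is a fixed number of slots reused round-robin, which is not InnoDB's actual slot allocation. The function name, page labels, and slot counts are all hypothetical.

```python
def copy_survives(slots, flushed_pages, target):
    """Toy model (assumption): `slots` doublewrite entries reused round-robin.

    Returns True if the doublewrite copy of `target` is still present
    after every page in `flushed_pages` has been written in order.
    """
    buffer = [None] * slots
    for i, page in enumerate(flushed_pages):
        # Each flushed page takes the next slot, overwriting what was there.
        buffer[i % slots] = page
    return target in buffer


# The test table's root page is flushed first, then background activity
# (for example undo purge) flushes eight more pages before the kill.
flushes = ["test_root_page"] + [f"undo_page_{i}" for i in range(8)]

small_buffer_ok = copy_survives(4, flushes, "test_root_page")
large_buffer_ok = copy_survives(512, flushes, "test_root_page")
```

In this model the copy is lost once background flushes wrap around the buffer, so a larger buffer only widens the margin. That is consistent with the PR both raising `--innodb_doublewrite_pages` and stopping the background activity itself.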

Restructure both tests to follow the robust pattern used by innodb.dblwr_encrypt_recover:

- Wait for purge to complete before flushing
- Disable master thread and checkpoint after flush to prevent background DBLWR slot reuse
- Perform an uncommitted INSERT to generate pending redo records, ensuring the tablespace is opened during crash recovery
- Kill the server first, then corrupt the page externally while the server is down (guaranteeing the DBLWR copy survives)
- Zero the entire page (ALL_ZEROES=1) because for compressed pages with punch hole, the second half is already zeros so partial corruption has no effect
- Add master.opt with --innodb_doublewrite_pages=512 for extra margin
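The point about zeroing the entire page can be sketched in a few lines of Python. This is not the MTR helper the tests use. It assumes a 16 KiB page size, that partial corruption means zeroing only the second half of the page, and that a hole-punched compressed page reads back as payload followed by zeros. The file layout, page number, and function names are hypothetical.

```python
import os
import tempfile

PAGE_SIZE = 16384  # assumption: the default InnoDB page size


def zero_page(path, page_no, all_zeroes, page_size=PAGE_SIZE):
    """Zero page `page_no` of the file: all of it, or only its second half."""
    start = page_no * page_size
    if all_zeroes:
        offset, length = start, page_size
    else:
        offset, length = start + page_size // 2, page_size // 2
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(b"\0" * length)


def read_page(path, page_no, page_size=PAGE_SIZE):
    """Return the raw bytes of page `page_no`."""
    with open(path, "rb") as f:
        f.seek(page_no * page_size)
        return f.read(page_size)


def make_demo_file(pages=8, target=4, page_size=PAGE_SIZE):
    """Create a stand-in data file whose `target` page mimics a hole-punched page."""
    fd, path = tempfile.mkstemp(suffix=".ibd")
    with os.fdopen(fd, "wb") as f:
        for n in range(pages):
            if n == target:
                # Compressed payload in the first half, punched hole (zeros) after it.
                f.write(b"\xab" * (page_size // 2) + b"\0" * (page_size // 2))
            else:
                f.write(bytes([n + 1]) * page_size)
    return path
```

Calling `zero_page(path, 4, all_zeroes=False)` on the demo file leaves page 4 byte-for-byte unchanged, so there would be nothing for recovery to detect. Calling it with `all_zeroes=True` wipes the payload as well, which is the case where the page has to be restored from its DBLWR copy.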

inikep · 2026-02-26T14:20:34Z

The issue is https://ps80.cd.percona.com/job/percona-server-9.x-pipeline-parallel-mtr/259/testReport/junit/oraclelinux-8.Debug.WORKER_8/innodb/dblwr_lz4_encrypt_recv/

percona-ysorokin

LGTM

inikep requested a review from percona-ysorokin February 26, 2026 14:19

percona-ysorokin approved these changes Feb 26, 2026

inikep merged commit 0fa8790 into percona:trunk Mar 2, 2026
3 of 6 checks passed

inikep deleted the PS-10231-9.x branch March 2, 2026 14:09
