MDEV-39092 Copy Aria data and logs as part of backup#4971

Draft
mariadb-andrzejjarzabek wants to merge 17 commits into MariaDB:MDEV-14992 from mariadb-andrzejjarzabek:MDEV-39092

@mariadb-andrzejjarzabek
Contributor

This is an initial, simple implementation that copies all the Aria files in the "end" phase of the backup. Nothing protects the copy from concurrent DDL or DML. Copying currently works only on macOS; the intent is to refactor it to use a common file copy method shared across engines and the SQL layer.


Comment thread storage/maria/ma_backup.cc Outdated
Comment on lines +108 to +119
int perform_backup() noexcept
{
  if (scan_datadir())
    return 1;
  if (copy_databases())
    return 1;
  if (copy_control_file())
    return 1;
  if (copy_logs())
    return 1;
  return 0;
}
Contributor

This is fundamentally incompatible with the handlerton::backup_step API. Even if you are for now copying everything in handlerton::backup_end, the internal API should be kept as compatible with handlerton::backup_step as possible, that is, copying one file at a time. At the very least, the copy_databases() step must be refactored so that the higher-level API invokes something that copies one file at a time. Possibly, the copying of log files should be interleaved with that, as innodb_backup_step does.
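To make the "one file at a time" shape concrete, here is a minimal sketch. It is not the interface from this patch or from handlerton: the class name Stepwise_backup, the Status enum, and the copy_one callback are all hypothetical. The point is only that a driver that currently copies everything in the "end" phase can be a thin loop over a step function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration; none of these names come from the patch.
class Stepwise_backup
{
public:
  enum class Status { MORE, DONE, FAILED };

  /** @param files     file names collected by an earlier scan phase
      @param copy_one  callback that copies one file; returns 0 on success */
  Stepwise_backup(std::vector<std::string> files,
                  int (*copy_one)(const std::string&)) noexcept
    : files_(std::move(files)), copy_one_(copy_one)
  {
    assert(copy_one_);
  }

  /** Copy at most one file per call, as a step callback would. */
  Status step() noexcept
  {
    if (next_ == files_.size())
      return Status::DONE;
    if (copy_one_(files_[next_]))
      return Status::FAILED;
    ++next_;
    return next_ == files_.size() ? Status::DONE : Status::MORE;
  }

  /** Drain all remaining steps; an "end" phase could do this for now. */
  int run_to_completion() noexcept
  {
    for (;;)
    {
      switch (step()) {
      case Status::MORE:
        continue;
      case Status::DONE:
        return 0;
      case Status::FAILED:
        return 1;
      }
    }
  }

  /** @return number of files copied so far */
  std::size_t copied() const noexcept { return next_; }

private:
  std::vector<std::string> files_;
  int (*copy_one_)(const std::string&);
  std::size_t next_= 0;
};
```

With this shape, moving from "copy everything at the end" to per-step copying would only change who calls step(), not how a single file is copied.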

Contributor Author

This is a stop-gap to get something working. The eventual final implementation will require reworking the handlerton API and Sql_cmd_backup to take into account that different subsets of files may be copied under different levels of metadata lock, and that "start" and potentially even "end" may need to be split into phases with different levels of metadata lock. Given that the stage API is written that way to support multi-threaded copy, which isn't implemented at this time, I propose to merge the change as proposed (I will fix some of the other problems you mentioned) and iteratively improve from there.

This introduces a basic driver Sql_cmd_backup, storage engine interfaces,
and basic copying of InnoDB data files.
On Windows, we pass a target directory name; elsewhere, we pass a
target directory handle.

fil_space_t::write_or_backup: Keep track of in-flight page writes and
a pending backup operation. We must not allow them concurrently, because
that could lead to torn pages in the backup.
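One way to picture this mutual exclusion is a single atomic word with a flag bit for the backup and a counter for in-flight writes. The sketch below is an assumption made for illustration: the class Write_or_backup, its bit layout, and its method names are hypothetical and are not the actual fil_space_t::write_or_backup implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical sketch; not the real fil_space_t::write_or_backup layout.
class Write_or_backup
{
  /** flag bit: a backup is copying pages */
  static constexpr std::uint32_t BACKUP= 1U << 31;
  /** low 31 bits: number of in-flight page writes */
  std::atomic<std::uint32_t> state_{0};

public:
  /** @return whether a page write may start now */
  bool try_start_write() noexcept
  {
    std::uint32_t s= state_.load(std::memory_order_relaxed);
    do
    {
      if (s & BACKUP)
        return false; // a backup is active; the caller must retry later
    }
    while (!state_.compare_exchange_weak(s, s + 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed));
    return true;
  }

  /** Note the completion of a page write. */
  void end_write() noexcept
  {
    const std::uint32_t old=
      state_.fetch_sub(1, std::memory_order_release);
    assert(old & ~BACKUP); // there must have been a write in flight
    (void) old;
  }

  /** @return whether the backup may start (no write is in flight) */
  bool try_start_backup() noexcept
  {
    std::uint32_t expected= 0;
    return state_.compare_exchange_strong(expected, BACKUP,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
  }

  /** Note the completion of the backup. */
  void end_backup() noexcept
  {
    state_.fetch_and(~BACKUP, std::memory_order_release);
  }
};
```

In this sketch neither side blocks; a caller that gets false has to wait or retry, and how that waiting is done is deliberately left out.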

fil_space_t::backup_end: The first page number that is not being backed up
(by default 0, to indicate that no backup is in progress).

TRX_STATE_BACKUP: A special InnoDB transaction state indicating association
with BACKUP SERVER, which allows us to pass some context in trx_t from
innodb_backup_end() to innodb_backup_finalize().

log_t::backup: Whether BACKUP SERVER is in progress. The purpose of this
is to make BACKUP SERVER prevent the concurrent execution of
SET GLOBAL innodb_log_archive=OFF or SET GLOBAL innodb_log_file_size
when innodb_log_archive=OFF.

log_sys.archived_checkpoint: Keep track of the earliest available
checkpoint, corresponding to log_sys.archived_lsn. This reflects
SET GLOBAL innodb_log_recovery_start (which is settable now), for
incremental backup.

buf_flush_list_space(): Check for concurrent backup before writing each
page. This is inefficient, but this function may be invoked from multiple
threads concurrently, and it cannot be changed easily, especially for
fil_crypt_thread().

FIXME: MoveFileEx() after CreateHardLink() fails on Windows

TODO: Implement open(O_DIRECTORY) and openat(2) compatible API on Windows,
to have a uniform interface (passing a target directory handle, not name).

TODO: Implement finer-grained locking around copying page ranges.

TODO: Implement other storage engine interfaces.

TODO: Implement the necessary locking around backup_end.

TODO: Fix the space.get_create_lsn() < checkpoint logic.
mariadb-andrzejjarzabek force-pushed the MDEV-39092 branch 2 times, most recently from 4d6a19c to d86a6a1 (May 20, 2026 05:50)
backup_target: A structured data type to represent a directory or a
stream. On Microsoft Windows, we must use directory paths because
there is no variant of CopyFileEx() that would work on file handles.

copy_file(): A file copying service for POSIX systems. On Windows,
we will use CopyFileEx().
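As a rough idea of what a portable POSIX fallback for such a service could look like, here is a plain read/write loop. This is a sketch under assumptions, not the copy_file() from the patch: the name copy_file_sketch is hypothetical, and a real implementation would more likely prefer a kernel-assisted copy where the platform offers one. The return convention (0 on success, negative error code otherwise) follows the documentation comment quoted in the review below.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <unistd.h>

/** Append size bytes from src to dst (hypothetical fallback sketch).
@param src   source file descriptor
@param dst   target file descriptor to append to
@param size  number of bytes to copy
@return error code (negative)
@retval 0    on success */
int copy_file_sketch(int src, int dst, off_t size) noexcept
{
  char buf[65536];
  while (size > 0)
  {
    const size_t want=
      size < off_t(sizeof buf) ? size_t(size) : sizeof buf;
    const ssize_t got= read(src, buf, want);
    if (got < 0)
    {
      if (errno == EINTR)
        continue; // interrupted by a signal; retry the read
      return -errno;
    }
    if (got == 0)
      return -EIO; // the source ended before size bytes were read
    for (ssize_t done= 0; done < got; )
    {
      const ssize_t ret= write(dst, buf + done, size_t(got - done));
      if (ret < 0)
      {
        if (errno == EINTR)
          continue; // retry the interrupted write
        return -errno;
      }
      done+= ret; // handle short writes
    }
    size-= got;
  }
  return 0;
}
```

The loop handles short reads, short writes, and EINTR explicitly, which is the part that is easy to get wrong when such code is duplicated per storage engine.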
Fix Windows and macOS
Comment thread sql/sql_backup.cc Outdated
Comment on lines +30 to +38
int copy_entire_file(int src, int dst)
{
  return fcopyfile(src, dst, nullptr, COPYFILE_ALL | COPYFILE_CLONE);
}

extern "C" int copy_file(int src, int dst, off_t)
{
  return fcopyfile(src, dst, nullptr, COPYFILE_ALL | COPYFILE_CLONE);
}
Contributor

This is adding unnecessary object code duplication. Macros would allow zero-overhead abstraction:

diff --git a/sql/sql_backup_interface.h b/sql/sql_backup_interface.h
index d781acff3d9..b7d04705985 100644
--- a/sql/sql_backup_interface.h
+++ b/sql/sql_backup_interface.h
@@ -21,6 +21,7 @@
 # include <copyfile.h>
 # define copy_file(src, dst, off) \
   fcopyfile(src, dst, nullptr, COPYFILE_ALL | COPYFILE_CLONE)
+# define copy_entire_file(src, dst) copy_file(src, dst,)
 #else
 # ifdef __cplusplus
 extern "C"
@@ -32,4 +33,15 @@ extern "C"
 @return error code (negative)
 @retval 0   on success */
 int copy_file(int src, int dst, off_t size);
+
+# ifdef __cplusplus
+extern "C"
+# endif
+/** Copy an entire file.
+@param src  source file descriptor
+@param dst  target to append src to
+@param size amount of data to be copied
+@return error code (negative)
+@retval 0   on success */
+int copy_entire_file(int src, int dst);
 #endif

Outside Windows and Apple platforms, it does make sense to introduce a non-inline function for copy_entire_file(), even though it is rather small (48 bytes on AMD64). I will include this in the next update of #4817.

Contributor Author

Would defining it as an inline function not be better then? Macros have the problem of replacing tokens lexically, so if there's any identifier in code called copy_file and this header is included, it will get messed up. The fact that it would only happen on a Mac makes this problem even worse.

Contributor

We have a mandatory macOS builder https://buildbot.mariadb.org/#/builders/708 that must pass in order for anything to be merged to a main branch. Hence, compilation failures should only be possible in development branches.

The benefit of defining a macro copy_file() is that the third argument will be guaranteed to be omitted from the generated code. If we had an inline function that ignores the third parameter, some unnecessary code could be emitted related to the evaluation of that parameter.
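The difference under discussion can be shown with a small stand-alone sketch. All names in it (clone_whole_file, copy_file_macro, copy_file_inline, expensive_size) are hypothetical stand-ins and not part of the patch. It shows the case where the third argument has an observable side effect; for a side-effect-free argument, an optimizing compiler would normally drop the computation after inlining as well.

```cpp
// Counts how often the size argument was actually evaluated.
static int evaluations= 0;

// Stand-in for computing a file size; the side effect makes it observable.
static long expensive_size() noexcept
{
  ++evaluations;
  return 4096;
}

// Stand-in for a platform call that copies a whole file and needs no size.
static int clone_whole_file(int, int) noexcept { return 0; }

// Macro form: the preprocessor discards the third argument entirely,
// so the expression passed as size is never compiled or evaluated.
#define copy_file_macro(src, dst, size) clone_whole_file(src, dst)

// Inline form: the parameter is unused, but the call site must still
// evaluate the argument when doing so has side effects.
static inline int copy_file_inline(int src, int dst, long) noexcept
{
  return clone_whole_file(src, dst);
}
```

Calling copy_file_macro(3, 4, expensive_size()) leaves evaluations unchanged, while copy_file_inline(3, 4, expensive_size()) increments it. The same property is what makes the macro riskier: an argument the author expected to run silently disappears, and any other identifier named like the macro is rewritten too.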

This is an initial, simple implementation that copies all the Aria files
in the "end" phase of the backup. Nothing protects the copy from
concurrent DDL or DML. Copying currently works only on macOS; the intent
is to refactor it to use a common file copy method shared across engines
and the SQL layer.
Copy non-Aria-specific files *.frm and db.opt as part of Aria backup.
Add MTR test. Copy MyISAM files in Aria plugin so that the MTR works.
Perform the end step under maximum level of backup MDL to safely copy
non-transactional Aria and MyISAM files.