Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLite Error 14 - Loss of ideas :) #1741

Open
freak4pc opened this issue Mar 5, 2025 · 3 comments
Open

SQLite Error 14 - Loss of ideas :) #1741

freak4pc opened this issue Mar 5, 2025 · 3 comments
Labels

Comments

@freak4pc
Copy link
Contributor

freak4pc commented Mar 5, 2025

Hey there,
We've been experiencing intermittent SQLIte Error 14 specifically in production (we're unable to reproduce).
I'm aware of it being a general catch-all "Can't open" error, and did see some issues like this one: #1351

Unfortunately I'm unsure how any of these apply to us

I've tried, as a fix, to catch these errors and in those cases entirely remove all local database files and re-create the DB. Doesn't seem to mitigate.

My only suspicion at this point is WAL-mode, but I don't think we're doing anything funky with it. It's mostly like the docs.

I'm wondering if there's any substantial loss of dropping to a DatabaseQueue to see if it solves the problem, or if it can solve the problem at all.

The second option is that our dbWriter ends up nil somehow after the coordination operation, but I'm not sure how that's possible. I will attempt to change the thrown error there to something else to confirm, but don't think it's the culprit.

Here's our openDatabase:

@discardableResult
private func openDatabase(at databaseURL: URL) throws -> GRDB.DatabaseWriter {
    // Using a file coordinator protects the database against concurrent
    // access. See: https://bit.ly/352DYrH
    let coordinator = NSFileCoordinator(filePresenter: nil)
    var coordinatorError: NSError?
    var dbError: Swift.Error?

    coordinator.coordinate(
        writingItemAt: databaseURL,
        options: .forMerging,
        error: &coordinatorError
    ) { url in
        do {
            configuration.prepareDatabase { db in
                // Activate the persistent WAL mode so that
                // read-only processes can access the database.
                //
                // See https://www.sqlite.org/walformat.html#operations_that_require_locks_and_which_locks_those_operations_use
                // and https://www.sqlite.org/c3ref/c_fcntl_begin_atomic_write.html#sqlitefcntlpersistwal
                if db.configuration.readonly == false {
                    var flag: CInt = 1
                    let code = withUnsafeMutablePointer(to: &flag) { flagP in
                        sqlite3_file_control(db.sqliteConnection, nil, SQLITE_FCNTL_PERSIST_WAL, flagP)
                    }
                    guard code == SQLITE_OK else {
                        throw DatabaseError(resultCode: ResultCode(rawValue: code))
                    }
                }
            }

            if usesInMemoryDatabase {
                self.dbWriter = try DatabaseQueue()
            } else {
                self.dbWriter = try DatabasePool(
                    path: url.path,
                    configuration: configuration
                )
            }
        } catch {
            dbError = error
        }
    }

    if let error = dbError ?? coordinatorError {
        throw error
    }

    guard let dbWriter else {
        throw DatabaseError(resultCode: DatabaseError.SQLITE_CANTOPEN)
    }

    return dbWriter
}

Any ideas will be greatly appreciated.
Thanks!

@groue
Copy link
Owner

groue commented Mar 5, 2025

Hello @freak4pc,

I don't have much new information about SQLITE_CANTOPEN (14) since #1351.

Here are some possible actions for you:

First check which operation throws SQLITE_CANTOPEN.

  • SQLITE_CANTOPEN is thrown when opening the database?

    We are almost 100% sure that the database was corrupted, and I'd seriously look at the code that performs direct manipulation of the database files, at the level of the file system, with FileManager. Messing up with SQLite database and temporary files can easily lead to corruption.

  • SQLITE_CANTOPEN is thrown during a read?

    Then probably the -wal and -shm temporary files are missing. GRDB normally takes care of creating them if they are missing, when the database is opened, but there are scenarios where those files might be deleted before a new read-only connection is opened.

    • The app runs a WAL checkpoint? If your app perform a checkpoint, a mitigation technique would be to recreate the -wal and -shm temporary files by performing a dummy transaction (create and delete a dummy table, for example). There's a race condition because a concurrent thread or process might want to read after the checkpoint and before the temporary files are recreated. barrierWriteWithoutTransaction can eliminate the race condition in one process, but not for other processes. Also consider that there are lighter checkpoints that do not erase the -wal and -shm temporary files: refer to the SQLite documentation.

    • Another process closes its own database connection? Closing a connection can delete the -wal and -shm temporary files, in some circumstances and system versions (quite unclear). This reminds me of Can the WAL mode files creation (-shm and -wal) be avoided in read-only access ? #771 (comment), and you may try to activate the persistent WAL mode in the involved processes:

      var configuration = Configuration()
      configuration.prepareDatabase = { db in
          // Activate the persistent WAL mode
          if db.configuration.readonly == false {
              var flag: CInt = 1
              let code = withUnsafeMutablePointer(to: &flag) { flagP in
                  sqlite3_file_control(db.sqliteConnection, nil, SQLITE_FCNTL_PERSIST_WAL, flagP)
              }
              guard code == SQLITE_OK else {
                  throw DatabaseError(resultCode: ResultCode(rawValue: code))
              }
          }
      }
      let dbPool = try DatabasePool(path: ..., configuration: configuration)

      (Your app is already setting this flag.)

Please report your findings!

@groue groue added the support label Mar 5, 2025
@freak4pc
Copy link
Contributor Author

freak4pc commented Mar 5, 2025

Thanks you!

In regards to corruption and WAL I would expect it to solve itself with the fix that destroys the DB files (folder) entirely and opens a new one - since those are supposed to create the WAL files as well IIRC?

I'm wondering if switching to DatabaseQueue could be helpful. I don't think we're directly using anything special from DatabasePool but we've been using it for a while so unsure at this point. Is there any meaningful risk in trying that? Could it help?

Thank you :)

@groue
Copy link
Owner

groue commented Mar 5, 2025

Thanks you!

In regards to corruption and WAL I would expect it to solve itself with the fix that destroys the DB files (folder) entirely and opens a new one - since those are supposed to create the WAL files as well IIRC?

Yes, it does.

I'm wondering if switching to DatabaseQueue could be helpful. I don't think we're directly using anything special from DatabasePool but we've been using it for a while so unsure at this point. Is there any meaningful risk in trying that? Could it help?

I'm just as puzzled as you are, so all bets are off. I wouldn't prevent you from trying, of course.

The behavior difference between DatabaseQueue and DatabasePool is that DatabaseQueue serializes all db accesses, when DatabasePool allows parallel reads and writes. For example, imagine your app is performing a slow write transaction in the background (such as saving a lot of remote data to disk). With a pool, the app can fetch and display database values (they'll eventually update, after the write has completed, if the app is observing them). With a queue, the app has to wait until the write transaction has ended before it can fetch and display database values. With a queue, there is a risk that the main thread hangs (if it performs synchronous reads), or that the UI displays more loading screens (if it performs async reads).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants