Conversation

smklein
Collaborator

@smklein smklein commented Aug 14, 2025

Implements Nexus's db_metadata_nexus_state table, to consider access by other Nexuses
that may be concurrently executing with older schemas.

Database Schema & Migration

  • Added db_metadata_nexus table to track Nexus access with states: active, not_yet, inactive
  • Migration automatically populates records for existing Nexus zones from the current target blueprint

Validation

  • New check_schema_and_access() function validates both schema version compatibility and Nexus access
  • SchemaAction enum guides database initialization based on "access" / "schema" combinations
  • attempt_handoff() function enables atomic transition of Nexus access from not_yet to active states
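
The "access" / "schema" combinations above can be pictured as a small decision table. The following is a minimal sketch, not the PR's implementation: the `Access` and `Schema` enums, the `decide` function, and the `Ready` and `Update` variant names are hypothetical, and the exact mapping is an assumption. Only `NeedsHandoff` and `Refuse` are named in this conversation.

```rust
/// Hypothetical summary of what a Nexus learned about its own access.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Access {
    /// This Nexus is allowed to use the database.
    Has,
    /// This Nexus has a `not_yet` record and must wait for handoff.
    NotYet,
    /// This Nexus has been locked out.
    LockedOut,
}

/// Hypothetical comparison of the schema found in the database against
/// the schema this binary wants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Schema {
    Matches,
    DbIsOlder,
    DbIsNewer,
}

/// `NeedsHandoff` and `Refuse` appear in the review; `Ready` and `Update`
/// are assumed names for the remaining outcomes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SchemaAction {
    Ready,
    Update,
    NeedsHandoff,
    Refuse,
}

/// Assumed mapping from (access, schema) to an action.
fn decide(access: Access, schema: Schema) -> SchemaAction {
    match (access, schema) {
        // A locked-out Nexus never touches the database.
        (Access::LockedOut, _) => SchemaAction::Refuse,
        // Never operate against a schema newer than this binary knows.
        (_, Schema::DbIsNewer) => SchemaAction::Refuse,
        // A waiting Nexus must complete handoff before doing anything else.
        (Access::NotYet, _) => SchemaAction::NeedsHandoff,
        (Access::Has, Schema::Matches) => SchemaAction::Ready,
        (Access::Has, Schema::DbIsOlder) => SchemaAction::Update,
    }
}

fn main() {
    for access in [Access::Has, Access::NotYet, Access::LockedOut] {
        for schema in [Schema::Matches, Schema::DbIsOlder, Schema::DbIsNewer] {
            println!("{access:?} + {schema:?} => {:?}", decide(access, schema));
        }
    }
}
```

Because the `match` is exhaustive, adding a new access or schema case forces every combination to be reconsidered at compile time.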

Backwards compatibility

  • Supports deployments upgrading from pre-existing schemas, and populates records for new deployments
  • Support for both explicit Nexus IDs and omitted Nexus IDs (for the schema updater binary)

Fixes #8501

@smklein smklein force-pushed the db_metadata_nexus branch 4 times, most recently from 7b98014 to 7bafa74 Compare August 15, 2025 18:55

SET LOCAL disallow_full_table_scans = off;

INSERT INTO omicron.public.db_metadata_nexus (nexus_id, last_drained_blueprint_id, state)
Collaborator Author

@smklein smklein Aug 15, 2025

It may be possible to not include this populate step, depending on how reconfigurator execution plans on populating these records. After all, the "no db_metadata_nexus records" case is already treated specially for backwards-compatibility.

This would also let us delete the data migrations in nexus/tests/integration_tests/schema.rs

/// Returns an error if:
/// - Any db_metadata_nexus records already exist (should only be called
/// during initial setup)
pub async fn initialize_nexus_access_from_blueprint_on_connection(
Collaborator Author

If we are okay existing in a period of time where db_metadata_nexus records do not exist, but blueprint execution could otherwise be functional, this change may not be necessary.

However, I think the presence of active records for live Nexuses acts as a strong guard against quiescing, as documented in https://rfd.shared.oxide.computer/rfd/588 , so they are populated here, within RSS setup.

Collaborator

Yeah, I like the approach you've got here.

@smklein smklein force-pushed the db_metadata_nexus branch from 7bafa74 to d70e41f Compare August 21, 2025 21:22
@smklein smklein changed the title Create db_metadata_nexus_state table Create db_metadata_nexus_state table Aug 21, 2025
@smklein smklein marked this pull request as ready for review August 21, 2025 21:28
Collaborator

@davepacheco davepacheco left a comment

Thanks! I have a lot of small comments here but I think this is close to what we've discussed.

One lingering thing that makes me nervous is that there are so many implicit assumptions and constraints on the datastore functions in db_metadata. This is not a blocker for this PR! But I wonder if this would benefit from an approach that used distinct types for the different phases. I'll think about this and bring it up elsewhere.


// Before proceeding, all records must be in the "inactive" or "not_yet" states.
//
// We explicitly look for any records violating this, rather than explicitly looking for
Collaborator

This feels less future-proofed to me, on the grounds that if we added some new state that's logically like one of these other two states, we'll erroneously not include it here and so not notice something in that state. It feels safer to me to look for active explicitly.

Collaborator Author

on the grounds that if we added some new state that's logically like one of these other two states

What if the new state is logically like "active"? Or needs different handling than these other two states? Perhaps it's like unquiescing_to_deal_with_expungement to try to deal with https://rfd.shared.oxide.computer/rfd/0588#_trying_to_handle_permanent_failures_during_handoff

Truly, I have no idea what kind of future state we would want, but if I use:

let active_count = dsl::db_metadata_nexus.filter(dsl::state.eq(active))

// Proceed if "active_count" > 0
...

Then I wouldn't be handling this case correctly.

I was trying to follow the conditions for handoff we agreed on in RFD 588:

To carry out the handoff:
Precondition: all records in this table must have state not_yet or quiesced.

Any other state - whether it's active, unquiescing_to_deal_with_expungement, or something else - would violate that constraint, as written.

I think this might be more obvious if I renamed this variable from active_count to not_not_yet_and_not_quiesced_count but that feels much wordier.
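
The difference between the two checks can be shown with an in-memory sketch. This is an illustration only, not the Diesel query from the PR: the `State` enum and both counting functions are hypothetical, `Quiesced` follows the RFD wording quoted above, and the extra variant stands in for an unknown future state.

```rust
/// Hypothetical record states. The last variant models a state that does
/// not exist today but might be added later.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Active,
    NotYet,
    Quiesced,
    UnquiescingToDealWithExpungement,
}

/// Alternative check: count only records that are explicitly `Active`.
fn count_active(records: &[State]) -> usize {
    records.iter().filter(|&&s| s == State::Active).count()
}

/// Check matching the quoted precondition: count every record that is
/// neither `NotYet` nor `Quiesced`, whatever its state happens to be.
fn count_precondition_violations(records: &[State]) -> usize {
    records
        .iter()
        .filter(|&&s| !matches!(s, State::NotYet | State::Quiesced))
        .count()
}

fn main() {
    let records = [
        State::NotYet,
        State::Quiesced,
        State::UnquiescingToDealWithExpungement,
    ];
    // The explicit-active check sees nothing wrong with these records.
    println!("active: {}", count_active(&records));
    // The precondition check flags the unfamiliar state.
    println!("violations: {}", count_precondition_violations(&records));
}
```

With these records, the explicit-active count is zero, so a handoff gated on it would proceed, while the precondition-based count is one and would block the handoff.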

} = identity_check
else {
return Err(BackoffError::permanent(
"Nexus ID needed for handoff",
Collaborator

This should truly be impossible, right? I assume we wouldn't have returned NeedsHandoff if the identity check policy was DontCare. I don't think we should crash or anything but just wanted to be clear on my understanding and I think it's worth a comment to this effect. In the future it'd be great if we could rework it so this case wasn't representable.

Collaborator Author

Yeah, this is within the implementation details of check_schema_and_access, but NeedsHandoff is only returned when access is DoesNotHaveAccessYet. This can only be returned by DataStore::check_nexus_access, which is only invoked when the IdentityCheckPolicy::CheckAndTakeover variant is used (which has the explicit Nexus UUID).

I could pass the Nexus UUID back out through the NeedsHandoff enum? Gave this a shot in ef27f21, removed this error case.
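
Carrying the ID inside the variant might look roughly like the sketch below. It is an assumption about the shape of the change, not the code from ef27f21: `NexusId` is a stand-in newtype for the real UUID type, and `handoff_target` plus the `Ready` variant are hypothetical.

```rust
/// Stand-in for the Nexus UUID type (hypothetical; the real code would
/// use a proper UUID type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct NexusId(u128);

/// Sketch of an action enum where the handoff case carries the ID it
/// needs, so "handoff requested without a Nexus ID" cannot be expressed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SchemaAction {
    Ready,
    NeedsHandoff { nexus_id: NexusId },
    Refuse,
}

/// Returns the Nexus that should attempt handoff, if any. No error path
/// is needed for a missing ID, because the variant always carries one.
fn handoff_target(action: SchemaAction) -> Option<NexusId> {
    match action {
        SchemaAction::NeedsHandoff { nexus_id } => Some(nexus_id),
        SchemaAction::Ready | SchemaAction::Refuse => None,
    }
}

fn main() {
    let action = SchemaAction::NeedsHandoff { nexus_id: NexusId(42) };
    println!("{:?}", handoff_target(action));
    println!("{:?}", handoff_target(SchemaAction::Ready));
    println!("{:?}", handoff_target(SchemaAction::Refuse));
}
```

The caller no longer consults a separate identity check to find the ID, which is what removes the "Nexus ID needed for handoff" error case.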

println!("Update to {version} complete");
}
SchemaAction::NeedsHandoff | SchemaAction::Refuse => {
println!("Cannot update to version {version}")
Collaborator

Again, these should be impossible, right? Here I'd suggest being more explicit and reporting this as some kind of internal error.

Collaborator Author

Updated in 2760930

}

/// Registers a Nexus instance as having active access to the database
pub async fn database_nexus_access_insert(
Collaborator

Can we make this non-pub?

Collaborator Author

}

/// Checks if any db_metadata_nexus records exist in the database using an existing connection
pub async fn database_nexus_access_any_exist_on_connection(
Collaborator

Can we make this non-pub?

Collaborator Author

}

/// Checks if any db_metadata_nexus records exist in the database
pub async fn database_nexus_access_any_exist(&self) -> Result<bool, Error> {
Collaborator

non-pub?

Collaborator Author


/// Describes how a consumer may want to react to schema and access
#[derive(Debug, Copy, Clone, PartialEq)]
pub enum ConsumerPolicy {
Collaborator

Oh, I wasn't thinking of the current MUPdate-based update process.

How does this PR affect that? It seems like this PR does change Nexus to automatically try to update the schema?


/// Describes what should be done with a schema
#[derive(Debug, Copy, Clone, PartialEq)]
pub enum SchemaAction {
Collaborator

DatastoreSetupAction?

- SchemaAction::NeedsHandoff | SchemaAction::Refuse => {
- println!("Cannot update to version {version}")
+ DatastoreSetupAction::Refuse => {
+ println!("Refusing to update to version {version}")
Collaborator

why would this be? (I'm imagining the support person seeing this message and not knowing what this means or what to do next.)

Collaborator Author

One example could be if the "version the schema updater wants to upgrade to" is older than the observed version on disk (e.g., really old schema-updater, newer deployment). Running the schema-updater ls command should make this immediately clear.
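
One way to make that case self-explanatory in the output is to state both versions in the message. The sketch below is hypothetical and not part of the PR: the `Version` alias, the `refusal_reason` function, and the message wording are all assumptions, and versions are modeled as plain tuples rather than the real version type.

```rust
/// Hypothetical (major, minor, patch) schema version.
type Version = (u64, u64, u64);

/// Returns a human-readable reason when the updater's target version is
/// older than the version already in the database, and `None` otherwise.
/// Tuples compare lexicographically, so `(1, 9, 0) < (2, 0, 0)`.
fn refusal_reason(desired: Version, found: Version) -> Option<String> {
    if desired < found {
        Some(format!(
            "refusing to update to version {}.{}.{}: \
             database is already at newer version {}.{}.{}",
            desired.0, desired.1, desired.2, found.0, found.1, found.2
        ))
    } else {
        None
    }
}

fn main() {
    // An old updater pointed at a newer deployment gets an explanation.
    if let Some(reason) = refusal_reason((1, 9, 0), (2, 0, 0)) {
        println!("{reason}");
    }
    // A normal upgrade produces no refusal.
    println!("{:?}", refusal_reason((2, 0, 0), (1, 9, 0)));
}
```

A message of this shape tells the person running the tool what was compared, without requiring a second command to find out.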

Collaborator Author

smklein commented Aug 27, 2025

Just added population of the db_metadata_nexus records in c5e4509, with tests in 403b1d7.

Next step: splitting this into smaller PRs, as we discussed in the update sync today.

smklein added a commit that referenced this pull request Aug 29, 2025
…es (#8924)

Split off of #8845

Creates the schema, ensures it stays up-to-date. Does not attempt to
read it.

First part of #8501: adding schema for records, writing them. Not yet
reading these records.
Development

Successfully merging this pull request may close these issues.

schema update: orchestration of the handoff from old to new Nexus
2 participants