
Conversation

@bkirwi (Contributor) commented Oct 27, 2025

This flag has been enabled in prod and staging for over a year. At this point it seems unlikely that we'll go back to the resource-intensive and now-untested impl.

IIUC this flag was also a bit risky: since we use it to decide which operator to render, if the config synced differently to different nodes we might render different operators in different workers and cause some really weird timely bugs.
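
To make that failure mode concrete, here's a minimal, hypothetical timely sketch (none of these names come from the Materialize codebase; `read_flag_somehow` is made up). Every timely worker has to build an identical dataflow graph, because operator and channel identifiers are assigned in construction order; if a dynamically synced flag could read differently on different workers at render time, the graphs diverge:

```rust
// Hypothetical sketch, not Materialize code: illustrates why per-worker reads
// of a dynamic flag during dataflow construction are risky in timely. If
// `use_new_impl` could differ across workers, the workers build structurally
// different dataflows and their operator/channel ids stop lining up.
use timely::dataflow::operators::{Exchange, Input, Inspect};

fn main() {
    timely::execute_from_args(std::env::args(), |worker| {
        // Stand-in for a dynamically synced config value; the hazard only
        // exists if this could read differently on different workers here.
        let use_new_impl = read_flag_somehow(worker.index());

        worker.dataflow::<u64, _, _>(|scope| {
            let (input, stream) = scope.new_input::<u64>();
            // Divergent construction: one worker adds an extra exchange
            // operator while another does not, so subsequent operator and
            // channel identifiers differ between workers.
            let stream = if use_new_impl {
                stream.exchange(|x| *x)
            } else {
                stream
            };
            stream.inspect(|x| println!("saw {x}"));
            // No data is sent; the point is the divergent graph construction.
            drop(input);
        });
    })
    .unwrap();
}

// Purely illustrative: a "config read" that is visibly worker-dependent.
fn read_flag_somehow(worker_index: usize) -> bool {
    worker_index % 2 == 0
}
```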

Motivation

txn-wal is a bit underowned at the moment, so it seems worthwhile to clean out the bits that we're not using.

@bkirwi marked this pull request as ready for review October 27, 2025 21:18
@bkirwi requested review from a team and aljoscha as code owners October 27, 2025 21:18
@teskje (Contributor) left a comment

🎉

@teskje (Contributor) commented Oct 28, 2025

> IIUC this flag was also a bit risky: since we use it to decide which operator to render, if the config synced differently to different nodes we might render different operators in different workers and cause some really weird timely bugs.

Fwiw, this shouldn't be an issue, because updates to ComputeState::worker_config are serialized with CreateDataflow commands in the compute protocol. If you know of a way this might still go wrong, I'd be very interested to know about it! There are other feature flags that also affect dataflow rendering and rely on this working correctly.

@DAlperin (Member) left a comment

I <3 deleting code

@bkirwi (Contributor, Author) commented Oct 28, 2025

> If you know of a way this might still go wrong, I'd be very interested to know about it!

I suppose if all the writes to the config reliably happen-before the reads on all workers, then things are probably fine! (Hopefully storage's mechanism is similarly robust here.) I've recently squashed a handful of bugs on the Persist side where poorly timed config updates could cause things to get out of sync, so I'm a bit extra sensitive to this class of risk...

@bkirwi merged commit d01f32e into MaterializeInc:main Oct 28, 2025
129 checks passed