Fix subsets inheriting service chaining requirement from whole app #3136

Merged

itowlson
Collaborator

Fixes #2915 and #3088

This explores the approach of putting host requirements on components instead of on the app, meaning that app splitting and multi-trigger Just Work without any shenanigans.

The approach is much nicer than the previous attempt in #3093, but needs a bit more testing. The error behaviour for existing trigger binaries is not lovely, but that may not be possible to control since those binaries are already out there (and it fixes itself when the triggers are rebuilt against these versions of the Spin crates).

@itowlson itowlson force-pushed the host-reqs-on-components-waah-waah-waah branch from 54361c4 to 5658cca Compare May 20, 2025 00:03
@itowlson itowlson marked this pull request as ready for review May 20, 2025 00:04
@itowlson itowlson requested review from lann and kate-goldenring May 20, 2025 00:04
@itowlson
Collaborator Author

This adds a must-understand entry to the lockfile. Thinking about what this will affect:

CNCF projects:

  • SpinKube: will likely need to update to work with OCIs pushed by versions that incorporate this
  • Trigger plugins (cron, sqs, mqtt, command): should update to work with the revved Spin CLI. The situations in which un-updated triggers fail will be ones in which they already fail today (albeit they may now fail with worse error messages), so I don't think this risks a huge regression. Once updated, they will work in situations where un-updated ones fail.

Vendor projects:

  • Fermyon (Cloud, FWF): services may take this update when suitable, but client plugins must not update until the services understand the new must-understand

I don't think this needs a lockfile version bump - the existing must-understand framework handles it, although the error messaging could do with improvement. Open to being persuaded otherwise of course though!

As well as the new integration test, I tested by rebuilding the cron trigger against the new Spin crates, and it appeared to run correctly in apps that included service-chained HTTP. Not sure about the practicalities of adding that to the integration tests though.

@itowlson
Collaborator Author

@lann @kate-goldenring sorry for the badger but if one of you could take a look I would really appreciate it. Thanks!

@lann
Collaborator

lann commented May 28, 2025

> The error behavior for existing trigger binaries is not lovely

> This adds a must-understand entry to the lockfile.

I wonder if we could avoid these through the power of Making It Even More Complicated:

  • Add per-component service chaining host requirement
    • Any requirements added here are also added to host requirements (!)
  • Complexify ensure_needs_only:
    • Check component host requirements first
      • Add all of these to "all component host requirements" set
    • Check app host requirements, skipping any in "all component host requirements" set
  • Don't add must_understand component_host_requirements; they should be conservatively enforced by being added to host requirements
  • [future] Bump spin_lock_version and shove all of this logic in a deep dark "v1 compat" hole

@itowlson
Collaborator Author

@lann I don't think that works without upgrading the triggers (which are where ensure_needs_only lives); and if we are allowed to upgrade the triggers then the problem goes away.

(I am also not sure it works anyway. Or I am misunderstanding it. Consider an app with HTTP and cron components. If an HTTP component marks itself with a SC host requirement, that bubbles up to mark the application with a SC host requirement. Then the cron trigger looks at the cron components, does not see a SC host requirement on any of them, so does not add SC to the "all component host reqs" bucket. Then it looks at application host reqs and finds "SC required", which has not been ticked off at the component level. So bam. But I feel like I must be mis-reading what you're proposing.)
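To make the failure case concrete, here is a minimal sketch of the first proposal as read in the comment above. It is not Spin's code: `Comp`, `first_proposal_check`, the trigger names, and the `service_chaining` requirement string are all hypothetical stand-ins chosen for illustration.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified component: which trigger owns it and what it needs.
struct Comp {
    trigger: &'static str,
    reqs: BTreeSet<&'static str>,
}

// First proposal, as read above: tick off requirements declared by the
// trigger's OWN components, then check whatever app-level requirements remain.
fn first_proposal_check(
    app_reqs: &BTreeSet<&str>,
    components: &[Comp],
    my_trigger: &str,
    supported: &[&str],
) -> Result<(), String> {
    let mut ticked: BTreeSet<&str> = BTreeSet::new();
    for c in components.iter().filter(|c| c.trigger == my_trigger) {
        for req in &c.reqs {
            if !supported.contains(req) {
                return Err(format!("component requires unsupported feature '{}'", req));
            }
            ticked.insert(*req);
        }
    }
    for req in app_reqs {
        // A requirement bubbled up from ANOTHER trigger's component was never
        // ticked off here, so it is treated as a real app-level requirement.
        if ticked.contains(*req) {
            continue;
        }
        if !supported.contains(req) {
            return Err(format!("app requires unsupported feature '{}'", req));
        }
    }
    Ok(())
}
```

With an HTTP component that requires `service_chaining` and a cron component that requires nothing, the bubbled-up app requirement makes this check fail for a cron trigger that does not support service chaining, which is the "bam" described above.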

@lann
Collaborator

lann commented May 28, 2025

> I don't think that works without upgrading the triggers (which are where ensure_needs_only lives); and if we are allowed to upgrade the triggers then the problem goes away.

I'm only trying to help with the "worse error message" part here, not the "need to upgrade triggers" part; I don't think it's any worse from that perspective? 🤔

An old trigger should see essentially the same thing it would have seen before, plus the new component host requirements field which it should dutifully ignore.

> I am also not sure it works anyway.

Ah, I wasn't thinking carefully enough about app splitting, but I think it can be salvaged:

  • Collect component host requirements from all components in the app -> "all component host requirements"
  • Check app host requirements, skipping any in "all component host requirements" set
  • Check component host requirements for the components you actually care about
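The three steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the real `ensure_needs_only`: the types `LockedAppSketch` and `ComponentSketch`, the function name `check_subset_requirements`, and the requirement string used in the usage note are hypothetical, and requirements are modelled as plain strings for brevity.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified stand-ins for the lockfile types.
struct ComponentSketch {
    id: String,
    host_requirements: BTreeSet<String>,
}

struct LockedAppSketch {
    // App-level requirements, including any bubbled up from components.
    host_requirements: BTreeSet<String>,
    components: Vec<ComponentSketch>,
}

// Checks that `supported` covers everything the given subset of components needs.
fn check_subset_requirements(
    app: &LockedAppSketch,
    my_component_ids: &[&str],
    supported: &[&str],
) -> Result<(), String> {
    // Step 1: collect requirements from ALL components in the app,
    // not only the ones this trigger will run.
    let all_component_reqs: BTreeSet<&str> = app
        .components
        .iter()
        .flat_map(|c| c.host_requirements.iter().map(String::as_str))
        .collect();

    // Step 2: check app-level requirements, skipping any that are really
    // component requirements that bubbled up.
    for req in &app.host_requirements {
        if all_component_reqs.contains(req.as_str()) {
            continue;
        }
        if !supported.contains(&req.as_str()) {
            return Err(format!("app requires unsupported host feature '{}'", req));
        }
    }

    // Step 3: check requirements of the components this trigger cares about.
    for c in app
        .components
        .iter()
        .filter(|c| my_component_ids.contains(&c.id.as_str()))
    {
        for req in &c.host_requirements {
            if !supported.contains(&req.as_str()) {
                return Err(format!(
                    "component '{}' requires unsupported host feature '{}'",
                    c.id, req
                ));
            }
        }
    }
    Ok(())
}
```

In this sketch, a trigger that runs only a component with no requirements passes even when a sibling component's requirement (say, a hypothetical `service_chaining`) has bubbled up to the app, while a trigger that runs the requiring component still fails if it does not support the feature.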

@itowlson
Collaborator Author

Okay, I think I get you now - thanks! Let me try to echo it back to make sure. The proposal is that enlightened triggers figure out which application host requirements are actually component host requirements that got bubbled up, and ignore them: they then validate on any remaining app host reqs, and any component host reqs from components in their subset. Whereas unenlightened triggers just see the (bubbled) application host requirements and validate against them.

This sounds like it will work, but I'm not sure it's worth the candle for what I hope will be a transitional phase. I suppose we could make component host reqs must-understand once most of our "known" triggers have completed that transition, and then get rid of the complexity.

For context, the unlovely error from un-upgraded triggers is something like "unknown untagged enum variant." It's unlovely but I'm not sure how much effort to invest in getting rid of it. The real fix for this is to make must-understands into strings instead of an enum - again, once that flows through to triggers this will all become a non-issue.
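A rough, std-only sketch of why strings would give nicer errors than a closed enum. Everything here is an assumption for illustration: `KnownMustUnderstand`, `parse_closed`, and `check_open` are hypothetical names, the hand-written `match` merely stands in for whatever the real deserializer does, and the `host_requirements` entry name is assumed (only `component_host_requirements` appears in this thread).

```rust
// Closed set: an old binary only knows the variants compiled into it.
#[derive(Debug, PartialEq)]
enum KnownMustUnderstand {
    HostRequirements,
}

// Stand-in for enum deserialization: an unknown entry fails while parsing,
// before the trigger gets a chance to explain what went wrong.
fn parse_closed(entry: &str) -> Result<KnownMustUnderstand, String> {
    match entry {
        "host_requirements" => Ok(KnownMustUnderstand::HostRequirements),
        other => Err(format!("unknown variant `{}`", other)),
    }
}

// Open set: entries stay as strings, so parsing always succeeds and the
// trigger can report exactly which entries it does not understand.
fn check_open(entries: &[&str], understood: &[&str]) -> Result<(), String> {
    let unknown: Vec<&str> = entries
        .iter()
        .copied()
        .filter(|e| !understood.contains(e))
        .collect();
    if unknown.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "this app uses features this trigger does not understand: {}; try upgrading the trigger",
            unknown.join(", ")
        ))
    }
}
```

The design point is only that the open form defers the "do I understand this?" decision until after parsing, which is where a friendlier message can be produced.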

@itowlson
Collaborator Author

(to be clear, thank you for brainstorming ideas and I'm happy to go ahead with this if you think it's worth the extra complexity)

@lann
Collaborator

lann commented May 28, 2025

I don't mind not doing this; it would definitely throw another flapjack on our towering stack of complexity.

@itowlson itowlson merged commit 40e88fa into spinframework:main May 28, 2025
17 checks passed
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

  • Local Service Chaining blocks spin start
  • Feature check is not properly done at trigger level
2 participants