Spin Improvement Proposal: Modularizing spin.toml #3073

Open. Wants to merge 3 commits into base: main

264 changes: 264 additions & 0 deletions docs/content/sips/022-modularizing-spin-manifests.md
@@ -0,0 +1,264 @@
title = "SIP 022 - Modularizing spin.toml"
template = "main"
date = "2025-03-31T12:00:00Z"

---

Summary: Support moving component definitions into separate `spin.toml` files, similar to Cargo workspaces.

Owner(s): [[email protected]](mailto:[email protected])

Created: March 31, 2025

# Background

As Spin applications grow in complexity, their `spin.toml` manifest files can become large and unwieldy. Large applications with many triggers and components can result in manifest files that span hundreds of lines, making them difficult to navigate, understand, and maintain. It’s very hard to keep related settings grouped, making the application architecture harder to follow.

The current monolithic nature of `spin.toml` files also means that we tend to be very conservative about adding features that’d require additions to the manifest. This proposal is directly motivated by the desire to add features that’d make `spin.toml` files in their current form substantially more unwieldy; those features will be described in separate proposals.

Besides these specific current goals, moving to a modular manifest format will bring Spin into alignment with other related formats, such as Rust’s [Cargo.toml](https://doc.rust-lang.org/cargo/reference/workspaces.html) or npm’s [package.json](https://docs.npmjs.com/cli/v11/using-npm/workspaces).

Modularizing `spin.toml` files has come up previously, with an attempt at implementing it [here](https://github.com/spinframework/spin/pull/2396).

# Proposal

`spin.toml` files contain multiple different kinds of items: application-level metadata, variables, trigger definitions, and component definitions. I propose to support modularizing manifests in the following way:

- Manifests can only be nested one level deep: an application can have a single top-level manifest, and zero or more direct sub-manifests, which cannot have any sub-manifests themselves
Contributor

What's the motivation for restricting to 1-level nesting?

- Application-level metadata must be fully contained in the application’s top-level manifest
- Triggers must be defined in the top-level manifest
Collaborator

It would be useful to say more about the reason for this restriction.

I'm getting the sense that you have a mental model for a concrete thing that a sub-manifest represents, that motivates some of the decisions here? If so it would be useful to talk about it here, even if it is distant and aspirational!

- Variables can be contained in all manifests
  - A variable defined in the top-level manifest is visible in all manifests
  - A variable defined in a sub-manifest is only visible in that manifest
Collaborator

It would be good to elaborate on this scoping rule, as it requires Spin to track visibility in a way we currently don't. At the moment, spin up assembles everything into a single lockfile, which is passed to the trigger: components' template values are resolved when the trigger loads the lockfile. Enforcing visibility rules seems like it would mean either multiple lockfiles or tracking provenance in the lockfile. Maybe we can get around it by munging names, but then we need to map them back again to get values from providers.

Another facet of this (and, I'd guess, part of the motivation) is what happens if two sub-manifests define the same variable name. The proposed scoping rule appears to allow this, which is good because it means I can pull in off-the-shelf sub-manifests and not worry about them clashing. But again we need to better understand how we envisage representing that name in the lockfile.

In theory, I guess we could unify variables with the same name, since they will end up backing onto the same provider-level value (e.g. environment variable or Vault key). But in practice different defaults will make this tricky.

Contributor @fibonacci1729, Apr 22, 2025

What is our collective appetite 🌯 for revving the locked app format to codify these visibility rules? In any case, I agree that this rule needs to be ironed out a bit.

Contributor @fibonacci1729, Apr 22, 2025

I think my vote would be to drop these scoping rules for now and require variables be defined in the root manifest.

Collaborator

@fibonacci1729 I think @tschneidereit is working on something that elaborates on his long-term "north star" thinking, which I believe is what is motivating these rules. I feel we should hold off until we see that and can decide whether to aim for it directly (at the cost of possibly revving the lockfile) or to do something expedient to solve the immediate problem (at the expense of possible rework and possible compatibility quirks when we do move towards the aspiration).

Specifically I get hints from this document of a vision of sub-manifested components as units of packaging, so that this becomes more than spicy lexical inclusion. And if so (big if!), there is some conceptual stuff we need to understand about sandboxing and permissions and scoping (e.g. can a component sniff values from the top level without an explicit grant, how do we namespace stuff so components don't need to coordinate on naming, etc.). I may be over-inferring though!

- Components can be defined in the top-level manifest and in sub-manifests
  - Only a single component can be defined in each sub-manifest
Collaborator

It would be useful to say more about the reason for this restriction.

(ETA: you do, but much later - consider a forward link.)


## Example

To give a high-level overview of the proposal’s impact before discussing the details of the syntax, let’s look at an [existing](https://github.com/fermyon/ai-examples/blob/main/sentiment-analysis-ts/spin.toml) `spin.toml` file first:

```toml
# (Source: https://github.com/fermyon/ai-examples/blob/main/sentiment-analysis-ts/spin.toml)
spin_manifest_version = 2

[application]
name = "sentiment-analysis"
version = "0.1.0"
authors = ["Caleb Schoepp <[email protected]>"]
description = "A sentiment analysis API that demonstrates using LLM inference and KV stores together"

[[trigger.http]]
route = "/api/..."
component = "sentiment-analysis"

[component.sentiment-analysis]
source = "target/spin-http-js.wasm"
allowed_outbound_hosts = []
exclude_files = ["**/node_modules"]
key_value_stores = ["default"]
ai_models = ["llama2-chat"]

[component.sentiment-analysis.build]
command = "npm run build"
watch = ["src/**/*", "package.json", "package-lock.json"]

[[trigger.http]]
route = "/..."
component = "ui"

[component.ui]
source = { url = "https://github.com/fermyon/spin-fileserver/releases/download/v0.0.1/spin_static_fs.wasm", digest = "sha256:650376c33a0756b1a52cad7ca670f1126391b79050df0321407da9c741d32375" }
allowed_outbound_hosts = []
files = [{ source = "../sentiment-analysis-assets", destination = "/" }]

[variables]
kv_explorer_user = { required = true }
kv_explorer_password = { required = true }

[[trigger.http]]
component = "kv-explorer"
route = "/internal/kv-explorer/..."

[component.kv-explorer]
source = { url = "https://github.com/fermyon/spin-kv-explorer/releases/download/v0.10.0/spin-kv-explorer.wasm", digest = "sha256:65bc286f8315746d1beecd2430e178f539fa487ebf6520099daae09a35dbce1d" }
allowed_outbound_hosts = ["redis://*:*", "mysql://*:*", "postgres://*:*"]
# add or remove stores you want to explore here
key_value_stores = ["default"]

[component.kv-explorer.variables]
kv_credentials = "{{ kv_explorer_user }}:{{ kv_explorer_password }}"
```

Here’s what a modular version of this manifest could look like:

```toml
# [project]/spin.toml

spin_manifest_version = ?

[application]
name = "sentiment-analysis"
version = "0.1.0"
authors = ["Caleb Schoepp <[email protected]>"]
description = "A sentiment analysis API that demonstrates using LLM inference and KV stores together"

# Components defined in sub-manifests must be included in this list.
# Entries are treated as folder names, relative to the folder containing the application manifest.
Contributor

So would the directory be laid out as follows?

├── spin.toml
├── sentiment-analysis
│   └── spin.toml
└── kv-explorer
    └── spin.toml

Is there a way we could allow for exact file paths, too?

components = ["sentiment-analysis", "kv-explorer"]

[[trigger.http]]
route = "/api/..."
component = "sentiment-analysis"

[[trigger.http]]
route = "/..."
component = "ui"

[[trigger.http]]
component = "kv-explorer"
route = "/internal/kv-explorer/..."

# The UI component is defined inline, because it's small.
[component.ui]
source = { url = "https://github.com/fermyon/spin-fileserver/releases/download/v0.0.1/spin_static_fs.wasm", digest = "sha256:650376c33a0756b1a52cad7ca670f1126391b79050df0321407da9c741d32375" }
allowed_outbound_hosts = []
files = [{ source = "../sentiment-analysis-assets", destination = "/" }]
```

```toml
# [project]/sentiment-analysis/spin.toml

# Sections and keys don't need to be scoped explicitly.
[component]
name = "sentiment-analysis"
description = "Component-specific description"
# Certain keys can be inherited from the application manifest
authors.application = true
version.application = true
source = "target/spin-http-js.wasm"
Collaborator

Relative to sub-manifest directory not manifest directory

Contributor

It feels weird to me to have this be relative to the sub manifest. Shouldn't this already be defined in the sub manifest?

Collaborator

@kate-goldenring Source being relative to the sub-manifest allows the sub-manifest directory to be portable and self-sufficient. What's the alternative? The only one I can think of is "relative to the top-level manifest" which would require the sub-manifest to make assumptions about its own path.

...or am I misunderstanding the comment? "already be defined in the sub-manifest" makes me worry we are talking at cross purposes...

Contributor

If it is going to be relative to the component's directory, why not just define the source path there instead of in the top-level manifest?

Collaborator

I am utterly confused, sorry. We are not defining the source path (for a modularised component) in the top-level manifest. We are defining it in the sub-manifest (which is, presumably, in the component directory).

allowed_outbound_hosts = []
exclude_files = ["**/node_modules"]
key_value_stores = ["default"]
Collaborator

Permissions is an interesting one and this is another case where I'd find it useful to hear your long-term vision for this. If this is purely, purely, to help a developer organise a big file, then this is not a worry. But talking about "authors" and "version" fields implies that you envision an ecosystem and distribution mechanism for sub-manifests (dare I call them... components). And in that case it maybe feels like the application should be in charge of granting permissions. I don't want to upgrade "EvilComponent" and discover too late that it's invited itself to my production database...!

ai_models = ["llama2-chat"]

# Additional sections don't need to be namespaced under [component]
[build]
command = "npm run build"
Collaborator

Possibly worth highlighting that this is implicitly relative to the sub-manifest directory not the manifest directory (and similarly build.workdir is relative to the sub-manifest directory).

watch = ["src/**/*", "package.json", "package-lock.json"]
```

```toml
# [project]/kv-explorer/spin.toml

[component]
name = "kv-explorer"
source = { url = "https://github.com/fermyon/spin-kv-explorer/releases/download/v0.10.0/spin-kv-explorer.wasm", digest = "sha256:65bc286f8315746d1beecd2430e178f539fa487ebf6520099daae09a35dbce1d" }
allowed_outbound_hosts = ["redis://*:*", "mysql://*:*", "postgres://*:*"]
# add or remove stores you want to explore here
key_value_stores = ["default"]

# Locally defined variables don't need to be namespaced manually.
[variables]
user = { required = true }
password = { required = true }

# Variables exposed to content (this needs bikeshedding!)
[variables.exposed]
Collaborator

Maybe component.variables since that's how it's done in top level components?

kv_credentials = "{{ user }}:{{ password }}"
```

As can be seen in this example, this modular structure allows us to not only isolate aspects of individual components in their respective manifests, but also reduce repetition, since keys and sections don’t have to be explicitly scoped to the component anymore.
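The equivalence between the unscoped sub-manifest form and today's explicitly scoped form can be sketched as a small transformation. This is only an illustration, not how Spin implements or would implement it: `expand_sub_manifest` and `SCOPED_SECTIONS` are hypothetical names, and the input is assumed to be an already parsed sub-manifest (for example the dict a TOML parser would return).

```python
from copy import deepcopy

# Assumption: sections that sit at the top level of a sub-manifest but belong
# to its component. The proposal's example only shows [build].
SCOPED_SECTIONS = ("build",)


def expand_sub_manifest(sub: dict) -> dict:
    """Return the explicitly scoped equivalent of a parsed sub-manifest."""
    component = deepcopy(sub["component"])
    name = component.pop("name")  # the name becomes the table key
    for section in SCOPED_SECTIONS:
        if section in sub:
            # [build] in the sub-manifest becomes [component.<name>.build]
            component[section] = deepcopy(sub[section])
    return {"component": {name: component}}


sub_manifest = {
    "component": {
        "name": "sentiment-analysis",
        "source": "target/spin-http-js.wasm",
        "key_value_stores": ["default"],
    },
    "build": {"command": "npm run build"},
}

expanded = expand_sub_manifest(sub_manifest)
print(expanded)
```

The result has the same shape as the `[component.sentiment-analysis]` and `[component.sentiment-analysis.build]` tables in the original monolithic manifest, which suggests that existing consumers of the merged manifest would not need to change for this part of the proposal.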

## Details

### Sub-manifests for components

In this proposal, only components (and variables only accessible locally) can be defined in sub-manifests; triggers still need to be defined in the top-level manifest.

There are multiple reasons for this constraint:

1. Triggers are application-level concerns; they describe the application’s structure and high-level organization. As such, keeping them “at the surface” seems right.
2. Trigger definitions tend to be very short, so moving them into sub-manifests individually as is done for components seems very boilerplate-y.
3. Trigger definitions don’t come with their own assets, so any folder containing a sub-manifest for them would only ever contain the manifest, and nothing else.
Collaborator

Mostly triggers don't come with component-related config, but we do have an oddity at the moment where WAGI execution has to be specified on the trigger not the component even though it's the component interface that mandates it. (I think this is to allow different triggers to set different argvs on the same component. Not sure though.)

4. If, on the other hand, we were to allow defining triggers together with components in the same sub-manifest, that’d make it very hard to reason about the application’s structure—see the first point above.
   1. It’s common to have multiple triggers using the same component, e.g. to expose it under multiple routes using multiple HTTP triggers. That means we’d want to allow multiple trigger definitions per sub-manifest, making the previous point even stronger.

### One component per component manifest

Similar to other tools such as Cargo or npm, under this proposal Spin would only support defining a single component in each sub-manifest. This reduces boilerplate in each manifest, because sections such as `[build]` are now unambiguously scoped and don’t need to be expanded to `[component.foo.build]`. It also makes it possible to grasp the application’s general structure just by looking at the application manifest, instead of having to look at each sub-manifest to find the full list of components.

*Note that this doesn’t preclude future additions for defining nested components visible only in the scope of an individual top-level component, and destined for composition into said top-level component.*
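The structural constraints on sub-manifests (no application metadata, no triggers, no further nesting, exactly one component) could be checked with a small validation pass. The sketch below is an assumption-laden illustration: `validate_sub_manifest` is a hypothetical helper, it operates on an already parsed manifest, and it assumes the `components` list is a top-level key, which the proposal does not pin down.

```python
def validate_sub_manifest(sub: dict) -> list[str]:
    """Collect violations of the proposed sub-manifest rules."""
    errors = []
    if "application" in sub:
        errors.append("application metadata must live in the top-level manifest")
    if "trigger" in sub:
        errors.append("triggers must be defined in the top-level manifest")
    if "components" in sub:
        # Assumed key location; manifests can only be nested one level deep.
        errors.append("sub-manifests cannot have sub-manifests of their own")
    component = sub.get("component")
    if not isinstance(component, dict) or "name" not in component:
        # [component.foo] style tables have no `name` key and are rejected here.
        errors.append("a sub-manifest must define exactly one named [component]")
    return errors


ok = {"component": {"name": "kv-explorer", "source": "explorer.wasm"}}
bad = {
    "application": {"name": "sentiment-analysis"},
    "trigger": {"http": [{"route": "/...", "component": "kv-explorer"}]},
    "component": {"kv-explorer": {"source": "explorer.wasm"}},
}

print(validate_sub_manifest(ok))
print(validate_sub_manifest(bad))
```

Returning a list of messages rather than failing on the first problem lets a tool report every violation in a sub-manifest at once.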

### Support for defining multiple components in the application manifest

This proposal would not take away support for defining components using the existing syntax; it’d purely add support for the modular approach. The reason is that there are component definitions that are very light-weight; moving them into separate folders would add boilerplate and cognitive overhead, not reduce it.

### Inheriting definitions from the application manifest

Similar to [crates in a Cargo workspace](https://doc.rust-lang.org/1.85.1/cargo/reference/workspaces.html#the-package-table), component manifests can inherit certain definitions from the application manifest, such as the version, description, and authors:
Collaborator

Perhaps say more about why you feel this would be useful? Cargo cares because crates are units of distribution and deployment.


```toml
# [project]/spin.toml

[application]
version = "0.1.0"
...

# [project]/my-component/spin.toml
...
# Inherit the `version` field from the application
version.application = true
```
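A minimal sketch of how `key.application = true` markers might be resolved, assuming both manifests have already been parsed. `resolve_inherited` and the `INHERITABLE` set are hypothetical; the proposal names version, description, and authors only as examples, so the exact set is an assumption.

```python
# Assumption: the keys a component may inherit from [application].
INHERITABLE = ("version", "description", "authors")


def resolve_inherited(application: dict, component: dict) -> dict:
    """Replace `{application = true}` markers with the application's values."""
    resolved = {}
    for key, value in component.items():
        is_marker = (
            isinstance(value, dict)
            and len(value) == 1
            and value.get("application") is True
        )
        if not is_marker:
            resolved[key] = value  # ordinary keys, including inline tables
        elif key not in INHERITABLE:
            raise ValueError(f"`{key}` cannot be inherited from the application")
        elif key not in application:
            raise KeyError(f"application manifest does not define `{key}`")
        else:
            resolved[key] = application[key]
    return resolved


application = {"name": "sentiment-analysis", "version": "0.1.0"}
component = {
    "name": "my-component",
    "version": {"application": True},
    "source": {"url": "https://example.invalid/c.wasm", "digest": "sha256:..."},
}

print(resolve_inherited(application, component))
```

Matching only a table whose sole entry is `application = true` keeps other inline tables, such as a `source` with `url` and `digest`, from being mistaken for inheritance markers.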

### Handling of variables
Contributor

I am having a hard time understanding what makes variables unique compared to other resources, such as environment variables, files, and kv stores. Can you explain why variables are component scoped but other resources are not?


Variables are among the trickiest aspects to sort out in this approach. The proposal aims to make it easy to use both application-wide and component-scoped variables. An additional design goal is to support adoption without changes to deployment pipelines and their handling of variables.

To achieve these goals, the proposal

- treats application-level variables as inheritable in the same way as the definitions mentioned above
- namespaces component-level variables by prefixing them with the component name when interpreting manifests

To illustrate this, consider these manifests:

```toml
# [project]/spin.toml

...
[variables]
my_variable = { default = "my-value" }
...

components = ["my-component"]
```

```toml
# [project]/my-component/spin.toml

...
# Inherit the `my_variable` variable from the application
[variables]
my_variable.application = true
Collaborator

Why? This seems like busywork.

my_other_variable = { default = "some-value" }
...
```

Combined, they’d be interpreted like this:

```toml
...
[variables]
my_variable = { default = "my-value" }
my_component_my_other_variable = { default = "some-value" }
...
```

Any uses in template strings are adapted to use the prefixed variable names.
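The prefixing and template rewriting described here can be sketched as follows. This is an illustration under stated assumptions, not Spin's implementation: `namespace_variables` is a hypothetical helper, the prefix is assumed to be the component name with hyphens replaced by underscores (as the `my_component_` example suggests), and templates are assumed to use the `{{ name }}` form seen in the manifests above.

```python
import re

# Matches `{{ name }}` template expressions.
TEMPLATE_VAR = re.compile(r"\{\{\s*([a-z0-9_]+)\s*\}\}")


def namespace_variables(component_name: str, variables: dict, templates: dict):
    """Prefix component-local variables and rewrite templates to match."""
    prefix = component_name.replace("-", "_") + "_"
    final_names = {}  # name as written in the sub-manifest -> merged name
    merged = {}       # definitions to add to the application's [variables]
    for name, definition in variables.items():
        if definition == {"application": True}:
            final_names[name] = name  # inherited: keeps its application name
        else:
            final_names[name] = prefix + name
            merged[prefix + name] = definition

    def rewrite(template: str) -> str:
        # Names not declared in this sub-manifest are left untouched.
        return TEMPLATE_VAR.sub(
            lambda m: "{{ " + final_names.get(m.group(1), m.group(1)) + " }}",
            template,
        )

    return merged, {key: rewrite(value) for key, value in templates.items()}


merged, rewritten = namespace_variables(
    "my-component",
    {
        "my_variable": {"application": True},
        "my_other_variable": {"default": "some-value"},
    },
    {"greeting": "{{ my_variable }}/{{ my_other_variable }}"},
)
print(merged)
print(rewritten)
```

Note that under this scheme the prefixed name is what a variable provider would see, which is the question raised about secrets stored under the original name.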
Collaborator

Is this saying that Spin has to rewrite template strings to the new names as part of lockfile creation?

How does this affect providers? Do secrets in Vault need to use the scoped name?


### Exposing variables to content

Currently, manifests have the `[variables]` table for defining variables available to use in the manifest itself, e.g. as part of template strings. They also have `[component.name.variables]` tables for exposing variables to content.

In component manifests, the role of the latter is taken by the new `[variables.exposed]` table.
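To make the mapping concrete, here is a sketch of how a parsed component manifest's `[variables]` table could be split into manifest-level variables and the values exposed to content. `split_variables` is a hypothetical helper, and the output shape simply mirrors today's `[component.name.variables]` table.

```python
def split_variables(sub: dict):
    """Separate manifest-level variables from those exposed to content."""
    variables = dict(sub.get("variables", {}))  # shallow copy; input untouched
    # [variables.exposed] parses as a nested table under `variables`.
    exposed = variables.pop("exposed", {})
    name = sub["component"]["name"]
    return variables, {"component": {name: {"variables": exposed}}}


kv_explorer = {
    "component": {"name": "kv-explorer"},
    "variables": {
        "user": {"required": True},
        "password": {"required": True},
        "exposed": {"kv_credentials": "{{ user }}:{{ password }}"},
    },
}

manifest_vars, content_vars = split_variables(kv_explorer)
print(manifest_vars)
print(content_vars)
```

One consequence of nesting the table under `[variables]` is that a variable literally named `exposed` would collide with it, since TOML does not allow the same key to be defined twice. That may be worth considering when bikeshedding the table name.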

*Note: we should evaluate whether this table is really needed. Can we instead just expose all variables in a component manifest’s `[variables]` table to content directly?*
Collaborator

I don't think we can. The term "variable" has two meanings:

  • A configuration knob exposed to the operator
  • A value exposed to the guest

The decoupling is not just for sandboxing, but so that configuration knobs are not required to line up with what the component sees. E.g. if I publish a component which looks at a "db-url" value, and you publish a component that looks at "db-host" and "db-port" values, the decoupling allows the app developer to define two knobs for the operator, which the application surfaces in different ways to meet different component expectations.

This may have turned out to be generality that we don't need (or won't need until a broader component ecosystem evolves and uses wasi-config), but I'd be cautious about backing it out, and doubly cautious about backing it out for the special case of sub-manifests. It may be that sub-manifest local variables are so tightly internally coupled to their components that it's a reasonable call though - just want to be cautious in the evaluation!
