Description
We have a bit of a fundamental issue with running initialization in `linearization_function` and the like. Note that all of this discussion is in the context of #3347, on the assumption it is merged. With that PR, any initial conditions of the form `u0 = [x => 1, y => 2]` will be turned into the equations `x ~ x0_tmp, y ~ y0_tmp` in the initialization system. The new parameters `x0_tmp` and `y0_tmp` are only present in the initialization system, and take whatever value `x` and `y` have in the `ODEProblem`. This allows mutating `prob[x] = 2.0` and automatically getting the equivalent of `x ~ 2.0` in initialization.
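For concreteness, here is a minimal sketch of the post-#3347 workflow described above. The system, solver, and values are made up for illustration; `x0_tmp`/`y0_tmp` are the internal initialization parameters and are never referenced by the user:

```julia
using ModelingToolkit, OrdinaryDiffEq
using ModelingToolkit: t_nounits as t, D_nounits as D

# Illustrative two-state system.
@variables x(t) y(t)
@parameters a
@named sys = ODESystem([D(x) ~ -a * x + y, D(y) ~ -y], t)
sys = structural_simplify(sys)

# u0 = [x => 1, y => 2] becomes x ~ x0_tmp, y ~ y0_tmp in the
# initialization system instead of the hardcoded x ~ 1, y ~ 2.
prob = ODEProblem(sys, [x => 1.0, y => 2.0], (0.0, 1.0), [a => 1.0])

# Mutating the problem now behaves like x ~ 2.0 in initialization;
# no remake is required.
prob[x] = 2.0
sol = solve(prob, Tsit5())
```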
As with all good things, there's a catch. This transformation only happens for variables that are unknowns in the system. If `z` is an observed variable (present as an equation of the form `z ~ ...` in `observed(sys)`) then `u0 = [z => 1.0]` will be hardcoded in the initialization system as `z ~ 1.0`. This is because even if `z0_tmp` existed, where would it get its value from in the `ODEProblem`? `z` isn't in the state vector; it is calculated from the state vector. We could add `z0_tmp` as a parameter to the `ODESystem`, but the user doesn't know about this parameter. It would have a name with some unicode to prevent collisions, which automatically makes it difficult to type. Since systems are immutable, this would also mean that the new parameter is only present in the system stored in `prob.f.sys`, not in the system the user has, the one they passed to `ODEProblem`. We'd also need to be very careful about how this parameter is managed, because it would be very easy for the parameters in `prob.f.sys` to end up with different indexes than those in `sys`, so that getter functions made using `sys` won't work on the actual value providers, which is unacceptable.
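For contrast, a sketch of the observed case (again with a made-up system): the same kind of `u0` entry gets baked into the initialization system because `z` is not an unknown:

```julia
using ModelingToolkit, OrdinaryDiffEq
using ModelingToolkit: t_nounits as t, D_nounits as D

@variables x(t) y(t) z(t)
@named obs_sys = ODESystem([D(x) ~ -x + z,
                            D(y) ~ -y,
                            z ~ x + y], t)
# After simplification z is no longer an unknown; it lives in observed(obs_sys).
obs_sys = structural_simplify(obs_sys)

# z => 1.0 is hardcoded as z ~ 1.0 in the initialization system. There is no
# z0_tmp parameter for the problem to update, so this constraint cannot be
# changed later by mutating the problem.
oprob = ODEProblem(obs_sys, [x => 1.0, z => 1.0], (0.0, 1.0);
                   guesses = [y => 0.0])
```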
Where does linearization come into this? Well, for `ODEProblem`, if the user wants to change the initial condition for `z`, they can simply call `remake` and pass `z => 2.0`. This will rebuild the initialization system accordingly. This approach isn't an option for linearization, because `linearization_function` is meant to return a function that can be called with multiple operating points and run very quickly, on account of being called very often. Effectively, `linearization_function` returns a `LinearizationProblem` for which the intended use case is the equivalent of calling symbolic `remake` many times in a loop. "Many" can be upwards of 6000 times on relatively small problems. A challenge with UX here is that the system the user passes to `linearization_function` is not actually the system that is used for code generation. It goes through `io_preprocessing` to mark some variables as inputs/outputs and hence structurally change the system. This is even worse in e.g. `sensitivity_function`, which adds equations and variables before calling `linearization_function`. As a result, the user has no control and almost no intuition or feedback for which variables are observed and which are unknowns in the system used to build initialization. This is of course bad: the user should be able to tune what they want without being concerned about whether it's an unknown or not, and currently they can't.
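For reference, the `remake` workflow mentioned above looks roughly like this, continuing the observed-variable sketch (exact keyword handling may vary across SciMLBase/MTK versions):

```julia
using SciMLBase: remake

# remake re-runs the symbolic problem construction, so the new value for the
# observed variable z gets hardcoded into a fresh initialization system
# (z ~ 2.0 instead of z ~ 1.0).
oprob2 = remake(oprob; u0 = [z => 2.0])
```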
One possible way to handle this is the same `z0_tmp` approach, but since MTK has control over the object returned from `linearization_function` (instead of lower-level SciML infrastructure, as is the case with `ODEProblem`) it can update the "`LinearizationProblem`" appropriately. The downsides here are:
- The user has to pass an operating point containing all the variables they want to tune to `linearization_function`, so that the appropriate `var ~ var0_tmp` equations are created.
- The object returned from `linearization_function` is stateful in a much more indirect way. If the user passes `z => 2.0` as part of `op` to `linearize(sys, lin_fun; op)`, it will mutate `lin_fun`. The next time `linearize` is called, `z0_tmp` will have the value `2` if the user doesn't pass a new value for `z`. If they called the two `linearize`s in the reverse order, this wouldn't be the case and they'd get a different result out. Moreover, the user doesn't know whether a variable passed to `op` will do this mutation, so they can't schedule their `linearize` calls appropriately either. A sketch of this order dependence follows the list.
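To make the second downside concrete, here is a hypothetical call sequence under the proposed `z0_tmp` approach. This is not current behavior; `sys`, `inputs`, `outputs`, `z`, and `x` are placeholders for some system with an observed variable `z` and an unknown `x`:

```julia
# Hypothetical: pass every variable the user may want to tune as part of the
# operating point so the corresponding var ~ var0_tmp equations are created.
lin_fun, simplified_sys = linearization_function(sys, inputs, outputs;
                                                 op = [z => 1.0, x => 1.0])

# First call: z => 2.0 in op mutates lin_fun, setting z0_tmp = 2.0.
mats1 = linearize(sys, lin_fun; op = Dict(z => 2.0))

# Second call: z is not passed, but z0_tmp is still 2.0 from the call above,
# so the result silently depends on what was linearized before it.
mats2 = linearize(sys, lin_fun; op = Dict(x => 1.5))

# Swapping the order of the two calls would leave z0_tmp at 1.0 for the
# x-only call and produce different matrices for the "same" operating point.
```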
@baggepinnen had phrased this problem nicely, which I'll paraphrase here: SciML follows a `prob[var] = value; solve(prob)` user workflow. Linearization wants a `solve(prob, values)` workflow, which is antithetical to how everything else is designed.
One way to look at this is: if we were to solve this problem for `ODEProblem` (allow changing initial conditions of observed variables without `remake`), how would we do it? What would the user API look like?