Description
We have a bit of a fundamental issue with running initialization in `linearization_function` and the like. Note that all of this discussion is in the context of #3347, on the assumption it is merged. With that PR, any initial conditions of the form `u0 = [x => 1, y => 2]` will be turned into the equations `x ~ x0_tmp, y ~ y0_tmp` in the initialization system. The new parameters `x0_tmp` and `y0_tmp` are only present in the initialization system, and take whatever value `x` and `y` have in the `ODEProblem`. This allows mutating `prob[x] = 2.0` and automatically getting the equivalent of `x ~ 2.0` in initialization.
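For concreteness, here is a minimal sketch of the post-#3347 workflow described above. The system, solver, and values are made up for illustration; `x0_tmp`/`y0_tmp` are the internal initialization parameters and are never referenced by the user:

```julia
using ModelingToolkit, OrdinaryDiffEq
using ModelingToolkit: t_nounits as t, D_nounits as D

# Illustrative two-state system.
@variables x(t) y(t)
@parameters a
@named sys = ODESystem([D(x) ~ -a * x + y, D(y) ~ -y], t)
sys = structural_simplify(sys)

# u0 = [x => 1, y => 2] becomes x ~ x0_tmp, y ~ y0_tmp in the
# initialization system instead of the hardcoded x ~ 1, y ~ 2.
prob = ODEProblem(sys, [x => 1.0, y => 2.0], (0.0, 1.0), [a => 1.0])

# Mutating the problem now behaves like x ~ 2.0 in initialization;
# no remake is required.
prob[x] = 2.0
sol = solve(prob, Tsit5())
```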
As with all good things, there's a catch. This transformation only happens for variables that are unknowns in the system. If `z` is an observed variable (present as an equation of the form `z ~ ...` in `observed(sys)`) then `u0 = [z => 1.0]` will be hardcoded in the initialization system as `z ~ 1.0`. This is because even if `z0_tmp` existed, where would it get its value from in the `ODEProblem`? `z` isn't in the state vector; it is calculated from the state vector. We could add `z0_tmp` as a parameter to the `ODESystem`, but the user doesn't know about this parameter. It would have a name with some unicode to prevent collisions, which automatically makes it difficult to type. Since systems are immutable, this would also mean that the new parameter is only present in the system stored in `prob.f.sys`, not in the system the user has, the one they passed to `ODEProblem`. We'd also need to be very careful about how this parameter is managed, because it would be very easy for the parameters in `prob.f.sys` to end up with different indexes than those in `sys`, so that getter functions made using `sys` won't work on the actual value providers, which is unacceptable.
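For contrast, a sketch of the observed case (again with a made-up system): the same kind of `u0` entry gets baked into the initialization system because `z` is not an unknown:

```julia
using ModelingToolkit, OrdinaryDiffEq
using ModelingToolkit: t_nounits as t, D_nounits as D

@variables x(t) y(t) z(t)
@named obs_sys = ODESystem([D(x) ~ -x + z,
                            D(y) ~ -y,
                            z ~ x + y], t)
# After simplification z is no longer an unknown; it lives in observed(obs_sys).
obs_sys = structural_simplify(obs_sys)

# z => 1.0 is hardcoded as z ~ 1.0 in the initialization system. There is no
# z0_tmp parameter for the problem to update, so this constraint cannot be
# changed later by mutating the problem.
oprob = ODEProblem(obs_sys, [x => 1.0, z => 1.0], (0.0, 1.0);
                   guesses = [y => 0.0])
```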
Where does linearization come into this? Well, for `ODEProblem`, if the user wants to change the initial condition for `z`, they can simply call `remake` and pass `z => 2.0`. This will rebuild the initialization system accordingly. This approach isn't an option for linearization, because `linearization_function` is meant to return a function that can be called with multiple operating points and run very quickly, on account of being called very often. Effectively, `linearization_function` returns a `LinearizationProblem` for which the intended use case is the equivalent of calling symbolic `remake` many times in a loop. "Many" can be upwards of 6000 times on relatively small problems. A challenge with UX here is that the system the user passes to `linearization_function` is not actually the system that is used for code generation. It goes through `io_preprocessing` to mark some variables as inputs/outputs and hence structurally change the system. This is even worse in e.g. `sensitivity_function`, which adds equations and variables before calling `linearization_function`. As a result, the user has no control and almost no intuition or feedback for which variables are observed and which are unknowns in the system used to build initialization. This is of course bad: the user should be able to tune what they want without being concerned about whether it's an unknown or not, and currently they can't.
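For reference, the `remake` workflow mentioned above looks roughly like this, continuing the observed-variable sketch (exact keyword handling may vary across SciMLBase/MTK versions):

```julia
using SciMLBase: remake

# remake re-runs the symbolic problem construction, so the new value for the
# observed variable z gets hardcoded into a fresh initialization system
# (z ~ 2.0 instead of z ~ 1.0).
oprob2 = remake(oprob; u0 = [z => 2.0])
```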
One possible way to handle this is the same `z0_tmp` approach, but since MTK has control over the object returned from `linearization_function` (instead of lower-level SciML infrastructure, as is the case with `ODEProblem`) it can update the "`LinearizationProblem`" appropriately. The downsides here are:
- The user has to pass an operating point containing all the variables they want to tune to `linearization_function`, so that the appropriate `var ~ var0_tmp` equations are created.
- The object returned from `linearization_function` is stateful in a much more indirect way. If the user passes `z => 2.0` as part of `op` to `linearize(sys, lin_fun; op)`, it will mutate `lin_fun`. The next time `linearize` is called, `z0_tmp` will have the value `2` if the user doesn't pass a new value for `z`. If they called the two `linearize`s in the reverse order, this wouldn't be the case and they'd get a different result out. Moreover, the user doesn't know whether a variable passed to `op` will do this mutation, so they can't schedule their `linearize` calls appropriately either. A sketch of this order dependence follows the list.
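To make the second downside concrete, here is a hypothetical call sequence under the proposed `z0_tmp` approach. This is not current behavior; `sys`, `inputs`, `outputs`, `z`, and `x` are placeholders for some system with an observed variable `z` and an unknown `x`:

```julia
# Hypothetical: pass every variable the user may want to tune as part of the
# operating point so the corresponding var ~ var0_tmp equations are created.
lin_fun, simplified_sys = linearization_function(sys, inputs, outputs;
                                                 op = [z => 1.0, x => 1.0])

# First call: z => 2.0 in op mutates lin_fun, setting z0_tmp = 2.0.
mats1 = linearize(sys, lin_fun; op = Dict(z => 2.0))

# Second call: z is not passed, but z0_tmp is still 2.0 from the call above,
# so the result silently depends on what was linearized before it.
mats2 = linearize(sys, lin_fun; op = Dict(x => 1.5))

# Swapping the order of the two calls would leave z0_tmp at 1.0 for the
# x-only call and produce different matrices for the "same" operating point.
```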
@baggepinnen had phrased this problem nicely, which I'll paraphrase here: SciML follows a `prob[var] = value; solve(prob)` user workflow. Linearization wants a `solve(prob, values)` workflow, which is antithetical to how everything else is designed.
One way to look at this is: if we were to solve this problem for `ODEProblem` (allow changing initial conditions of observed variables without `remake`), how would we do it? What would the user API look like?