While for simple numerical ODEs choosing an initial condition can be an easy affair, with ModelingToolkit's more general differential-algebraic equation (DAE) system there is more care needed due to the flexibility of the solver state. In this tutorial we will walk through the functionality involved in initialization of ODESystem and the diagnostics to better understand and debug the initialization problem.
Before getting started, let's do a brief walkthrough of the mathematical principles of initialization of DAE systems. Take a DAE written in semi-explicit form:
where x
are the differential variables and y
are the algebraic variables.
An initial condition u0 = [x(t_0) y(t_0)]
is said to be consistent if
g(x(t_0),y(t_0),t_0) = 0
.
For ODEs, this is trivially satisfied. However, for more complicated systems it may not be easy to know how to choose the variables such that all of the conditions are satisfied. This is even more complicated when taking into account ModelingToolkit's simplification engine, given that variables can be eliminated and equations can be changed. If this happens, how do you know how to initialize the system?
To illustrate how to perform the initialization, let's take a look at the Cartesian pendulum:
using ModelingToolkit, OrdinaryDiffEq, Plots
using ModelingToolkit: t_nounits as t, D_nounits as D
@parameters g
@variables x(t) y(t) [state_priority = 10] λ(t)
eqs = [D(D(x)) ~ λ * x
D(D(y)) ~ λ * y - g
x^2 + y^2 ~ 1]
@mtkbuild pend = ODESystem(eqs, t)
While we defined the system using second derivatives and a length constraint,
the structural simplification system improved the numerics of the system to
be solvable using the dummy derivative technique, which results in 3 algebraic
equations and 2 differential equations. In this case, the differential equations
with respect to y
and D(y)
, though it could have just as easily have been
x
and D(x)
. How do you initialize such a system if you don't know in advance
what variables may defined the equation's state?
To see how the system works, let's start the pendulum in the far right position,
i.e. x(0) = 1
and y(0) = 0
. We can do this by:
prob = ODEProblem(pend, [x => 1, y => 0], (0.0, 1.5), [g => 1], guesses = [λ => 1])
This solves via:
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
and we can check it satisfies our conditions via:
conditions = getfield.(equations(pend)[3:end], :rhs)
[sol[conditions][1]; sol[x][1] - 1; sol[y][1]]
Notice that we set [x => 1, y => 0]
as our initial conditions and [λ => 1]
as our guess.
The difference is that the initial conditions are required to be satisfied, while the
guesses are simply a guess for what the initial value might be. Every variable must have
either an initial condition or a guess, and thus since we did not know what λ
would be
we set it to 1 and let the initialization scheme find the correct value for λ. Indeed,
the value for λ
at the initial time is not 1:
sol[λ][1]
We can similarly choose λ = 0
and solve for y
to start the system:
prob = ODEProblem(pend, [x => 1, λ => 0], (0.0, 1.5), [g => 1], guesses = [y => 1])
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
or choose to satisfy derivative conditions:
prob = ODEProblem(
pend, [x => 1, D(y) => 0], (0.0, 1.5), [g => 1], guesses = [λ => 0, y => 1])
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Notice that since a derivative condition is given, we are required to give a
guess for y
.
We can also directly give equations to be satisfied at the initial point by using
the initialization_eqs
keyword argument, for example:
prob = ODEProblem(pend, [x => 1], (0.0, 1.5), [g => 1], guesses = [λ => 0, y => 1],
initialization_eqs = [y ~ 1])
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Additionally, note that the initial conditions are allowed to be functions of other variables and parameters:
prob = ODEProblem(
pend, [x => 1, D(y) => g], (0.0, 3.0), [g => 1], guesses = [λ => 0, y => 1])
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
For this system we have 3 conditions to satisfy:
conditions = getfield.(equations(pend)[3:end], :rhs)
when we initialize with
prob = ODEProblem(pend, [x => 1, y => 0], (0.0, 1.5), [g => 1], guesses = [y => 0, λ => 1])
we have two extra conditions to satisfy, x ~ 1
and y ~ 0
at the initial point. That gives
5 equations for 5 variables and thus the system is well-formed. What happens if that's not the
case?
prob = ODEProblem(pend, [x => 1], (0.0, 1.5), [g => 1], guesses = [y => 0, λ => 1])
Here we have 4 equations for 5 unknowns (note: the warning is post-simplification of the
nonlinear system, which solves the trivial x ~ 1
equation analytical and thus says
3 equations for 4 unknowns). This warning thus lets you know the system is underdetermined
and thus the solution is not necessarily unique. It can still be solved:
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
and the found initial condition satisfies all constraints which were given. In the opposite direction, we may have an overdetermined system:
prob = ODEProblem(
pend, [x => 1, y => 0.0, D(y) => 0], (0.0, 1.5), [g => 1], guesses = [λ => 1])
Can that be solved?
sol = solve(prob, Rodas5P())
plot(sol, idxs = (x, y))
Indeed since we saw D(y) = 0
at the initial point above, it turns out that this solution
is solvable with the chosen initial conditions. However, for overdetermined systems we often
aren't that lucky. If the set of initial conditions cannot be satisfied, then you will get
a SciMLBase.ReturnCode.InitialFailure
:
prob = ODEProblem(
pend, [x => 1, y => 0.0, D(y) => 2.0, λ => 1], (0.0, 1.5), [g => 1], guesses = [λ => 1])
sol = solve(prob, Rodas5P())
What this means is that the initial condition finder failed to find an initial condition.
While this can be sometimes due to numerical error (which is then helped by picking guesses closer
to the correct value), most circumstances of this come from ill-formed models. Especially
if your system is overdetermined and you receive an InitialFailure, the initial conditions
may not be analytically satisfiable!. In our case here, if you sit down with a pen and paper
long enough you will see that λ = 0
is required for this equation, but since we chose
λ = 1
we end up with a set of equations that are impossible to satisfy.
!!! note
If you would prefer to have an error instead of a warning in the context of non-fully
determined systems, pass the keyword argument `fully_determined = true` into the
problem constructor. Additionally, any warning about not being fully determined can
be suppressed via passing `warn_initialize_determined = false`.
To get a better sense of the initialization system and to help debug it, you can construct the initialization system directly. The initialization system is a NonlinearSystem which requires the system-level information and the additional nonlinear equations being tagged to the system.
isys = generate_initializesystem(pend, u0map = [x => 1.0, y => 0.0], guesses = [λ => 1])
We can inspect what its equations and unknown values are:
equations(isys)
unknowns(isys)
Notice that all initial conditions are treated as initial equations. Additionally, for systems with observables, those observables are too treated as initial equations. We can see the resulting simplified system via the command:
isys = structural_simplify(isys; fully_determined = false)
Note fully_determined=false
allows for the simplification to occur when the number of equations
does not match the number of unknowns, which we can use to investigate our overdetermined system:
isys = ModelingToolkit.generate_initializesystem(
pend, u0map = [x => 1, y => 0.0, D(y) => 2.0, λ => 1], guesses = [λ => 1])
isys = structural_simplify(isys; fully_determined = false)
equations(isys)
unknowns(isys)
observed(isys)
After simplification we see that we have 5 equatinos to solve with 3 variables, and the system that is given is not solvable.
To inspect the numerics of the initialization problem, we can use the InitializationProblem
constructor which acts just like an ODEProblem
or NonlinearProblem
constructor, but
creates the special initialization system for a given sys
. This is done as follows:
iprob = ModelingToolkit.InitializationProblem(pend, 0.0,
[x => 1, y => 0.0, D(y) => 2.0, λ => 1], [g => 1], guesses = [λ => 1])
We can see that because the system is overdetermined we receive a NonlinearLeastSquaresProblem, solvable by NonlinearSolve.jl. Using NonlinearSolve we can recreate the initialization solve directly:
using NonlinearSolve
sol = solve(iprob)
!!! note
For more information on solving NonlinearProblems and NonlinearLeastSquaresProblems,
check out the [NonlinearSolve.jl tutorials!](https://docs.sciml.ai/NonlinearSolve/stable/tutorials/getting_started/).
We can see that the default solver stalls
sol.stats
after doing many iterations, showing that it tried to compute but could not find a valid solution. Trying other solvers:
sol = solve(iprob, GaussNewton())
gives the same issue, indicating that the chosen initialization system is unsatisfiable. We can check the residuals:
sol.resid
to see the problem is not equation 2 but other equations in the system. Meanwhile, changing some of the conditions:
iprob = ModelingToolkit.InitializationProblem(pend, 0.0,
[x => 1, y => 0.0, D(y) => 0.0, λ => 0], [g => 1], guesses = [λ => 1])
gives a NonlinearLeastSquaresProblem which can be solved:
sol = solve(iprob)
sol.resid
In comparison, if we have a well-conditioned system:
iprob = ModelingToolkit.InitializationProblem(pend, 0.0,
[x => 1, y => 0.0], [g => 1], guesses = [λ => 1])
notice that we instead obtained a NonlinearSystem. In this case we have to use different solvers which can take advantage of the fact that the Jacobian is square.
sol = solve(iprob)
sol = solve(iprob, TrustRegion())
Let's take a Lotka-Volterra system:
@variables x(t) y(t) z(t)
@parameters α=1.5 β=1.0 γ=3.0 δ=1.0
eqs = [D(x) ~ α * x - β * x * y
D(y) ~ -γ * y + δ * x * y
z ~ x + y]
@named sys = ODESystem(eqs, t)
simpsys = structural_simplify(sys)
tspan = (0.0, 10.0)
Using the derivative initializations, we can set the ODE to start at the steady state by initializing the derivatives to zero:
prob = ODEProblem(simpsys, [D(x) => 0.0, D(y) => 0.0], tspan, guesses = [x => 1, y => 1])
sol = solve(prob, Tsit5(), abstol = 1e-16)
Notice that this is a "numerical zero", not an exact zero, and thus the solution will leave the steady state in this instance because it's an unstable steady state.
Additionally, notice that in this setup we have an observable z ~ x + y
. If we instead know the
initial condition for the observable we can use that directly:
prob = ODEProblem(simpsys, [D(x) => 0.0, z => 2.0], tspan, guesses = [x => 1, y => 1])
sol = solve(prob, Tsit5())
We can check that indeed the solution does satisfy that D(x) = 0 at the start:
sol[α * x - β * x * y]
plot(sol)