Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🚀 Add NonLinearProgram Support to DiffOpt.jl #260

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

andrewrosemberg
Copy link

@andrewrosemberg andrewrosemberg commented Dec 6, 2024

This PR introduces a new module, NonLinearProgram, to extend DiffOpt.jl's functionality for differentiating nonlinear optimization problems (NLPs). The implementation integrates with JuMP-based nonlinear models and supports advanced derivative computation through custom evaluator and differentiation logic.

🆕 Features

  • Nonlinear Model Differentiation:

    • Forward and reverse differentiation for user-specified primal variables (focus_vars) and dual variables (focus_duals) with respect to a given set of parameters.
    • Extensible API for handling NLP-specific sensitivities using custom utilities.
  • Core Structures:

    • Cache: Stores primal variables, parameters, evaluator, and constraints for efficient reuse.
    • ForwCache: Holds results of forward differentiation, including sensitivities for specified variables.
    • ReverseCache: Holds results of reverse differentiation (implemented in this PR).
  • Integration with DiffOpt API:

    • Implements DiffOpt.AbstractModel for seamless compatibility with DiffOpt's API.
    • Provides forward_differentiate! and reverse_differentiate! functions for NLPs.

🔧 How It Works

  1. Custom Sensitivity Calculations:

    • Utilizes implicit function differentiation (similar to sIPOPT) to compute derivatives of the optimization problem with respect to selected parameters.
    • Allows differentiation of only specific variables and duals, offering a fine-grained approach compared to existing DiffOpt modules.
  2. Forward Differentiation:

    • Computes Jacobian products for selected primal and dual variables using the provided parameters.
    • Results are cached for efficient access through the DiffOpt API.
  3. Reverse Differentiation:

    • Computes transpose Jacobian products for selected primal and dual variables with respect to the parameters.
    • Results are cached in ReverseCache.

📜 Implementation Highlights

  • Forward Differentiation:

    DiffOpt.forward_differentiate!(model)
    • Calculates sensitivity of variables and duals w.r.t.params`.
    • Results are cached in ForwCache.
  • Reverse Differentiation:

    DiffOpt.reverse_differentiate!(model)
    • Computes transpose Jacobian products for parameters with respect to the primal and dual variables.
    • Results are cached in ReverseCache.
  • Custom Utilities:

    • Leverages create_evaluator, compute_sensitivity, and other utilities from nlp_utilities.jl for efficient derivative computation.

📋 Example Usage

Forward Differentiation

using DiffOpt
using JuMP
using Ipopt

# Define a JuMP model
model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p  MOI.Parameter(0.1))
@variable(model, x >= p)
@variable(model, y >= 0)
@objective(model, Min, x^2 + y^2)
@constraint(model, con, x + y >= 1)

# Solve
JuMP.optimize!(model)

# set parameter pertubations
MOI.set(model, DiffOpt.ForwardParameter(), params[1], 0.2)

# forward differentiate
DiffOpt.forward_differentiate!(model)

# Retrieve sensitivities
dx = MOI.get(model, DiffOpt.ForwardVariablePrimal(), x)
dy = MOI.get(model, DiffOpt.ForwardVariablePrimal(), y)

Reverse Differentiation

# Set Primal Pertubations
MOI.set(model, DiffOpt.ReverseVariablePrimal(), x, 1.0)

# Reverse differentiation
DiffOpt.reverse_differentiate!(model)

# Retrieve reverse sensitivities (example usage)
dp= MOI.get(model, DiffOpt.ReverseParameter(), p)

🚧 TODO

  • Forward Differentiation: Implemented and validated for custom primal and dual variables.
  • Reverse Differentiation: Implemented to compute transpose Jacobian products.
  • Tests: Add comprehensive test cases for forward and reverse differentiation.
  • Documentation: Extend user documentation with more examples and edge case scenarios.

🛠 Future Work

  • Allow different methods to factorize the "M" matrix and solve the linear system of equations needed.

@andrewrosemberg andrewrosemberg changed the title [WIP] 🚀 Add NonLinearProgram Support to DiffOpt.jl 🚀 Add NonLinearProgram Support to DiffOpt.jl Dec 23, 2024
@joaquimg joaquimg mentioned this pull request Dec 27, 2024
4 tasks
Copy link

@frapac frapac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good job @andrewrosemberg ! I appreciate the numerous unit-tests you have shipped with this PR.

I will try to run your code locally on small NLP instances, to assess how fast is the current implementation (my main concern is the total number of allocations). I will try also to test your code on parameterized OPF instances, to assess how far we can get in term of size.


################################################
#=
From sIpopt paper: https://optimization-online.org/2011/04/3008/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice!

Filling the off-diagonal elements of a sparse matrix to make it symmetric.
"""
function fill_off_diagonal(H)
ret = H + H'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we assume H is lower-triangular?


Filling the off-diagonal elements of a sparse matrix to make it symmetric.
"""
function fill_off_diagonal(H)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe constrain the input type? H::SparseMatrixCSC

)
sense_multiplier = sense_mult(model)
evaluator = model.cache.evaluator
y = [model.y[model.model.nlp_index_2_constraint[row].value] for row in rows]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a proper getter to access model.model.nlp_index_2_constraint?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of. @joaquimg , do you ?

src/NonLinearProgram/nlp_utilities.jl Outdated Show resolved Hide resolved
# Partial derivative of the equality constraintswith wrt parameters
∇ₚC = jacobian[:, params_idx]

# M matrix
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is appreciated!

# [V_U 0 0 0 (X_U - X)]
# ]
len_w = num_vars + num_ineq
M = spzeros(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be a cleaner way to build M and N, with less allocations. Let me try to build a MWE, we can assess later if we should address this in this PR or later.

"""
inertia_corrector_factorization(M::SparseMatrixCSC, num_w, num_cons; st=1e-6, max_corrections=50)

Inertia correction for the factorization of the KKT matrix. Sparse version.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The theoretical question is: do we need to run inertia correction at the optimum? It could be necessary if SOSC is not satisfied at the solution indeed, but in that case I am not sure the sIpopt's formula remains valid.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right, but couldn't there be numerical errors?

Moreover, I believe the rest of DiffOpt just assumes that the necessary conditions hold (even if they don't). The thing is that their method doesn't fail when it doesn't hold, while sIpopt method appears to throw singular matrix error when either the conditions don't hold or when we have numerical errors. Therefore, I tried to make it consistent.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should log a warning if inertia correction is necessary (apologies if you're already doing this) or add an option allow_intertia_correction that defaults to true?

src/NonLinearProgram/nlp_utilities.jl Outdated Show resolved Hide resolved
src/NonLinearProgram/nlp_utilities.jl Outdated Show resolved Hide resolved
@frapac
Copy link

frapac commented Jan 8, 2025

@andrewrosemberg Playing with your branch right now, I must say I like DiffOpt's interface!

I did a dummy mistake when solving the problem HS15 and retrieving the sensitivity. A MWE is:

model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p[1:2]  MOI.Parameter.([100.0, 1.0]))
@variable(model, x[1:2])
set_upper_bound.(x[1], 0.5)
@objective(model, Min, p[1] * (x[2] - x[1]^2)^2 +  (p[2] - x[1])^2)
@constraint(model, x[1] * x[2] >= 1.0)
@constraint(model, x[1] + x[2]^2 >= 0.0)

optimize!(model)
# set parameter pertubations
MOI.set(model, DiffOpt.ForwardParameter(), p[1], 1.0)

# # forward differentiate
DiffOpt.forward_differentiate!(model)

Δx = [
    MOI.get(model, DiffOpt.ForwardVariablePrimal(), var) for
    var in x
]

I forgot to specify the sensitivity for p[2], resulting in an error. I am wondering what would be the expected behavior here:

  • by default, should we set the sensitivity to 0.0 ?
  • or should we raise a proper error message if the user has forgotten to specify some sensitivities?

@frapac
Copy link

frapac commented Jan 8, 2025

Also, if I query the solution after calling forward_differentiate I get an error. Is this expected?

The MWE is:

model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p[1:2]  MOI.Parameter.([100.0, 1.0]))
@variable(model, x[1:2])
set_upper_bound.(x[1], 0.5)
@objective(model, Min, p[1] * (x[2] - x[1]^2)^2 +  (p[2] - x[1])^2)
@constraint(model, x[1] * x[2] >= 1.0)
@constraint(model, x[1] + x[2]^2 >= 0.0)
optimize!(model)
# set parameter pertubations
MOI.set(model, DiffOpt.ForwardParameter(), p[1], 1.0)
MOI.set(model, DiffOpt.ForwardParameter(), p[2], 1.0)

# # forward differentiate
DiffOpt.forward_differentiate!(model)

JuMP.value.(x)

Output:

┌ Warning: The model has been modified since the last call to `optimize!` (or `optimize!` has not been called yet). If you are iteratively querying solution information and modifying a model, query all the results first, then modify the model.
└ @ JuMP ~/.julia/packages/JuMP/CU7H5/src/optimizer_interface.jl:1085
ERROR: LoadError: OptimizeNotCalled()

@andrewrosemberg
Copy link
Author

I forgot to specify the sensitivity for p[2], resulting in an error. I am wondering what would be the expected behavior here:

  • by default, should we set the sensitivity to 0.0 ?
  • or should we raise a proper error message if the user has forgotten to specify some sensitivities?

@frapac good question! I am fine either way. @joaquimg what would be the consistent approach given the rest of DIffOpt?

@andrewrosemberg
Copy link
Author

Also, if I query the solution after calling forward_differentiate I get an error. Is this expected?

I don't think this is the desired outcome haha Not sure how to ask MOI to ignore the forward attributes, but I will look into it. @joaquimg do you have an idea on how to avoid this?

@frapac
Copy link

frapac commented Jan 9, 2025

@andrewrosemberg Another issue I noted: if we are using non-standard indexing in JuMP (DenseAxisArray instead of Vector{VariableRef}) I got an error when retrieving the sensitivity

ERROR: LoadError: MethodError: no method matching similar(::Type{Vector{Float64}}, ::Tuple{UnitRange{Int64}})
The function `similar` exists, but no method is defined for this combination of argument types.

A MWE is:

model = Model(() -> DiffOpt.diff_optimizer(Ipopt.Optimizer))
@variable(model, p[1:2]  MOI.Parameter.([100.0, 1.0]))
# N.B: use non-standard indexing
@variable(model, x[0:2])
set_upper_bound.(x[1], 0.5)
@objective(model, Min, p[1] * (x[2] - x[1]^2)^2 +  (p[2] - x[1])^2)
@constraint(model, x[1] * x[2] >= 1.0)
@constraint(model, x[1] + x[2]^2 >= 0.0)
optimize!(model)
# set parameter pertubations
MOI.set(model, DiffOpt.ForwardParameter(), p[1], 1.0)
MOI.set(model, DiffOpt.ForwardParameter(), p[2], 1.0)

# # forward differentiate
DiffOpt.forward_differentiate!(model)

Δx = [
    MOI.get(model, DiffOpt.ForwardVariablePrimal(), var) for
    var in x
]

@frapac
Copy link

frapac commented Jan 9, 2025

Pushing it one step further, I have tried to differentiate an ACOPF instance using DiffOpt. I re-used the code in rosetta-opf, and ported it to DiffOpt. You can find a gist here.

Two observations:

  • on case9, DiffOpt is using inertia correction. This is not expected, as I am sure we satisfy all the good constraints qualifications in case9. I am not sure to understand why UMFPACK fails to factorize the KKT system.
  • on case1354pegase, I get an out-of-memory error. This is caused by this line, which allocates a dense matrix. I would suggest replacing diagm by SparseArrays.spdiagm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants