Conversation

schmoelder (Contributor) commented Jul 11, 2024

This PR implements a Surrogate class for fitting GPs on existing data from optimizations.

Supersedes #45

To do

Note: there is some WIP in #152 which will also affect this PR. Should we rebase onto that branch already so that we can adapt the corresponding interfaces, at the risk of some friction should there be more changes upstream?

  • Fit GP on existing population
  • Provide methods to evaluate objective functions, constraint functions, etc.
  • Validate surrogate model accuracy with (simple) test cases
  • Test if optimizers converge to the true solution when using the surrogate

Open questions

Alternative surrogate modeling approaches

Currently, only GPs are implemented. However, other surrogate models can be envisioned (e.g. ANNs). To allow for a more modular architecture, we could derive them from a common SurrogateBase class; see the sketch below.
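A minimal sketch of what that could look like (class and method names here are illustrative, not the PR's actual API):

```python
from abc import ABC, abstractmethod

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor


class SurrogateBase(ABC):
    """Common interface for all surrogate models."""

    @abstractmethod
    def fit(self, X: np.ndarray, F: np.ndarray) -> None:
        """Fit the surrogate to evaluated points X with objective values F."""

    @abstractmethod
    def estimate_objectives(self, X: np.ndarray) -> np.ndarray:
        """Return the surrogate estimate f*(x) for each row of X."""


class GaussianProcessSurrogate(SurrogateBase):
    """GP-backed implementation; an ANN variant would subclass the same base."""

    def __init__(self, **gp_kwargs):
        self.gpr = GaussianProcessRegressor(**gp_kwargs)

    def fit(self, X, F):
        self.gpr.fit(X, F)

    def estimate_objectives(self, X):
        return self.gpr.predict(X)
```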

Follow-up projects

Once this is merged, we can also start working on other features that would improve or apply the surrogate models. Eventually, these should be moved to their own issues / PRs, but for now, this is just a collection of ideas.

The interface

Currently, the SurrogateModel class somewhat mimics an OptimizationProblem, as it also provides methods for estimating objectives, nonlinear constraints, etc. To demonstrate this, compare the OptimizationProblem

```mermaid
sequenceDiagram
    User->>+OptimizationProblem: evaluate_objectives(x)
    OptimizationProblem->>+User: f(x)
```

with the SurrogateModel

```mermaid
sequenceDiagram
    User->>+SurrogateModel: estimate_objectives(x)
    SurrogateModel->>+User: f*(x)
```

However, it is important to note that the SurrogateModel will never provide all of the OptimizationProblem's functionality, such as specifying variables, constraints, etc. Hence, it cannot be used directly as an OptimizationProblem, e.g. to interface with an Optimizer.

To me, this means we should rethink the architecture and consider what exactly the SurrogateModel replaces. In the context of an OptimizationProblem, I would say it actually replaces the evaluation toolchain (the part that maps x to f/g/m/...).

Consequently, we should consider moving the evaluation toolchain from the OptimizationProblem to its own module (which in the process would also make the OptimizationProblem less of a "god class" and would even allow reusing the toolchain in other places) and introduce an EvaluationInterface.

The architecture would then look something like the following:

```mermaid
sequenceDiagram
    User->>+OptimizationProblem: evaluate_objectives(x)
    OptimizationProblem->>+EvaluationInterface: evaluate(x)
    EvaluationInterface->>+OptimizationProblem: f(x)
    OptimizationProblem->>+User: f(x)
```

The EvaluationInterface would then be implemented by both the EvaluationPipeline (i.e. the current "toolchain") and the SurrogateModel:

```mermaid
classDiagram
    class OptimizationProblem {
        evaluate_objectives(np.ndarray): np.ndarray
    }
    OptimizationProblem "1" *-- "1" EvaluationInterface

    class EvaluationInterface {
        <<interface>>
        +evaluate(np.ndarray): np.ndarray
    }

    EvaluationInterface <|.. SurrogateModel
    EvaluationInterface <|.. EvaluationPipeline
```
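In code, the shared interface could look roughly like this (all names are placeholders; the real toolchain internals are of course more involved):

```python
from abc import ABC, abstractmethod

import numpy as np


class EvaluationInterface(ABC):
    """What the OptimizationProblem delegates evaluation to."""

    @abstractmethod
    def evaluate(self, x: np.ndarray) -> np.ndarray:
        """Map parameters x to objective values."""


class EvaluationPipeline(EvaluationInterface):
    """The current 'toolchain': chains the (expensive) evaluators."""

    def __init__(self, evaluators):
        self.evaluators = evaluators

    def evaluate(self, x):
        result = x
        for evaluator in self.evaluators:
            result = evaluator(result)
        return result


class SurrogateModel(EvaluationInterface):
    """Drop-in replacement that returns the cheap GP estimate f*(x)."""

    def __init__(self, gpr):
        self.gpr = gpr

    def evaluate(self, x):
        return self.gpr.predict(np.atleast_2d(x))
```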

Conditioned optimization problems

One of the original ideas for this project came from optimization problems where we want to fix the value of one of the variables and then run the optimization to find the best point given this value.

For this purpose, we should implement a ConditionedOptimizationProblem which wraps the original OptimizationProblem and provides an interface where the fixed variables are removed. While this is trivial to implement for bound-constrained problems (see the sketch below), it becomes potentially more complicated for problems with linear and nonlinear constraints.
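A minimal sketch of the bound-constrained case (attribute names like n_variables and bounds are assumptions about the wrapped problem):

```python
import numpy as np


class ConditionedOptimizationProblem:
    """Wraps an OptimizationProblem with some variables fixed."""

    def __init__(self, problem, fixed):
        self.problem = problem
        self.fixed = fixed  # {variable index: fixed value}
        self.free = [i for i in range(problem.n_variables) if i not in fixed]

    @property
    def bounds(self):
        # Simply drop the bounds of the fixed variables.
        lb, ub = self.problem.bounds
        return np.asarray(lb)[self.free], np.asarray(ub)[self.free]

    def _expand(self, x_free):
        # Reinsert the fixed values to obtain a point in the full space.
        x = np.empty(self.problem.n_variables)
        x[self.free] = x_free
        for i, value in self.fixed.items():
            x[i] = value
        return x

    def evaluate_objectives(self, x_free):
        return self.problem.evaluate_objectives(self._expand(x_free))
```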

Plots

For process design, we are often not really interested in just the optimal point but in the general topology of the parameter space; e.g., we are interested in the contours of regions with a given purity. Finely sampling the parameter space with the full model would be very expensive, so the idea is to use a surrogate model for this purpose, as sketched below. See also partial dependence plots and #33.
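For instance (a sketch; `surrogate`, the bounds, and the purity threshold are illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np

# Dense grid over two variables; cheap because only the surrogate is called.
x0 = np.linspace(0.0, 1.0, 200)
x1 = np.linspace(0.0, 1.0, 200)
X0, X1 = np.meshgrid(x0, x1)
X_grid = np.column_stack([X0.ravel(), X1.ravel()])

purity = surrogate.estimate_objectives(X_grid).reshape(X0.shape)

# Contour of the region meeting a purity requirement (threshold is an example).
plt.contour(X0, X1, purity, levels=[0.95])
plt.xlabel("variable 0")
plt.ylabel("variable 1")
plt.show()
```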

schmoelder (Contributor, Author) commented:
Notes from Call with @maxsiska

Normalization / StandardScaler

  • Check if normalization allows for negative values
  • Check if log-normalization is possible (note, this might become an issue for linear constraints)
  • Check if inverse_transform (when estimating eval_functions) works independently of the Scaler used when return_cov is True (see the sketch below)

```python
from sklearn.preprocessing import StandardScaler
from sklearn.gaussian_process import GaussianProcessRegressor

# Scale inputs and outputs before fitting the GP
X_scaler = StandardScaler().fit(X)
Y_scaler = StandardScaler().fit(Y)

gpr = GaussianProcessRegressor()
```
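Regarding the last checklist item: with a StandardScaler on Y, the predicted mean can go through inverse_transform directly, but the standard deviation (or covariance) has to be rescaled manually, because the scaler's mean shift cancels out. A sketch, assuming a single output column and a hypothetical X_new:

```python
gpr.fit(X_scaler.transform(X), Y_scaler.transform(Y))

Y_mean_scaled, Y_std_scaled = gpr.predict(
    X_scaler.transform(X_new), return_std=True
)

# The mean maps back via inverse_transform ...
Y_mean = Y_scaler.inverse_transform(Y_mean_scaled.reshape(-1, 1))
# ... while the std only needs the scale factor; for return_cov=True,
# the covariance would scale with Y_scaler.scale_ ** 2 instead.
Y_std = Y_std_scaled * Y_scaler.scale_
```
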
schmoelder (Contributor, Author) commented Jul 15, 2024:

Consider allowing specification of hyperparameters via kwargs (with reasonable defaults); see the sketch after the lists below.

See also: https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html

Important parameters:

  • kernel: default is ConstantKernel(1.0, constant_value_bounds="fixed") * RBF(1.0, length_scale_bounds="fixed") (consider moving to a separate method, or exposing their parameters in the signature)
  • alpha: Parameter to handle noisy data, default=1e-10
  • optimizer: Custom optimizer for the kernel’s parameters (could improve performance, e.g. GA)
  • normalize_y (do we need to inverse transform when evaluating?)

Other suggestions:

  • Consider checking whether the fit was "OK" (could also be part of some validation method)
  • Implement heuristic for length scale based on bounds (e.g. length_scale = 0.5 * (ub - lb))
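A sketch of how such a constructor could look, combining the kwargs idea with the length-scale heuristic (class name, signature, and the bounds arguments are assumptions):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel


class Surrogate:
    def __init__(self, lower_bounds, upper_bounds, kernel=None, alpha=1e-10, **gp_kwargs):
        if kernel is None:
            # Heuristic: anisotropic length scale derived from the bounds.
            lb = np.asarray(lower_bounds)
            ub = np.asarray(upper_bounds)
            kernel = ConstantKernel(1.0) * RBF(length_scale=0.5 * (ub - lb))

        self.gpr = GaussianProcessRegressor(kernel=kernel, alpha=alpha, **gp_kwargs)
```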

```python
) -> Any:

X = np.array(X)
X_2d = np.array(X, ndmin=2)
```
schmoelder (Contributor, Author) commented:

Technically, this does not "ensure 2D" (ndmin=2 only enforces a minimum of two dimensions). Consequently, we should instead reshape to really ensure 2D; see the sketch below.

Also, consider moving the function out of the class, since it duplicates what is done in the OptimizationProblem class.
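A possible replacement (a sketch; n_variables stands for the number of optimization variables and is an assumption here):

```python
import numpy as np

# Reshape to (n_samples, n_variables); unlike ndmin=2, this also fails
# loudly if the input cannot be interpreted as rows of n_variables entries.
X_2d = np.asarray(X).reshape(-1, n_variables)
```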

```python
X_test = surrogate.population.feasible.x[0:2]

F_test = surrogate.population.feasible.f[0:2]
F_est = surrogate.estimate_objectives(X_test)
```
schmoelder (Contributor, Author) commented:

Here, we need to use a separate validation set to test whether the surrogate predicts well enough.
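For instance, hold out part of the feasible population and compare the estimates against the true objectives there (a sketch, assuming the surrogate was fit on the remaining individuals; the split is arbitrary):

```python
import numpy as np

# Every 5th feasible individual serves as validation data (illustrative).
X_val = surrogate.population.feasible.x[::5]
F_val = surrogate.population.feasible.f[::5]

F_est = surrogate.estimate_objectives(X_val)

# Simple accuracy metric: mean absolute error of the estimates.
mae = np.mean(np.abs(F_est - F_val))
```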

ronald-jaepel and others added 26 commits March 26, 2025 08:51, including:

  • This plot could potentially replace the corner plot.
  • With this commit, the behavior for deriving alternative solution objects also changes. Instead of modifying the original array, the following methods / properties now return a new SolutionIO object: resample(), normalize(), (anti)derivative, smooth_data()
  • Do not round hopsy problem when computing chebyshev center (Co-authored-by: r.jaepel <[email protected]>)