Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sparse Jacobians for GasKinetics #1089

Merged
merged 39 commits into from
Jan 14, 2022
Merged

Conversation

ischoegl
Copy link
Member

@ischoegl ischoegl commented Aug 25, 2021

Changes proposed in this pull request

  • Add sparse Jacobians for GasKinetics
  • Implement mechanism to adjust behavior of Jacobian evaluation via gas.jacobian_settings (Python API)
  • Allow for disabling of terms to obtain approximate results
  • Cover changes with extensive test suite
  • Sparse output to Python is supported via ct.use_sparse() (requires Scipy)

The proposed enhancement is fully functional for Jacobians at constant pressure; some caveats remain:

  • As the transition from old to new ReactionRate is not complete (Falloff reactions remain to be ported), some band-aid implementations were necessary, which mostly affect temperature derivatives and a missing derivative term related to reaction rates depending on third-body concentrations.addressed
  • While the calculation of individual terms is optimized for speed, the addition of multiple sparse Jacobian terms is slow. A solution would involve the creation of internal buffers. As Jacobians are expected to be used elsewhere, I am leaving this up for discussion.
  • Derivatives with respect to pressure - while relatively straightforward - are currently not implemented.addressed

While I am leaving the PR in draft status, - (edit: work is no longer draft) - I am explicitly welcoming feedback at this moment (@speth, @kyleniemeyer, @bryanwweber , @anthony-walker, @DavidAkinpelu, among others). It may make sense to discuss some more generic issues in Cantera/enhancements#100.

If applicable, fill in the issue number this pull request is fixing

Closes #Cantera/enhancements#19

Supersedes #1081; incorporates changes proposed in #1088.

If applicable, provide an example illustrating new features this pull request is introducing

While calculations are handled by <Eigen/Sparse> internally, results are exposed to the Python API as:

>>> gas.net_rates_of_progress_ddX # similar for forward/reverse
>>> gas.net_rates_of_progress_ddT # similar for forward/reverse
>>> gas.net_rates_of_progress_ddP # similar for forward/reverse
>>> gas.net_rates_of_progress_ddC # similar for forward/reverse
>>> gas.net_production_rate_ddX # similar for creation/destruction
>>> gas.net_production_rate_ddT # similar for creation/destruction
>>> gas.net_production_rate_ddP # similar for creation/destruction
>>> gas.net_production_rate_ddC # similar for creation/destruction

In order to avoid a lengthy interface, derivative evaluation depends on 'settings'.

>>> gas.derivative_settings
{'skip-third-bodies': False, 'skip-falloff': False, 'rtol-delta': 1e-08}

All derivatives are checked against numerical implementations in unit tests.
Checklist

  • The pull request includes a clear description of this code change
  • Commit messages have short titles and reference relevant issues
  • Build passes (scons build & scons test) and unit tests address code coverage
  • Style & formatting of contributed code follows contributing guidelines
  • The pull request is ready for review

Additional items:

  • The PR is ready for testing
  • The PR is feature complete / ready to be merged

@codecov
Copy link

codecov bot commented Aug 25, 2021

Codecov Report

Merging #1089 (1ecf4ed) into main (591c28e) will increase coverage by 0.19%.
The diff coverage is 80.32%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1089      +/-   ##
==========================================
+ Coverage   65.15%   65.35%   +0.19%     
==========================================
  Files         315      315              
  Lines       45389    45995     +606     
  Branches    19279    19531     +252     
==========================================
+ Hits        29574    30060     +486     
- Misses      13347    13443      +96     
- Partials     2468     2492      +24     
Impacted Files Coverage Δ
include/cantera/base/global.h 84.21% <ø> (ø)
include/cantera/kinetics/Arrhenius.h 100.00% <ø> (+2.22%) ⬆️
include/cantera/kinetics/GasKinetics.h 100.00% <ø> (ø)
include/cantera/kinetics/Kinetics.h 38.79% <0.00%> (-16.09%) ⬇️
include/cantera/kinetics/MultiRateBase.h 100.00% <ø> (ø)
include/cantera/kinetics/ReactionRate.h 86.48% <ø> (ø)
include/cantera/thermo/IdealMolalSoln.h 50.00% <0.00%> (-50.00%) ⬇️
include/cantera/thermo/IdealSolidSolnPhase.h 78.94% <0.00%> (-9.29%) ⬇️
include/cantera/thermo/IdealSolnGasVPSS.h 50.00% <0.00%> (-50.00%) ⬇️
include/cantera/thermo/ThermoPhase.h 30.57% <0.00%> (-0.40%) ⬇️
... and 17 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 591c28e...1ecf4ed. Read the comment docs.

@ischoegl

This comment has been minimized.

@ischoegl ischoegl marked this pull request as ready for review December 25, 2021 18:34
@ischoegl ischoegl requested a review from speth December 25, 2021 18:36
@ischoegl ischoegl force-pushed the sparse-jacobians branch 2 times, most recently from 376faa5 to f2d5fdc Compare December 26, 2021 18:30
@ischoegl
Copy link
Member Author

ischoegl commented Dec 26, 2021

Adding a sample for speed tests (samples/cxx/jacobian/jacobian_speed.cpp):

$ ./jacobian_speed 
Benchmark tests for jacobian evaluations.

getFwdRatesOfProgress:  10.701 μs ± 0.133 μs per loop (7 runs, 10000 loops each)
fwdRatesOfProgress_ddT: 14.408 μs ± 0.119 μs per loop (7 runs, 10000 loops each)
fwdRatesOfProgress_ddP: 11.15 μs ± 0.047 μs per loop (7 runs, 10000 loops each)
fwdRatesOfProgress_ddC: 16.042 μs ± 1.2 μs per loop (7 runs, 10000 loops each)
fwdRatesOfProgress_ddX: 40.282 μs ± 0.882 μs per loop (7 runs, 1000 loops each)

getRevRatesOfProgress:  10.796 μs ± 0.0469 μs per loop (7 runs, 10000 loops each)
revRatesOfProgress_ddT: 16.937 μs ± 0.396 μs per loop (7 runs, 10000 loops each)
revRatesOfProgress_ddP: 11.613 μs ± 0.289 μs per loop (7 runs, 10000 loops each)
revRatesOfProgress_ddC: 15.889 μs ± 0.375 μs per loop (7 runs, 10000 loops each)
revRatesOfProgress_ddX: 40.764 μs ± 0.318 μs per loop (7 runs, 1000 loops each)

getNetRatesOfProgress:  10.977 μs ± 0.123 μs per loop (7 runs, 10000 loops each)
netRatesOfProgress_ddT: 17.276 μs ± 1.29 μs per loop (7 runs, 10000 loops each)
netRatesOfProgress_ddP: 11.36 μs ± 0.0522 μs per loop (7 runs, 10000 loops each)
netRatesOfProgress_ddC: 19.624 μs ± 0.24 μs per loop (7 runs, 10000 loops each)
netRatesOfProgress_ddX: 95.056 μs ± 1.22 μs per loop (7 runs, 1000 loops each)

getCreationRates:       12.363 μs ± 0.158 μs per loop (7 runs, 10000 loops each)
creationRates_ddT:      26.087 μs ± 1.78 μs per loop (7 runs, 10000 loops each)
creationRates_ddP:      16.865 μs ± 0.697 μs per loop (7 runs, 10000 loops each)
creationRates_ddC:      25.362 μs ± 0.451 μs per loop (7 runs, 10000 loops each)
creationRates_ddX:      172.49 μs ± 5.23 μs per loop (7 runs, 1000 loops each)

getDestructionRates:    14.38 μs ± 1.79 μs per loop (7 runs, 10000 loops each)
destructionRates_ddT:   27.015 μs ± 1.77 μs per loop (7 runs, 10000 loops each)
destructionRates_ddP:   15.656 μs ± 0.0515 μs per loop (7 runs, 10000 loops each)
destructionRates_ddC:   24.765 μs ± 1.55 μs per loop (7 runs, 10000 loops each)
destructionRates_ddX:   168.58 μs ± 5.99 μs per loop (7 runs, 1000 loops each)

getNetProductionRates:  13.051 μs ± 0.502 μs per loop (7 runs, 10000 loops each)
netProductionRates_ddT: 20.761 μs ± 0.269 μs per loop (7 runs, 10000 loops each)
netProductionRates_ddP: 15.04 μs ± 0.292 μs per loop (7 runs, 10000 loops each)
netProductionRates_ddC: 24.406 μs ± 3.1 μs per loop (7 runs, 10000 loops each)
netProductionRates_ddX: 151.87 μs ± 1.37 μs per loop (7 runs, 1000 loops each)

@ischoegl
Copy link
Member Author

ischoegl commented Dec 27, 2021

I believe this is ready for review / testing. While some of the CI runners fail at the moment, none of the failures are due to cantera (there appear to be some issues with downloads of external repositories / submodules at the moment).

@ischoegl ischoegl force-pushed the sparse-jacobians branch 2 times, most recently from 5efb7e5 to d362122 Compare December 29, 2021 15:20
@ischoegl
Copy link
Member Author

Adding a minor update to the Jacobian benchmark sample to illustrate the impact of number of species on performance. Here are some results for doubling the number of species ('nDodecane_Reitz.yaml`):

nDodecane_Reitz.yaml: 100 species, 553 reactions.

getFwdRatesOfProgress:  12.242 μs ± 0.0352 μs per loop (7 runs, 10000 loops each)
fwdRatesOfProgress_ddT: 14.871 μs ± 0.0279 μs per loop (7 runs, 10000 loops each)
fwdRatesOfProgress_ddP: 12.999 μs ± 0.0848 μs per loop (7 runs, 10000 loops each)
fwdRatesOfProgress_ddC: 15.752 μs ± 0.0196 μs per loop (7 runs, 10000 loops each)
fwdRatesOfProgress_ddX: 55.234 μs ± 0.262 μs per loop (7 runs, 1000 loops each)

getRevRatesOfProgress:  12.272 μs ± 0.0395 μs per loop (7 runs, 10000 loops each)
revRatesOfProgress_ddT: 18.223 μs ± 0.0432 μs per loop (7 runs, 10000 loops each)
revRatesOfProgress_ddP: 12.962 μs ± 0.0295 μs per loop (7 runs, 10000 loops each)
revRatesOfProgress_ddC: 15.539 μs ± 0.0159 μs per loop (7 runs, 10000 loops each)
revRatesOfProgress_ddX: 51.379 μs ± 0.14 μs per loop (7 runs, 1000 loops each)

getNetRatesOfProgress:  12.238 μs ± 0.00841 μs per loop (7 runs, 10000 loops each)
netRatesOfProgress_ddT: 18.186 μs ± 0.0357 μs per loop (7 runs, 10000 loops each)
netRatesOfProgress_ddP: 12.968 μs ± 0.0167 μs per loop (7 runs, 10000 loops each)
netRatesOfProgress_ddC: 19.076 μs ± 0.03 μs per loop (7 runs, 10000 loops each)
netRatesOfProgress_ddX: 116.62 μs ± 0.261 μs per loop (7 runs, 1000 loops each)

getCreationRates:       13.907 μs ± 0.0381 μs per loop (7 runs, 10000 loops each)
creationRates_ddT:      26.293 μs ± 0.0615 μs per loop (7 runs, 10000 loops each)
creationRates_ddP:      18.969 μs ± 0.0303 μs per loop (7 runs, 10000 loops each)
creationRates_ddC:      24.373 μs ± 0.0342 μs per loop (7 runs, 10000 loops each)
creationRates_ddX:      212.18 μs ± 0.327 μs per loop (7 runs, 1000 loops each)

getDestructionRates:    13.362 μs ± 0.0515 μs per loop (7 runs, 10000 loops each)
destructionRates_ddT:   26.806 μs ± 0.0375 μs per loop (7 runs, 10000 loops each)
destructionRates_ddP:   18.765 μs ± 0.0255 μs per loop (7 runs, 10000 loops each)
destructionRates_ddC:   24.123 μs ± 0.0504 μs per loop (7 runs, 10000 loops each)
destructionRates_ddX:   201.09 μs ± 0.3 μs per loop (7 runs, 1000 loops each)

getNetProductionRates:  13.891 μs ± 0.0461 μs per loop (7 runs, 10000 loops each)
netProductionRates_ddT: 20.557 μs ± 0.0467 μs per loop (7 runs, 10000 loops each)
netProductionRates_ddP: 15.341 μs ± 0.0544 μs per loop (7 runs, 10000 loops each)
netProductionRates_ddC: 21.505 μs ± 0.0493 μs per loop (7 runs, 10000 loops each)
netProductionRates_ddX: 202.07 μs ± 0.276 μs per loop (7 runs, 1000 loops each)

Nothing else changed (I'm aware of some pending review comments).

@ischoegl
Copy link
Member Author

ischoegl commented Jan 13, 2022

@anthony-walker ... thanks for your thoughts!

You mentioned that you have created some place holder functions to do this? I can start trying to merge the two branchs.

Yes and no. For the C++API, the main placeholders are in Kinetics.h:

void getJacobianSettings(AnyMap& settings);
void setJacobianSettings(AnyMap& settings);

which presumably should govern how derivatives are calculated (various assumptions for relative perturbations and treatment of 3rd-bodies).

I have found the most success with passing and returning raw pointers that map to Eigen objects.

That makes sense. For derivatives with respect to scalars, this is straight forward:

void getNetProductionRates(doublereal* wdot); // existing
void getNetProductionRates_ddT(double* dwdot); // in analogy ...
void getNetProductionRates_ddP(double* dwdot);
void getNetProductionRates_ddC(double* dwdot);

But for the derivative with respect to mole fractions (void getNetProductionRates_ddX) I'm not sure. I believe it would make sense to directly write into the global Eigen::SparseMatrix? For this, I believe this would take passing a size_t offset value (and this offset needs to be threaded through internally).

@speth
Copy link
Member

speth commented Jan 13, 2022

But for the derivative with respect to mole fractions (void getNetProductionRates_ddX) I'm not sure. I believe it would make sense to directly write into the global Eigen::SparseMatrix? For this, I believe this would take passing a size_t offset value (and this offset needs to be threaded through internally).

I had been thinking about the need for an offset argument as well, so that may be a key part of the interface expansion. What I'm less sure about is whether it's generally useful to just update an existing SparseMatrix or whether it would be better to simply return a SparseMatrix with that offset applied. I think in a lot of cases, you're going to need to manipulate those entries anyway, for example to go from d/dX at constant P and molar volume to d/dn at constant P only. Also, I think the primary efficient way of building a SparseMatrix is from the list of triplets, and I don't know if there's a comparably efficient path for updating an existing matrix.

@anthony-walker
Copy link
Contributor

@ischoegl

You mentioned that you have created some place holder functions to do this? I can start trying to merge the two branchs.

Yes and no. For the C++API, the main placeholders are in Kinetics.h:

void getJacobianSettings(AnyMap& settings);
void setJacobianSettings(AnyMap& settings);

Are these methods to apply a custom mapping to the a sparse representation of the derivatives? If so do you have methods to form the sparse mapping? I use the current ReactionDerivativeManager in #1010 to do both to some extent.

void getNetProductionRates(doublereal* wdot); // existing
void getNetProductionRates_ddT(double* dwdot); // in analogy ...
void getNetProductionRates_ddP(double* dwdot);
void getNetProductionRates_ddC(double* dwdot);

I am also using derivatives with respect to moles in #1010, is that a possibility here?

@ischoegl
Copy link
Member Author

ischoegl commented Jan 13, 2022

@anthony-walker ... the jacobian settings are mainly to pass some assumptions to C++, e.g. from Python

>>> gas.jacobian_settings
{'skip-third-bodies': False, 'skip-falloff': False, 'rtol-delta': 1e-08}

I am also using derivatives with respect to moles in #1010, is that a possibility here?

... total or individual? Here, getNetProductionRates_ddC is the derivative with respect to (total) molar concentration.

@anthony-walker
Copy link
Contributor

anthony-walker commented Jan 13, 2022

@ischoegl Ah, I would most likely use the C++ interface. I will need a way to remap the output which I believe I discussed a bit in #1010. I can work on that when I start merging the two of them.

@anthony-walker ... the jacobian settings are mainly to pass some assumptions to C++, e.g. from Python

My approximate Jacobian based preconditioner is built from derivatives with respect to moles of individual species.

... total or individual? Here, getNetProductionRates_ddC is the derivative with respect to (total) molar concentration.

@ischoegl
Copy link
Member Author

ischoegl commented Jan 13, 2022

@speth / @anthony-walker ... thanks for your comments thus far. Based on what was said, I believe I will proceed as follows:

  • Rename get/setJacobianSettings to get/setDerivativeSettings
  • Change the *_ddT, *_ddP, *_ddC methods to use raw pointers

but leave *_ddX as is.

@ischoegl
Copy link
Member Author

ischoegl commented Jan 13, 2022

@anthony-walker

I would most likely use the C++ interface.

It is equivalent, and uses an AnyMap instead of a dictionary.

I will need a way to remap the output which I believe I discussed a bit in #1010. I can work on that when I start merging the two of them.

I don't think that this is mutually exclusive: this PR should be using the same mapping as we discussed a while back; it's just that entries are shifted by an offset within the sparse element storage vector.

My approximate Jacobian based preconditioner is built from derivatives with respect to moles of individual species

All it takes is to multiply _ddX by the molar concentration. It would be simple to pass an optional scaling factor.

Note: As mentioned somewhere above (in the collapsed portion) ... the derivatives here do not incorporate constraints posed by the EoS, which may be another difference.

Also, add missing 'final' to Blowers-Masel rate
@ischoegl
Copy link
Member Author

ischoegl commented Jan 14, 2022

@speth ... thanks for responding to my queries quickly! At this point, I believe everything that was pointed out is addressed; let me know if there's anything else.

Otherwise, this is ready for another round of reviews from my end.

@ischoegl
Copy link
Member Author

ischoegl commented Jan 14, 2022

PS: While I'd suggest to leave tweaks of the internal C++ _ddX API so it fits with #1010 for a follow-up PR, I wanted to respond to one comment:

I had been thinking about the need for an offset argument as well, so that may be a key part of the interface expansion. What I'm less sure about is whether it's generally useful to just update an existing SparseMatrix or whether it would be better to simply return a SparseMatrix with that offset applied. I think in a lot of cases, you're going to need to manipulate those entries anyway, for example to go from d/dX at constant P and molar volume to d/dn at constant P only. Also, I think the primary efficient way of building a SparseMatrix is from the list of triplets, and I don't know if there's a comparably efficient path for updating an existing matrix.

One alternative would be to pass a fully created SparseMatrix (initially zeroed out, but with properly referenced memory layout), and have the _ddX methods modify values in memory (i.e. equivalent to += or -=); correct indices should be easily found based on the offset argument and internally stored indices. At least in this PR, I tried to avoid constructing new SparseMatrix variables from triplet lists at all cost: it is my understanding that anything that involves indexing/building is slow, which is why I resorted to accessing the underlying memory map directly and otherwise relied on copy constructors.

PPS: The best approach here may be to leave the current C++ getters that return Eigen::SparseMatrix<double> for Python (or any other higher-level API), but also have an alternative void accessor that passes a fully formed SparseMatrix for internal use within the C++ core.

Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, @ischoegl. Looks like you also found my one last request for update (regarding ReactionData::restore) that GitHub was trying to hide.

@speth speth merged commit ec85c24 into Cantera:main Jan 14, 2022
@ischoegl
Copy link
Member Author

@speth ... thanks for your prompt (and thorough!) review! Beyond: @anthony-walker I'd be happy to assist with blending this with #1010. Let me know ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants