Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime
We make a compelling case for including domain knowledge to efficiently obtain robust control of chaotic flows. On the left is the typical performance of a domain-informed agent on a random initial condition, and on the right is the typical performance of an uninformed agent. Although both agents achieve a similar reduction of average convective heat transfer under initially chaotic flow conditions, note the large qualitative differences in the flows that are eventually obtained:
- The domain-informed agent always achieves steady flow, and keeps this state stable.
- The uninformed agent exhibits a significantly unsteady flow.
| Domain-Informed | Uninformed |
|---|---|
| ![]() | ![]() |
Data-driven flow control has significant potential for industry, energy systems, and climate science. In this work, we study the effectiveness of Reinforcement Learning (RL) for reducing convective fluid flows in the 2D Rayleigh-Bénard Convection (RBC) system under increasing turbulence. We investigate the generalizability of control across varying initial conditions and turbulence levels and introduce a reward-shaping technique to accelerate training. RL agents trained via single-agent Proximal Policy Optimization (PPO) are compared to linear proportional-derivative (PD) controllers from conventional control theory. The RL agents reduced convection, measured by the Nusselt number, by up to 33% in moderately turbulent systems and 10% in highly turbulent settings, clearly outperforming PD control in all settings. The agents showed strong generalization across different initial conditions and, to a significant extent, generalized to higher turbulence levels. The reward shaping improved sample efficiency and consistently stabilized the Nusselt number up to higher turbulence levels.
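For illustration, a reward-shaping scheme of the kind described above could look like the following minimal sketch. This is an assumed form, not necessarily the exact shaping used in this work; `shaped_reward`, the conduction baseline `nu_conduction`, and the coefficient `kappa` are hypothetical names and parameters.

```python
def shaped_reward(nusselt, prev_nusselt, nu_conduction=1.0, kappa=0.1):
    """Illustrative shaped reward for convection suppression (assumed form).

    Base reward penalizes convective heat transport above the pure-conduction
    baseline (Nu = 1); a small shaping bonus rewards step-to-step reductions
    of Nu, densifying the learning signal during training.
    """
    base = -(nusselt - nu_conduction)
    shaping = kappa * (prev_nusselt - nusselt)
    return base + shaping

# Example: Nu dropped from 2.5 to 2.0 in one step.
print(shaped_reward(2.0, 2.5))  # → -0.95
```

The shaping term vanishes once Nu stops changing, so it mainly accelerates early learning rather than altering the long-run objective.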
## Rayleigh-Bénard Convection

This work uses Fourier Neural Operators to model Rayleigh-Bénard Convection (RBC). RBC describes convection processes in a layer of fluid cooled from the top and heated from the bottom, governed by a set of coupled partial differential equations (the Boussinesq equations).
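In a standard nondimensional Boussinesq formulation (free-fall units; the exact nondimensionalization used in this work may differ), the governing equations read:

```latex
\nabla \cdot \mathbf{u} = 0
\qquad
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u}
  = -\nabla p + \sqrt{\tfrac{Pr}{Ra}}\,\nabla^2 \mathbf{u} + T\,\hat{\mathbf{e}}_y
\qquad
\frac{\partial T}{\partial t} + (\mathbf{u} \cdot \nabla) T
  = \frac{1}{\sqrt{Ra\,Pr}}\,\nabla^2 T
```

Here $\mathbf{u}$ is the velocity field, $p$ the pressure, $T$ the temperature, $Ra$ the Rayleigh number, and $Pr$ the Prandtl number; buoyancy acts along the vertical unit vector $\hat{\mathbf{e}}_y$.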
The surrogate models are trained on data generated by a Direct Numerical Simulation based on Shenfun with the following parameters:
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Domain | ((-1, 1), (0, 2π)) | Timestep (Δt) | 0.025 |
| Grid | 64 × 96 | Episode Length | 300 |
| Rayleigh Number | {1e5, 1e6, 2e6, 5e6} | Cook Time | 200 |
| Prandtl Number | 0.7 | | |
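The control objective throughout is the Nusselt number: the ratio of total vertical heat flux (convective plus conductive) to the purely conductive flux of the motionless base state. A minimal sketch of how it can be estimated from discretized fields is shown below; the function and variable names are illustrative, not the repository's API, and the nondimensionalization is assumed.

```python
import numpy as np

def compute_nusselt(T, uy, y):
    """Estimate the mean Nusselt number on a discretized 2D field.

    T, uy : arrays of shape (n_y, n_x) -- temperature and vertical velocity
    y     : array of shape (n_y,)      -- wall-normal coordinates

    Nu = <u_y * T - dT/dy> / (dT_base / H), i.e. total vertical heat flux
    normalized by the conductive flux of the base state.
    """
    dTdy = np.gradient(T, y, axis=0)     # vertical temperature gradient
    H = y[-1] - y[0]                     # layer height
    dT = 1.0                             # nondimensional temperature difference
    q_cond_base = dT / H                 # conductive flux with no motion
    q_total = uy * T - dTdy              # convective + conductive flux
    return q_total.mean() / q_cond_base

# Sanity check: a linear conduction profile with no flow gives Nu ≈ 1.
y = np.linspace(-1.0, 1.0, 96)
T = np.tile(((1.0 - y) / 2.0)[:, None], (1, 64))  # T=1 at bottom, T=0 at top
uy = np.zeros_like(T)
print(round(compute_nusselt(T, uy, y), 6))  # → 1.0
```

Any convective motion that transports heat upward pushes Nu above 1, which is what the agents are trained to suppress.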
We use uv to manage Python dependencies:

```shell
uv sync
```

Alternatively, you can use a virtual environment and install this package directly; dependencies are listed in pyproject.toml.
This script gives a live view of the trained agent's behavior:

```shell
uv run python scripts/sbsa_test.py
```
Adapt the configuration in config/sbsa.yaml and run the training script:

```shell
uv run python scripts/sbsa.py
```
If you find our work useful, please cite us via:

```bibtex
@article{todo,
  title={Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime},
  author={Markmann, Thorben and Straat, Michiel and Peitz, Sebastian and Hammer, Barbara},
  journal={},
  year={}
}
```