This is the source code accompanying the paper *Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces* by Brahma S. Pavse, Matthew Zurek, Yudong Chen, Qiaomin Xie, and Josiah P. Hanna.
To set up the conda environment:

```bash
conda env create -f environment.yml
```
Generic command:

```bash
python run_single_continual.py --outfile <result_file> --env_name <queue/nmodel> --mdp_num <0/1/2> --deployed_interaction_steps 5_000_000 --exp_name <exp_name> --reward_function <opt/stab> --seed 0 --truncated_horizon 200 --algo_name <algo_name> --lr 3e-4 --state_transformation <state_trans> --lyp_power <p> --adam_beta 0.9
```
where:

- `exp_name` can be anything
- `reward_function` is either `opt` for optimal-only or `stab` for optimal + stability
- `algo_name` is either `MW`, `PPO`, or `STOP-<suffix>`, where `<suffix>` can be anything that uniquely identifies the algorithm run by its state transformation and Lyapunov power. Example: `STOP-SL-2` denotes STOP with `symloge` and p = 2
- `state_transformation` is one of `id`, `sigmoid`, `symsqrt`, or `symloge` (see the sketch after this list)
- `lyp_power` is any floating-point number (p from the paper)
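
For reference, the sketch below shows what the four state transformations typically look like. These are the common definitions of `sigmoid`, `symsqrt`, and `symloge`, and are an assumption on our part; consult the repository source for the exact forms used.

```python
import numpy as np

# Sketch of the four --state_transformation options, assuming the common
# definitions of these functions; the exact forms live in the repository source.
def transform_state(s: np.ndarray, name: str) -> np.ndarray:
    if name == "id":
        return s                                 # identity: state left unchanged
    if name == "sigmoid":
        return 1.0 / (1.0 + np.exp(-s))          # squashes each coordinate into (0, 1)
    if name == "symsqrt":
        return np.sign(s) * np.sqrt(np.abs(s))   # symmetric square root
    if name == "symloge":
        return np.sign(s) * np.log1p(np.abs(s))  # symmetric natural log: sign(s) * ln(1 + |s|)
    raise ValueError(f"unknown state transformation: {name}")
```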
Example command:

```bash
python run_single_continual.py --outfile result_file --env_name queue --mdp_num 2 --deployed_interaction_steps 5_000_000 --exp_name test --reward_function stab --seed 0 --truncated_horizon 200 --algo_name STOP-3 --lr 3e-4 --state_transformation sigmoid --lyp_power 3 --adam_beta 0.9
```
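
For intuition about how the Lyapunov power p could enter the `stab` reward, here is a generic Lyapunov-drift bonus. This is purely illustrative and an assumption on our part (the function name, weighting, and norm choice are not taken from the paper); the actual stability reward is defined in the repository source.

```python
import numpy as np

def lyapunov_bonus(s: np.ndarray, s_next: np.ndarray, p: float) -> float:
    # Illustrative only: reward negative drift of a norm-based potential
    # V(s) = ||s||^p between consecutive states. The paper's exact stability
    # reward may differ; see run_single_continual.py.
    v, v_next = np.linalg.norm(s) ** p, np.linalg.norm(s_next) ** p
    return v - v_next  # positive when the next state is closer to the origin
```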
If you found any part of this code useful, please consider citing our paper:
```bibtex
@inproceedings{
pavse2024unbounded,
title={Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces},
author={Brahma S. Pavse and Matthew Zurek and Yudong Chen and Qiaomin Xie and Josiah P. Hanna},
booktitle={Forty-first International Conference on Machine Learning},
year={2024},
url={https://openreview.net/forum?id=64fdhmogiD}
}
```
If you have any questions, please feel free to email: [email protected]!