What happened + What you expected to happen

Bug: When running training with the PPO algorithm, the `params.json` file in the experiment directory is not a valid JSON representation of the experiment parameters, while the `params.pkl` file is fine because it is the pickle serialization of the PPOConfig object.
As an example, I ran an experiment with a simple environment using PPO; here is the content of the `params.json` file in the result directory:
```shell
$ ls ~/ray_results/PPO_SimplexTest_2025-01-24_15-26-24fb0ql0hd/
events.out.tfevents.1737728784.dfaas-marl  params.json  params.pkl  progress.csv  result.json

$ cat ~/ray_results/PPO_SimplexTest_2025-01-24_15-26-24fb0ql0hd/params.json
"<ray.rllib.algorithms.ppo.ppo.PPOConfig object at 0x7647680efe30>"
```
Expected behavior: The `params.json` file should contain a valid JSON representation of the PPOConfig object.
This is a low-severity issue because `params.pkl` is serialized correctly, and Ray RLlib uses that file, not the JSON one, when loading from a checkpoint. I have not tried other algorithms. However, the JSON file is the only interoperable, human-readable, Python-version-independent record of the algorithm configuration.
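For illustration, the stringification symptom can be reproduced with the plain `json` module and a stand-in class (`FakeConfig` is hypothetical, not the real PPOConfig; its `to_dict()` only mirrors the idea of RLlib's `AlgorithmConfig.to_dict()`):

```python
import json


class FakeConfig:
    """Hypothetical stand-in for an RLlib config object."""

    def __init__(self):
        self.lr = 5e-5
        self.framework = "torch"

    def to_dict(self):
        # Return a plain, JSON-serializable dict of the parameters.
        return vars(self)


cfg = FakeConfig()

# If the object itself is dumped with a string fallback, json.dumps just
# stores its repr -- which is exactly what params.json ends up containing.
broken = json.dumps(cfg, default=str)
print(broken)  # a bare string like "<__main__.FakeConfig object at 0x...>"

# Converting the config to a dict first produces a usable JSON document.
fixed = json.dumps(cfg.to_dict(), default=str, indent=2)
print(fixed)
```

The difference is visible when loading the result back: the first form yields a single string, the second a dictionary of parameters.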
Versions / Dependencies
Ubuntu 24.04.1 LTS with Ray 2.40.0.
Reproduction script
Just run this script and then browse the results directory to find the `params.json` file.
```python
import gymnasium as gym

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.utils.spaces.simplex import Simplex
from ray.tune.registry import register_env


class SimplexTest(gym.Env):
    """SimplexTest is a sample environment that basically does nothing."""

    def __init__(self, config=None):
        self.action_space = Simplex(shape=(3,))
        self.observation_space = gym.spaces.Box(shape=(1,), low=-1, high=1)
        self.max_steps = 100

    def reset(self, seed=None, options=None):
        self.current_step = 0
        obs = self.observation_space.sample()
        return obs, {}

    def step(self, action):
        self.current_step += 1
        obs = self.observation_space.sample()
        reward = self.np_random.random()
        terminated = self.current_step == self.max_steps
        return obs, reward, terminated, False, {}


register_env("SimplexTest", lambda env_config: SimplexTest(config=env_config))

if __name__ == "__main__":
    # Algorithm config.
    ppo_config = (
        PPOConfig()
        # By default RLlib uses the new API stack, but I use the old one.
        .api_stack(
            enable_rl_module_and_learner=False,
            enable_env_runner_and_connector_v2=False,
        )
        .environment(env="SimplexTest")
        .framework("torch")
        .env_runners(num_env_runners=0)  # Get experiences in the main process.
        .evaluation(evaluation_interval=None)  # No automatic evaluation.
        .resources(num_gpus=1)
    )

    # Build the experiment.
    ppo_algo = ppo_config.build()
    print(f"Algorithm initialized ({ppo_algo.logdir=})")

    iterations = 2
    print(f"Start of training ({iterations=})")
    for iteration in range(iterations):
        print(f"Iteration {iteration}")
        ppo_algo.train()
    print("Training terminated")

    ppo_algo.stop()
    print("Training end")
```
Issue Severity
Low: It annoys or frustrates me.