[RLlib] params.json is not a valid JSON file when using PPO #50051

Open
ema-pe opened this issue Jan 24, 2025 · 0 comments
Labels
bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component)


ema-pe commented Jan 24, 2025

What happened + What you expected to happen

Bug: when training with the PPO algorithm, the `params.json` file in the experiment directory is not a valid JSON representation of the experiment parameters. The `params.pkl` file is fine, because it is the pickle serialization of the PPOConfig object.

As an example, I ran an experiment with a simple environment using PPO; this is the content of the `params.json` file in the result directory:

$ ls ~/ray_results/PPO_SimplexTest_2025-01-24_15-26-24fb0ql0hd/
events.out.tfevents.1737728784.dfaas-marl  params.json  params.pkl  progress.csv  result.json
$ cat ~/ray_results/PPO_SimplexTest_2025-01-24_15-26-24fb0ql0hd/params.json
"<ray.rllib.algorithms.ppo.ppo.PPOConfig object at 0x7647680efe30>"
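For reference, this looks like what happens when `json.dump` is given an object it cannot serialize together with a stringifying fallback. This is only a guess at the mechanism (I have not checked the RLlib logger source); the stand-in class below is hypothetical:

```python
import json


class PPOConfigStandIn:
    """Stand-in for a PPOConfig-like object with no JSON support."""


# json.dumps cannot serialize arbitrary objects; a common fallback is
# default=str, which replaces the whole object with its repr. The result
# is a JSON string literal like the one found in params.json above.
obj = PPOConfigStandIn()
dumped = json.dumps(obj, default=str)
print(dumped)
```

The output is a quoted repr such as `"<__main__.PPOConfigStandIn object at 0x...>"`, matching the observed file content.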

Expected behavior: the `params.json` file should contain a valid JSON representation of the PPOConfig object.

This is a low-severity issue because `params.pkl` is serialized correctly, and Ray RLlib uses that file, not the JSON one, when restoring from a checkpoint. I have not tried other algorithms. Still, the JSON file is the only interoperable, human-readable, Python-version-independent record of the algorithm configuration.

Versions / Dependencies

Ubuntu 24.04.1 LTS with Ray 2.40.0.

Reproduction script

Just run this script and then browse the results directory to find the `params.json` file.

import gymnasium as gym

from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.utils.spaces.simplex import Simplex
from ray.tune.registry import register_env


class SimplexTest(gym.Env):
    """SimplexTest is a sample environment that basically does nothing."""

    def __init__(self, config=None):
        self.action_space = Simplex(shape=(3,))
        self.observation_space = gym.spaces.Box(shape=(1,), low=-1, high=1)
        self.max_steps = 100

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)  # Seed self.np_random, used in step().
        self.current_step = 0

        obs = self.observation_space.sample()
        return obs, {}

    def step(self, action):
        self.current_step += 1

        obs = self.observation_space.sample()
        reward = self.np_random.random()
        terminated = self.current_step == self.max_steps
        return obs, reward, terminated, False, {}


register_env("SimplexTest", lambda env_config: SimplexTest(config=env_config))


if __name__ == "__main__":
    # Algorithm config.
    ppo_config = (
        PPOConfig()
        # By default RLlib uses the new API stack, but I use the old one.
        .api_stack(
            enable_rl_module_and_learner=False, enable_env_runner_and_connector_v2=False
        )
        .environment(env="SimplexTest")
        .framework("torch")
        .env_runners(num_env_runners=0)  # Get experiences in the main process.
        .evaluation(evaluation_interval=None)  # No automatic evaluation.
        .resources(num_gpus=1)
    )

    # Build the experiment.
    ppo_algo = ppo_config.build()
    print(f"Algorithm initialized ({ppo_algo.logdir = })")

    iterations = 2
    print(f"Start of training ({iterations = })")
    for iteration in range(iterations):
        print(f"Iteration {iteration}")
        ppo_algo.train()
    print("Training terminated")

    ppo_algo.stop()
    print("Training end")
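As a possible workaround until this is fixed, the config can be dumped to JSON manually. This is only a sketch: it assumes the config object exposes `to_dict()` (which `AlgorithmConfig` does in recent RLlib releases) and falls back to `repr()` for values that are not JSON-serializable, so those values stay human-readable but are not round-trippable:

```python
import json


def config_to_json(config) -> str:
    """Serialize an algorithm config to a JSON string, replacing values
    that are not JSON-serializable (spaces, classes, ...) with repr()."""
    as_dict = config.to_dict() if hasattr(config, "to_dict") else vars(config)
    return json.dumps(as_dict, default=repr, indent=2, sort_keys=True)


# In the script above, one could write right after building the config:
# with open("params_manual.json", "w") as f:
#     f.write(config_to_json(ppo_config))
```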

Issue Severity

Low: It annoys or frustrates me.
