0.0.2 #11

MeFredFeng · 2025-09-28T05:07:19Z

This pull request implements the core algorithms for the Monte Carlo, Policy Iteration, and Value Iteration solvers and adds a Conda environment configuration for macOS. With these changes the solvers can be run for experimentation and further development.

Algorithm Implementations

Added the core Monte Carlo algorithm in Monte_Carlo.py: episode generation, Q-value updates, and policy functions for both on-policy and off-policy learning, including epsilon-soft and greedy policies.
Implemented Policy Iteration in Policy_Iteration.py: policy improvement now uses a one-step lookahead, and policy evaluation uses matrix methods.
Completed Value Iteration in Value_Iteration.py: added a one-step lookahead for value updates and policy extraction, and improved the prioritized sweeping logic.
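
For reviewers who want a reference point, the building blocks named in the items above (epsilon-soft action probabilities, matrix-based policy evaluation, and a one-step lookahead used for value updates and policy extraction) can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the array layout (P[s, a, s2] for transition probabilities, R[s, a] for expected rewards) and all function names are assumptions made for the example, and prioritized sweeping and off-policy learning are not covered.

```python
import numpy as np

# Hypothetical tabular MDP layout (an assumption, not the repository's):
#   P[s, a, s2] = probability of moving from state s to s2 under action a
#   R[s, a]     = expected immediate reward for taking action a in state s


def one_step_lookahead(P, R, V, gamma):
    """Return Q[s, a] = R[s, a] + gamma * sum over s2 of P[s, a, s2] * V[s2]."""
    return R + gamma * (P @ V)


def evaluate_policy_matrix(P, R, policy, gamma):
    """Evaluate a deterministic policy by solving (I - gamma * P_pi) V = R_pi."""
    n_states = P.shape[0]
    idx = np.arange(n_states)
    P_pi = P[idx, policy]  # transition matrix under the policy, (n_states, n_states)
    R_pi = R[idx, policy]  # reward vector under the policy, (n_states,)
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)


def epsilon_soft_probs(q_row, epsilon):
    """Spread epsilon uniformly over all actions; put the rest on the greedy one."""
    n_actions = len(q_row)
    probs = np.full(n_actions, epsilon / n_actions)
    probs[int(np.argmax(q_row))] += 1.0 - epsilon
    return probs


def value_iteration(P, R, gamma, tol=1e-10, max_iters=10_000):
    """Repeat the Bellman optimality backup until the largest change is below tol."""
    V = np.zeros(P.shape[0])
    for _ in range(max_iters):
        V_new = one_step_lookahead(P, R, V, gamma).max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Policy extraction: act greedily with respect to the final values.
    policy = one_step_lookahead(P, R, V, gamma).argmax(axis=1)
    return V, policy
```

Calling value_iteration(P, R, gamma) returns the value estimates and a greedy policy; passing that policy to evaluate_policy_matrix should give values that agree with the value iteration result up to the tolerance, which makes a simple consistency check between the two solvers.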

Environment and Project Setup

Added a new Conda environment file environment_mac_mod.yml to facilitate reproducible setup on macOS, including all necessary Python dependencies for running the solvers.
Updated .idea/.gitignore so that IDE-specific files and folders stay out of the repository.

MeFredFeng added 4 commits September 7, 2025 16:48

[0.0.0]

4d0e840

Add environment configuration for macOS and IntelliJ IDEA gitignore

[0.0.1]

00841c6

Implement value updates and action selection in Value Iteration

[0.0.1a]

ad74447

Implement policy evaluation and update in Policy Iteration; optimize value updates in Value Iteration

[0.0.2]

ec4d05d

1. Implement Monte Carlo and Off-Policy Monte Carlo methods in Monte_Carlo.py; 2. Refine policy functions in Policy_Iteration.py and Value_Iteration.py.
