Laboratory 2: Reinforcement learning — Q-Learning and SARSA 🦾 #2
Pull Request Overview
This PR introduces a hyperparameter tuner built on Optuna and updates the reinforcement learning learners for the Q-Learning and SARSA experiments. Key changes include:
- A new Tuner class in lab2/rl/tuner.py that automates hyperparameter optimization (sketched after this list).
- An updated Learner base class and its derivatives in lab2/rl/learner.py for both the CartPole and LunarLander environments.
- Enhancements to the command-line interface and evaluation routines in lab2/rl.py, along with supporting resource scripts and dependency updates.
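For orientation, the snippet below shows the usual Optuna pattern such a tuner builds on: an objective function that samples hyperparameters per trial, and a study that maximizes the returned score. The search ranges and the train_and_score stand-in are illustrative assumptions, not the PR's actual code.

```python
import optuna


def train_and_score(learning_rate: float, discount_factor: float, epsilon: float) -> float:
    # Hypothetical stand-in for training a Q-Learning/SARSA learner and
    # returning its mean evaluation reward.
    return -((learning_rate - 0.1) ** 2 + (discount_factor - 0.99) ** 2 + (epsilon - 0.05) ** 2)


def objective(trial: optuna.Trial) -> float:
    # Sample one candidate configuration per trial; ranges are assumptions.
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1.0, log=True)
    discount_factor = trial.suggest_float("discount_factor", 0.8, 0.999)
    epsilon = trial.suggest_float("epsilon", 0.01, 0.3)
    return train_and_score(learning_rate, discount_factor, epsilon)


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```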
Reviewed Changes
Copilot reviewed 25 out of 27 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lab2/rl/tuner.py | Introduces a new tuner for hyperparameter search using Optuna. |
| lab2/rl/learner.py | Updates the abstract learner and its implementations for CartPole and LunarLander tasks. |
| lab2/rl.py | Adds CLI commands, evaluation logic, and plotting for results visualization. |
| lab2/resources/balance_new.py | Provides an alternative experimental script for balancing, using a Q-Learning approach with render mode. |
| lab2/resources/balance.py | Contains a legacy QLearner implementation for CartPole. |
| lab2/Pipfile | Updates dependencies and configuration, including new versions of required packages. |
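As a rough illustration of the plotting step mentioned for lab2/rl.py, the sketch below assumes a per-attempt rewards DataFrame with the attempt and reward columns that appear later in the review comments; the data and the output filename are made up.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Column names match the DataFrame seen in the review comments; values are invented.
rewards = pd.DataFrame({"attempt": range(5), "reward": [9.0, 14.0, 35.0, 80.0, 120.0]})

fig, ax = plt.subplots()
ax.plot(rewards["attempt"], rewards["reward"])
ax.set_xlabel("attempt")
ax.set_ylabel("reward")
ax.set_title("Evaluation rewards per attempt")
fig.savefig("rewards.png")  # hypothetical output path
```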
Files not reviewed (2)
- lab2/.gitignore: Language not supported
- lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
Force-pushed from dc9025e to 56cf0cf.
Pull Request Overview
This pull request adds a reinforcement learning lab framework with hyperparameter tuning via Optuna and implementations of Q-Learning and SARSA. Key changes include:
- Introducing the Tuner class in lab2/rl/tuner.py to manage hyperparameter optimization.
- Adding the Learner base class and its derived implementations for CartPole and LunarLander in lab2/rl/learner.py (see the update-rule sketch after this list).
- Integrating tuning and evaluation workflows in lab2/rl.py and including resource scripts for the balance simulations.
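Since the derived learners differ chiefly in their temporal-difference update, the distinction is worth spelling out: Q-Learning is off-policy and bootstraps from the greedy next action, while SARSA is on-policy and bootstraps from the action actually taken. A minimal tabular sketch follows; the dict-based Q table and the function signatures are assumptions, not the PR's actual Learner API.

```python
from collections import defaultdict

# Tabular action-value function: maps (state, action) -> value.
Q = defaultdict(float)


def q_learning_update(state, action, reward, next_state, actions, alpha, gamma):
    # Off-policy: bootstrap from the best next action, regardless of the policy.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])


def sarsa_update(state, action, reward, next_state, next_action, alpha, gamma):
    # On-policy: bootstrap from the action the policy actually chose.
    Q[(state, action)] += alpha * (
        reward + gamma * Q[(next_state, next_action)] - Q[(state, action)]
    )
```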
Reviewed Changes
Copilot reviewed 25 out of 27 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lab2/rl/tuner.py | Added Tuner class with objective function and study setup |
| lab2/rl/learner.py | Introduced Learner base class and derived RL learners |
| lab2/rl.py | Added main entry point for tuning and evaluating |
| lab2/resources/balance_new.py | New resource script for balance simulation with render_mode |
| lab2/resources/balance.py | Alternate balance simulation resource using gym |
| lab2/Pipfile | Updated dependency file for the project |
Files not reviewed (2)
- lab2/.gitignore: Language not supported
- lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
Comments suppressed due to low confidence (3)
lab2/rl/tuner.py:14
- [nitpick] The parameter name 'Learner' is capitalized, which might be confused with a class name. Consider renaming it to 'learner_class' for clarity.
def __init__(self, Learner: type[LearnerClass], environment: Literal["cart-pole", "lunar-lander"], ...
lab2/rl.py:196
- [nitpick] The variable name 'df' is later reused to store the discount factor taken from the parameters, which may lead to confusion. Consider a more descriptive name (e.g., discount_factor) for the discount-factor variable (see the snippet after these comments).
df = pd.DataFrame(all_rewards, columns=["attempt", "reward"])
lab2/resources/balance_new.py:4
- [nitpick] There is an inconsistency in gym imports between 'gym' and the commented 'gymnasium' import. Aligning on a consistent gym API (or clarifying the intended use) would improve code clarity.
# import gymnasium as gym
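Concretely, the df collision flagged above can be resolved with a rename along these lines; the surrounding data and parameter dictionary are invented for illustration.

```python
import pandas as pd

all_rewards = [(0, 12.0), (1, 30.5)]      # illustrative data
params = {"discount_factor": 0.99}        # illustrative tuned parameters

# Keep the DataFrame under a descriptive name instead of the generic `df`...
rewards_df = pd.DataFrame(all_rewards, columns=["attempt", "reward"])
# ...so the discount factor can live in its own clearly named variable.
discount_factor = params["discount_factor"]
```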
Add generic interfaces for tuning and evaluation by refactoring the code. Add lunar lander environment.
Force-pushed from 56cf0cf to 96f1940.
Pull Request Overview
This PR introduces a reinforcement learning laboratory with two primary learning algorithms (Q-Learning and SARSA) and integrates hyperparameter optimization using Optuna. Key changes include:
- Implementation of a Tuner class for automatic hyperparameter tuning.
- Enhancements and additions to the learner modules (for CartPole and LunarLander) along with auxiliary resource files.
- A new main module coordinating tuning and evaluation, along with updates to the dependency configuration (a possible CLI shape is sketched below).
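One plausible shape for such a coordinating entry point, sketched with argparse; the tune and evaluate subcommands and their flags are inferred from the description above, not taken from the PR's actual CLI.

```python
import argparse


def tune(args: argparse.Namespace) -> None:
    print(f"tuning {args.algorithm} on {args.environment} for {args.trials} trials")


def evaluate(args: argparse.Namespace) -> None:
    print(f"evaluating {args.algorithm} on {args.environment}")


def main() -> None:
    parser = argparse.ArgumentParser(description="RL lab: tuning and evaluation")
    sub = parser.add_subparsers(dest="command", required=True)

    tune_cmd = sub.add_parser("tune", help="run an Optuna hyperparameter search")
    evaluate_cmd = sub.add_parser("evaluate", help="evaluate a trained learner")
    for cmd in (tune_cmd, evaluate_cmd):
        cmd.add_argument("--environment", choices=["cart-pole", "lunar-lander"],
                         default="cart-pole")
        cmd.add_argument("--algorithm", choices=["qlearning", "sarsa"],
                         default="qlearning")
    tune_cmd.add_argument("--trials", type=int, default=50)
    tune_cmd.set_defaults(func=tune)
    evaluate_cmd.set_defaults(func=evaluate)

    args = parser.parse_args()
    args.func(args)


if __name__ == "__main__":
    main()
```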
Reviewed Changes
Copilot reviewed 25 out of 27 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lab2/rl/tuner.py | New Tuner class implementing Optuna-based hyperparameter tuning for Q-Learning and SARSA. |
| lab2/rl/learner.py | Learner base class and its concrete implementations for CartPole and LunarLander, with Q-value update mechanisms. |
| lab2/rl.py | Main entry point integrating tuning and evaluation for the RL experiments. |
| lab2/resources/balance_new.py | A resource file introducing a QLearner variant using gym with render_mode, albeit with a placeholder discretise function. |
| lab2/resources/balance.py | Another QLearner variant with a structure similar to balance_new.py. |
| lab2/Pipfile | Dependency and version management ensuring consistent environments. |
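The placeholder discretise noted for balance_new.py is the usual bridge between CartPole's continuous observations and a tabular Q-table. Below is a minimal sketch of one way to fill it in; the bounds and bin count are assumptions, and real code would want them tuned.

```python
import numpy as np

# Assumed per-dimension bounds for CartPole observations:
# cart position, cart velocity, pole angle, pole angular velocity.
BOUNDS = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.0, 3.0)]
BINS = 10


def discretise(observation: np.ndarray) -> tuple[int, ...]:
    """Map a continuous observation to a tuple of bin indices usable as a dict key."""
    indices = []
    for value, (low, high) in zip(observation, BOUNDS):
        clipped = min(max(float(value), low), high)
        indices.append(int((clipped - low) / (high - low) * (BINS - 1)))
    return tuple(indices)
```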
Files not reviewed (2)
- lab2/.gitignore: Language not supported
- lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
Comments suppressed due to low confidence (2)
lab2/rl/tuner.py:14
- [nitpick] Consider renaming the parameter 'Learner' (and corresponding attribute) to 'learner_class' or similar to clearly indicate it expects a class type, thereby avoiding potential confusion with instance names.
def __init__(self, Learner: type[LearnerClass], environment: Literal["cart-pole", "lunar-lander"], ...)
lab2/resources/balance_new.py:3
- [nitpick] The project generally uses 'gymnasium' in other modules; to maintain consistency, consider updating the import to 'gymnasium' (adjusting usage as needed) or unifying the library choice across the resource files.
import gym
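On the import-consistency point, switching from gym to gymnasium involves slightly more than the import line: gymnasium's reset returns an (observation, info) pair and its step returns a five-tuple with separate terminated and truncated flags. A minimal usage sketch:

```python
import gymnasium as gym

env = gym.make("CartPole-v1", render_mode="human")
observation, info = env.reset(seed=42)

for _ in range(100):
    action = env.action_space.sample()  # random policy, purely for illustration
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()

env.close()
```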