
Laboratory 2: Reinforcement learning — Q-Learning and SARSA 🦾 #2


Merged: 8 commits merged into main from lab2-reinforcement-learning-qlearning-and-sarsa on Apr 21, 2025

Conversation

Szaroslav (Owner)

No description provided.

@Szaroslav Szaroslav added the enhancement New feature or request label Apr 21, 2025
@Szaroslav Szaroslav requested a review from Copilot April 21, 2025 18:14
@Szaroslav Szaroslav self-assigned this Apr 21, 2025

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces a hyperparameter tuner using Optuna and updates reinforcement learning learners for Q-Learning and SARSA experiments. Key changes include:

  • A new Tuner class in lab2/rl/tuner.py that automates hyperparameter optimization (sketched just after this list).
  • An updated Learner base class and its derivatives in lab2/rl/learner.py for both CartPole and LunarLander environments.
  • Enhancements to the command line interface and evaluation routines in lab2/rl.py along with supporting resource scripts and dependency updates.
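
As a rough illustration of the tuner described above, here is a minimal, hypothetical sketch; the learner interface (train, mean_reward) and the hyperparameter ranges are assumptions, not the PR's actual code:

    import optuna

    class Tuner:
        """Hypothetical Optuna-based tuner; the real lab2/rl/tuner.py may differ."""

        def __init__(self, learner_class, environment: str, n_trials: int = 50):
            self.learner_class = learner_class
            self.environment = environment
            self.n_trials = n_trials

        def objective(self, trial: optuna.Trial) -> float:
            # Sample typical Q-Learning/SARSA hyperparameters (ranges assumed).
            learning_rate = trial.suggest_float("learning_rate", 1e-4, 1.0, log=True)
            discount_factor = trial.suggest_float("discount_factor", 0.8, 0.999)
            epsilon = trial.suggest_float("epsilon", 0.01, 0.3)

            learner = self.learner_class(self.environment, learning_rate,
                                         discount_factor, epsilon)
            learner.train(episodes=500)   # assumed learner API
            return learner.mean_reward()  # maximized by the study

        def run(self) -> optuna.Study:
            study = optuna.create_study(direction="maximize")
            study.optimize(self.objective, n_trials=self.n_trials)
            return study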

Reviewed Changes

Copilot reviewed 25 out of 27 changed files in this pull request and generated 1 comment.

Summary per file:
  • lab2/rl/tuner.py: Introduces a new tuner for hyperparameter search using Optuna.
  • lab2/rl/learner.py: Updates the abstract learner and its implementations for CartPole and LunarLander tasks.
  • lab2/rl.py: Adds CLI commands, evaluation logic, and plotting for results visualization (see the CLI sketch below).
  • lab2/resources/balance_new.py: Provides an alternative experimental script for balancing, using a Q-Learning approach with render mode.
  • lab2/resources/balance.py: Contains a legacy QLearner implementation for CartPole.
  • lab2/Pipfile: Updates dependencies and configuration, including new versions of required packages.
Files not reviewed (2)
  • lab2/.gitignore: Language not supported
  • lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
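
A minimal sketch of the kind of CLI entry point described for lab2/rl.py; the subcommand names and options are illustrative assumptions, not the file's actual interface:

    import argparse

    def main() -> None:
        # Hypothetical CLI layout for tuning and evaluation runs.
        parser = argparse.ArgumentParser(
            description="Lab 2: Q-Learning and SARSA experiments")
        subparsers = parser.add_subparsers(dest="command", required=True)

        tune = subparsers.add_parser("tune", help="run hyperparameter search")
        tune.add_argument("--environment",
                          choices=["cart-pole", "lunar-lander"],
                          default="cart-pole")
        tune.add_argument("--trials", type=int, default=50)

        evaluate = subparsers.add_parser("evaluate", help="evaluate a learner")
        evaluate.add_argument("--episodes", type=int, default=100)

        args = parser.parse_args()
        if args.command == "tune":
            print(f"tuning on {args.environment} for {args.trials} trials")
        else:
            print(f"evaluating over {args.episodes} episodes")

    if __name__ == "__main__":
        main()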

@Szaroslav Szaroslav force-pushed the lab2-reinforcement-learning-qlearning-and-sarsa branch from dc9025e to 56cf0cf on April 21, 2025 at 18:17
@Szaroslav Szaroslav requested a review from Copilot April 21, 2025 18:18

@Copilot Copilot AI left a comment


Pull Request Overview

This pull request adds a reinforcement learning lab framework with hyperparameter tuning via Optuna and implementations for Q-Learning and SARSA. Key changes include:

  • Introducing the Tuner class in lab2/rl/tuner.py to manage hyperparameter optimization.
  • Adding the Learner base class and its derived implementations for CartPole and LunarLander in lab2/rl/learner.py (see the update-rule sketch after this list).
  • Integrating tuning and evaluation workflows in lab2/rl.py and including resource scripts for balance simulations.
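
To make the update mechanism concrete, here is a hypothetical tabular Q-Learning sketch; the PR's actual Learner hierarchy is not shown in this thread and may be structured differently:

    import numpy as np

    class QLearner:
        """Hypothetical tabular Q-Learning sketch, not the PR's Learner class."""

        def __init__(self, n_states: int, n_actions: int,
                     learning_rate: float = 0.1,
                     discount_factor: float = 0.99, epsilon: float = 0.1):
            self.q = np.zeros((n_states, n_actions))
            self.lr = learning_rate
            self.gamma = discount_factor
            self.epsilon = epsilon

        def choose_action(self, state: int) -> int:
            # Epsilon-greedy exploration over the state's Q-value row.
            if np.random.random() < self.epsilon:
                return int(np.random.randint(self.q.shape[1]))
            return int(np.argmax(self.q[state]))

        def update(self, state: int, action: int,
                   reward: float, next_state: int) -> None:
            # Q-Learning bootstraps from the greedy next-state value.
            target = reward + self.gamma * np.max(self.q[next_state])
            self.q[state, action] += self.lr * (target - self.q[state, action])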

Reviewed Changes

Copilot reviewed 25 out of 27 changed files in this pull request and generated no comments.

Summary per file:
  • lab2/rl/tuner.py: Added Tuner class with objective function and study setup.
  • lab2/rl/learner.py: Introduced Learner base class and derived RL learners.
  • lab2/rl.py: Added main entry point for tuning and evaluating.
  • lab2/resources/balance_new.py: New resource script for balance simulation with render_mode.
  • lab2/resources/balance.py: Alternate balance simulation resource using gym.
  • lab2/Pipfile: Updated dependency file for the project.
Files not reviewed (2)
  • lab2/.gitignore: Language not supported
  • lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
Comments suppressed due to low confidence (3)

lab2/rl/tuner.py:14

  • [nitpick] The parameter name 'Learner' is capitalized, which might be confused with a class name. Consider renaming it to 'learner_class' for clarity.
def __init__(self, Learner: type[LearnerClass], environment: Literal["cart-pole", "lunar-lander"], ...
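
The suggested rename might look roughly like this; LearnerClass is kept from the truncated snippet above, with a placeholder definition so the sketch stands alone:

    from typing import Literal

    class LearnerClass:  # placeholder for the project's learner base class
        pass

    class Tuner:
        # 'learner_class' signals that a class, not an instance, is expected.
        def __init__(self, learner_class: type[LearnerClass],
                     environment: Literal["cart-pole", "lunar-lander"]):
            self.learner_class = learner_class
            self.environment = environment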

lab2/rl.py:196

  • [nitpick] The variable name 'df' is later reused to store the discount factor from parameters, which may lead to confusion. Please consider using a more descriptive name (e.g., discount_factor) for the discount factor variables.
df = pd.DataFrame(all_rewards, columns=["attempt", "reward"])
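
A sketch of the suggested disambiguation, with illustrative stand-in data:

    import pandas as pd

    all_rewards = [(1, 10.0), (2, 12.5)]  # stand-in data
    params = {"discount_factor": 0.99}    # stand-in parameters

    # Distinct names keep the DataFrame and the discount factor from
    # shadowing each other later in the function.
    rewards_df = pd.DataFrame(all_rewards, columns=["attempt", "reward"])
    discount_factor = params["discount_factor"]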

lab2/resources/balance_new.py:4

  • [nitpick] There is an inconsistency in gym imports between 'gym' and the commented 'gymnasium' import. Aligning on a consistent gym API (or clarifying the intended use) would improve code clarity.
# import gymnasium as gym
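
One way to resolve the inconsistency is to settle on gymnasium and alias it, so downstream references in the script stay unchanged; a sketch assuming the CartPole-v1 environment:

    # gymnasium is the maintained successor to gym; aliasing it as 'gym'
    # keeps the rest of the file intact.
    import gymnasium as gym

    env = gym.make("CartPole-v1", render_mode="human")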

@Szaroslav Szaroslav force-pushed the lab2-reinforcement-learning-qlearning-and-sarsa branch from 56cf0cf to 96f1940 on April 21, 2025 at 18:25
@Szaroslav Szaroslav requested a review from Copilot April 21, 2025 18:26

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces a reinforcement learning laboratory with two primary learning algorithms, Q-Learning and SARSA (compared side by side in the sketch after the list below), and integrates hyperparameter optimization using Optuna. Key changes include:

  • Implementation of a Tuner class for automatic hyperparameter tuning.
  • Enhancements and additions to the learner modules (for CartPole and LunarLander) along with auxiliary resource files.
  • A new main module to coordinate tuning and evaluation along with updates to dependency configuration.
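
Since the thread does not show the update code itself, here is a hypothetical side-by-side of the one place the two algorithms differ, the bootstrap target:

    import numpy as np

    def q_learning_target(q: np.ndarray, next_state: int,
                          reward: float, gamma: float) -> float:
        # Off-policy: bootstrap from the greedy action in the next state.
        return reward + gamma * float(np.max(q[next_state]))

    def sarsa_target(q: np.ndarray, next_state: int, next_action: int,
                     reward: float, gamma: float) -> float:
        # On-policy: bootstrap from the action the policy actually chose.
        return reward + gamma * float(q[next_state, next_action])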

Reviewed Changes

Copilot reviewed 25 out of 27 changed files in this pull request and generated no comments.

Summary per file:
  • lab2/rl/tuner.py: New Tuner class implementing Optuna-based hyperparameter tuning for Q-Learning and SARSA.
  • lab2/rl/learner.py: Learner class and its concrete implementations for CartPole and LunarLander with Q-value update mechanisms.
  • lab2/rl.py: Main entry point integrating tuning and evaluation for the RL experiments.
  • lab2/resources/balance_new.py: A resource file introducing a QLearner variant using gym with render_mode, albeit with a placeholder discretise function.
  • lab2/resources/balance.py: Another QLearner variant with a similar structure to balance_new.py.
  • lab2/Pipfile: Dependency and version management ensuring consistent environments.
Files not reviewed (2)
  • lab2/.gitignore: Language not supported
  • lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
Comments suppressed due to low confidence (2)

lab2/rl/tuner.py:14

  • [nitpick] Consider renaming the parameter 'Learner' (and corresponding attribute) to 'learner_class' or similar to clearly indicate it expects a class type, thereby avoiding potential confusion with instance names.
def __init__(self, Learner: type[LearnerClass], environment: Literal["cart-pole", "lunar-lander"], ...)

lab2/resources/balance_new.py:3

  • [nitpick] The project generally uses 'gymnasium' in other modules; to maintain consistency, consider updating the import to 'gymnasium' (and adjust usage as needed) or unify the library choice across resource files.
import gym
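
Beyond the import itself, the "adjust usage as needed" part matters because gymnasium changed two core signatures; a sketch of the adjusted loop, assuming CartPole-v1:

    import gymnasium as gym

    env = gym.make("CartPole-v1", render_mode="human")

    # gymnasium's reset returns (observation, info) rather than observation
    # alone, and step splits 'done' into terminated/truncated.
    observation, info = env.reset(seed=42)
    done = False
    while not done:
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated
    env.close()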

@Szaroslav Szaroslav merged commit f7dd504 into main Apr 21, 2025
@Szaroslav Szaroslav deleted the lab2-reinforcement-learning-qlearning-and-sarsa branch April 21, 2025 18:29