Laboratory 2: Reinforcement learning — Q-Learning and SARSA 🦾 #2
Pull Request Overview
This PR introduces a hyperparameter tuner built on Optuna and updates the reinforcement learning learners for the Q-Learning and SARSA experiments. Key changes include:
- A new Tuner class in lab2/rl/tuner.py that automates hyperparameter optimization (sketched after this list).
- An updated Learner base class and its derivatives in lab2/rl/learner.py for both the CartPole and LunarLander environments.
- Enhancements to the command-line interface and evaluation routines in lab2/rl.py, along with supporting resource scripts and dependency updates.
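For orientation, the snippet below shows the usual Optuna pattern such a tuner builds on: an objective function that samples hyperparameters per trial, and a study that maximizes the returned score. The search ranges and the train_and_score stand-in are illustrative assumptions, not the PR's actual code.

```python
import optuna


def train_and_score(learning_rate: float, discount_factor: float, epsilon: float) -> float:
    # Hypothetical stand-in for training a Q-Learning/SARSA learner and
    # returning its mean evaluation reward.
    return -((learning_rate - 0.1) ** 2 + (discount_factor - 0.99) ** 2 + (epsilon - 0.05) ** 2)


def objective(trial: optuna.Trial) -> float:
    # Sample one candidate configuration per trial; ranges are assumptions.
    learning_rate = trial.suggest_float("learning_rate", 1e-4, 1.0, log=True)
    discount_factor = trial.suggest_float("discount_factor", 0.8, 0.999)
    epsilon = trial.suggest_float("epsilon", 0.01, 0.3)
    return train_and_score(learning_rate, discount_factor, epsilon)


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```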
Reviewed Changes
Copilot reviewed 25 out of 27 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lab2/rl/tuner.py | Introduces a new tuner for hyperparameter search using Optuna. |
| lab2/rl/learner.py | Updates the abstract learner and its implementations for CartPole and LunarLander tasks. |
| lab2/rl.py | Adds CLI commands, evaluation logic, and plotting for results visualization. |
| lab2/resources/balance_new.py | Provides an alternative experimental script for balancing, using a Q-Learning approach with render mode. |
| lab2/resources/balance.py | Contains a legacy QLearner implementation for CartPole. |
| lab2/Pipfile | Updates dependencies and configuration, including new versions of required packages. |
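As a rough illustration of the plotting step mentioned for lab2/rl.py, the sketch below assumes a per-attempt rewards DataFrame with the attempt and reward columns that appear later in the review comments; the data and the output filename are made up.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Column names match the DataFrame seen in the review comments; values are invented.
rewards = pd.DataFrame({"attempt": range(5), "reward": [9.0, 14.0, 35.0, 80.0, 120.0]})

fig, ax = plt.subplots()
ax.plot(rewards["attempt"], rewards["reward"])
ax.set_xlabel("attempt")
ax.set_ylabel("reward")
ax.set_title("Evaluation rewards per attempt")
fig.savefig("rewards.png")  # hypothetical output path
```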
Files not reviewed (2)
- lab2/.gitignore: Language not supported
- lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
Force-pushed from dc9025e to 56cf0cf.
Pull Request Overview
This pull request adds a reinforcement learning lab framework with hyperparameter tuning via Optuna and implementations of Q-Learning and SARSA. Key changes include:
- Introducing the Tuner class in lab2/rl/tuner.py to manage hyperparameter optimization.
- Adding the Learner base class and its derived implementations for CartPole and LunarLander in lab2/rl/learner.py (see the update-rule sketch after this list).
- Integrating tuning and evaluation workflows in lab2/rl.py and including resource scripts for the balance simulations.
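Since the derived learners differ chiefly in their temporal-difference update, the distinction is worth spelling out: Q-Learning is off-policy and bootstraps from the greedy next action, while SARSA is on-policy and bootstraps from the action actually taken. A minimal tabular sketch follows; the dict-based Q table and the function signatures are assumptions, not the PR's actual Learner API.

```python
from collections import defaultdict

# Tabular action-value function: maps (state, action) -> value.
Q = defaultdict(float)


def q_learning_update(state, action, reward, next_state, actions, alpha, gamma):
    # Off-policy: bootstrap from the best next action, regardless of the policy.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])


def sarsa_update(state, action, reward, next_state, next_action, alpha, gamma):
    # On-policy: bootstrap from the action the policy actually chose.
    Q[(state, action)] += alpha * (
        reward + gamma * Q[(next_state, next_action)] - Q[(state, action)]
    )
```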
Reviewed Changes
Copilot reviewed 25 out of 27 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lab2/rl/tuner.py | Added Tuner class with objective function and study setup |
| lab2/rl/learner.py | Introduced Learner base class and derived RL learners |
| lab2/rl.py | Added main entry point for tuning and evaluating |
| lab2/resources/balance_new.py | New resource script for balance simulation with render_mode |
| lab2/resources/balance.py | Alternate balance simulation resource using gym |
| lab2/Pipfile | Updated dependency file for the project |
Files not reviewed (2)
- lab2/.gitignore: Language not supported
- lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
Comments suppressed due to low confidence (3)
lab2/rl/tuner.py:14
- [nitpick] The parameter name 'Learner' is capitalized, which might be confused with a class name. Consider renaming it to 'learner_class' for clarity.
def __init__(self, Learner: type[LearnerClass], environment: Literal["cart-pole", "lunar-lander"], ...
lab2/rl.py:196
- [nitpick] The variable name 'df' is later reused to store the discount factor taken from the parameters, which may lead to confusion. Consider a more descriptive name (e.g., discount_factor) for the discount-factor variable (see the snippet after these comments).
df = pd.DataFrame(all_rewards, columns=["attempt", "reward"])
lab2/resources/balance_new.py:4
- [nitpick] There is an inconsistency in gym imports between 'gym' and the commented 'gymnasium' import. Aligning on a consistent gym API (or clarifying the intended use) would improve code clarity.
# import gymnasium as gym
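Concretely, the df collision flagged above can be resolved with a rename along these lines; the surrounding data and parameter dictionary are invented for illustration.

```python
import pandas as pd

all_rewards = [(0, 12.0), (1, 30.5)]      # illustrative data
params = {"discount_factor": 0.99}        # illustrative tuned parameters

# Keep the DataFrame under a descriptive name instead of the generic `df`...
rewards_df = pd.DataFrame(all_rewards, columns=["attempt", "reward"])
# ...so the discount factor can live in its own clearly named variable.
discount_factor = params["discount_factor"]
```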
Add generic interfaces for tuning and evaluation by refactoring the code. Add lunar lander environment.
Force-pushed from 56cf0cf to 96f1940.
Pull Request Overview
This PR introduces a reinforcement learning laboratory with two primary learning algorithms (Q-Learning and SARSA) and integrates hyperparameter optimization using Optuna. Key changes include:
- Implementation of a Tuner class for automatic hyperparameter tuning.
- Enhancements and additions to the learner modules (for CartPole and LunarLander) along with auxiliary resource files.
- A new main module coordinating tuning and evaluation, along with updates to the dependency configuration (a possible CLI shape is sketched below).
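One plausible shape for such a coordinating entry point, sketched with argparse; the tune and evaluate subcommands and their flags are inferred from the description above, not taken from the PR's actual CLI.

```python
import argparse


def tune(args: argparse.Namespace) -> None:
    print(f"tuning {args.algorithm} on {args.environment} for {args.trials} trials")


def evaluate(args: argparse.Namespace) -> None:
    print(f"evaluating {args.algorithm} on {args.environment}")


def main() -> None:
    parser = argparse.ArgumentParser(description="RL lab: tuning and evaluation")
    sub = parser.add_subparsers(dest="command", required=True)

    tune_cmd = sub.add_parser("tune", help="run an Optuna hyperparameter search")
    evaluate_cmd = sub.add_parser("evaluate", help="evaluate a trained learner")
    for cmd in (tune_cmd, evaluate_cmd):
        cmd.add_argument("--environment", choices=["cart-pole", "lunar-lander"],
                         default="cart-pole")
        cmd.add_argument("--algorithm", choices=["qlearning", "sarsa"],
                         default="qlearning")
    tune_cmd.add_argument("--trials", type=int, default=50)
    tune_cmd.set_defaults(func=tune)
    evaluate_cmd.set_defaults(func=evaluate)

    args = parser.parse_args()
    args.func(args)


if __name__ == "__main__":
    main()
```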
Reviewed Changes
Copilot reviewed 25 out of 27 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| lab2/rl/tuner.py | New Tuner class implementing Optuna-based hyperparameter tuning for Q-Learning and SARSA. |
| lab2/rl/learner.py | Learner base class and its concrete implementations for CartPole and LunarLander, with Q-value update mechanisms. |
| lab2/rl.py | Main entry point integrating tuning and evaluation for the RL experiments. |
| lab2/resources/balance_new.py | A resource file introducing a QLearner variant using gym with render_mode, albeit with a placeholder discretise function. |
| lab2/resources/balance.py | Another QLearner variant with a structure similar to balance_new.py. |
| lab2/Pipfile | Dependency and version management ensuring consistent environments. |
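The placeholder discretise noted for balance_new.py is the usual bridge between CartPole's continuous observations and a tabular Q-table. Below is a minimal sketch of one way to fill it in; the bounds and bin count are assumptions, and real code would want them tuned.

```python
import numpy as np

# Assumed per-dimension bounds for CartPole observations:
# cart position, cart velocity, pole angle, pole angular velocity.
BOUNDS = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.0, 3.0)]
BINS = 10


def discretise(observation: np.ndarray) -> tuple[int, ...]:
    """Map a continuous observation to a tuple of bin indices usable as a dict key."""
    indices = []
    for value, (low, high) in zip(observation, BOUNDS):
        clipped = min(max(float(value), low), high)
        indices.append(int((clipped - low) / (high - low) * (BINS - 1)))
    return tuple(indices)
```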
Files not reviewed (2)
- lab2/.gitignore: Language not supported
- lab2/results/cart_pole_qlearning_tuning.csv: Language not supported
Comments suppressed due to low confidence (2)
lab2/rl/tuner.py:14
- [nitpick] Consider renaming the parameter 'Learner' (and corresponding attribute) to 'learner_class' or similar to clearly indicate it expects a class type, thereby avoiding potential confusion with instance names.
def __init__(self, Learner: type[LearnerClass], environment: Literal["cart-pole", "lunar-lander"], ...)
lab2/resources/balance_new.py:3
- [nitpick] The project generally uses 'gymnasium' in other modules; to maintain consistency, consider updating the import to 'gymnasium' (adjusting usage as needed) or unifying the library choice across the resource files.
import gym
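On the import-consistency point, switching from gym to gymnasium involves slightly more than the import line: gymnasium's reset returns an (observation, info) pair and its step returns a five-tuple with separate terminated and truncated flags. A minimal usage sketch:

```python
import gymnasium as gym

env = gym.make("CartPole-v1", render_mode="human")
observation, info = env.reset(seed=42)

for _ in range(100):
    action = env.action_space.sample()  # random policy, purely for illustration
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()

env.close()
```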