Using Burlap RL library templates for a more modern experience with burlap

robododge/ml_burlap_templates

ML Java helper templates for use with Burlap

Implements templates for using the Java-based reinforcement learning algorithms provided in the BURLAP library from Brown University.

Setup and run

  • This project uses Java and Gradle. Make sure you have a recent Java JDK installed (JDK 15 or higher is recommended).
  • Install Gradle.
  • Run the Gradle build: ./gradlew build

Run the demos

  • ./gradlew helloGridWorld Opens the BURLAP GridWorld hello world explorer, keys:
    • A-West, D-East, W-Up, S-Down
  • ./gradlew blockDudeViewer Runs BURLAP's BlockDude, keys:
    • a - West, d - East, w - jump up
    • s - pickup, x - putdown
  • ./gradlew demoExperiment Runs the complete demo experiments in RunExperiments.java

Import into your IDE

  • IntelliJ - import as a new Gradle project and select the root directory of this project
  • Eclipse - (not tested)

Create and run your experiments.

A sample experiment is provided in RunExperiments.java. Edit this file to set up experiments of various sizes. Current examples:

  1. Set up the Large & Small GridWorld experiments
  2. Set up the Level1 & Level2 BlockDude experiments

Also, three MDP solver algorithms are provided:

  1. Value Iteration experiments (use the VISettings class to set hyperparameters)
  2. Policy Iteration experiments (use the PISettings class to set hyperparameters)
  3. Q-Learning experiments (use the QSettings class to set hyperparameters)
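
To give a feel for what such hyperparameters control, here is a minimal value iteration sketch in plain Java. It does not use BURLAP or this repository's VISettings class: the `Settings` holder, its field names (`gamma`, `maxDelta`, `maxIterations`), and the toy chain MDP are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not part of this repository or of BURLAP.
class ValueIterationSketch {

    /** Hypothetical settings holder (NOT the repository's VISettings). */
    static final class Settings {
        final double gamma;       // discount factor
        final double maxDelta;    // stop once the largest value change drops below this
        final int maxIterations;  // hard cap on the number of sweeps

        Settings(double gamma, double maxDelta, int maxIterations) {
            this.gamma = gamma;
            this.maxDelta = maxDelta;
            this.maxIterations = maxIterations;
        }
    }

    /**
     * Runs value iteration on a deterministic chain of n states.
     * Every move (left or right) has reward -1; the rightmost state is
     * terminal with value 0. Returns the delta recorded at each sweep.
     */
    static List<Double> run(int n, Settings s) {
        double[] v = new double[n];
        List<Double> deltas = new ArrayList<>();
        for (int iter = 0; iter < s.maxIterations; iter++) {
            double delta = 0.0;
            // In-place sweep over all non-terminal states.
            for (int state = 0; state < n - 1; state++) {
                int left = Math.max(0, state - 1);
                int right = state + 1;
                double best = Math.max(-1.0 + s.gamma * v[left],
                                       -1.0 + s.gamma * v[right]);
                delta = Math.max(delta, Math.abs(best - v[state]));
                v[state] = best;
            }
            deltas.add(delta);
            if (delta < s.maxDelta) {
                break; // converged
            }
        }
        return deltas;
    }

    public static void main(String[] args) {
        List<Double> deltas = run(5, new Settings(0.99, 0.001, 100));
        for (int i = 0; i < deltas.size(); i++) {
            System.out.println("iter=" + i + " delta=" + deltas.get(i));
        }
    }
}
```

A smaller `maxDelta` or a larger `maxIterations` lets the sweep run longer before stopping, which is the kind of trade-off a settings class typically exposes.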

To run your experiments, execute the main() method of the RunExperiments class from your IDE.

Experiment output

A CSV writer is attached to each experiment. The output filename of each experiment is controlled by a "shortName", which is configured as part of the settings for your experiment type (PISettings, VISettings or QSettings). This short name provides a filename prefix for each experiment run.

Example file output: output/smprob-24105858/blockdude/

Metrics captured: each experiment type can capture metrics collected during the iterations of the experiment. Here is a sample of the metrics collected:

  1. "iter" - iteration id
  2. "delta" - delta value found at each iteration
  3. "wallclock" - wall-clock time spent in each iteration, in milliseconds for VI/PI but in nanoseconds for QLearning
  4. "evals" - the number of VI evals done within a single policy step for PI
  5. "numSteps" - for QLearning, number of steps during last episode of learning
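
Because "wallclock" uses different units depending on the solver, it can help to normalize the values before comparing runs. The sketch below assumes the units are exactly as listed above; the `WallclockUnits` class and its `Solver` enum are hypothetical names and are not part of this repository.

```java
// Illustrative sketch only; class and enum names are hypothetical.
final class WallclockUnits {

    enum Solver { VI, PI, QLEARNING }

    /**
     * Converts a raw "wallclock" value to milliseconds.
     * Assumes VI/PI values are already in milliseconds and
     * QLearning values are in nanoseconds, as described above.
     */
    static double toMillis(Solver solver, double rawWallclock) {
        if (solver == Solver.QLEARNING) {
            return rawWallclock / 1_000_000.0; // nanoseconds -> milliseconds
        }
        return rawWallclock; // already milliseconds
    }

    public static void main(String[] args) {
        System.out.println(toMillis(Solver.VI, 12.0));
        System.out.println(toMillis(Solver.QLEARNING, 2_500_000.0));
    }
}
```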
