Pref-RL

Pref-RL provides ready-to-use preference-based reinforcement learning (PbRL) agents that are easily extensible.

We strive for:

  • Training of state-of-the-art PbRL agents on arbitrary environments in a few lines of code.
  • An easily extensible agent framework to quickly build your own custom agents on top.
  • A clean and well-maintained implementation (in Python).

Main features (planned)

Note: The project is still in an experimental development phase. The initial feature set is not yet complete, and no performance tests have been conducted.

General

  • Simple training of deep PbRL agents on arbitrary Gym environments
  • FNN and CNN reward models (implemented in PyTorch)
  • Synthetic preference data generation
  • Human preference data generation / collection (under development)
  • State-of-the-art RL algorithms (via Stable Baselines3)
  • TensorBoard support
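
To illustrate what synthetic preference data generation can look like, here is a minimal sketch. It assumes each trajectory segment is represented by its list of ground-truth step rewards and that the synthetic "teacher" simply prefers the segment with the higher return. The names `synthetic_preference` and `generate_preferences` are hypothetical and not part of Pref-RL's API.

```python
import random
from typing import List, Sequence, Tuple

# Assumption: a segment is the sequence of ground-truth rewards it collected.
Segment = Sequence[float]


def synthetic_preference(segment_a: Segment, segment_b: Segment) -> float:
    """Return 1.0 if segment_a is preferred, 0.0 if segment_b is, 0.5 on a tie."""
    return_a, return_b = sum(segment_a), sum(segment_b)
    if return_a > return_b:
        return 1.0
    if return_a < return_b:
        return 0.0
    return 0.5


def generate_preferences(
    segments: List[Segment], num_queries: int, seed: int = 0
) -> List[Tuple[int, int, float]]:
    """Sample random segment pairs and label each with the synthetic teacher."""
    rng = random.Random(seed)
    preferences = []
    for _ in range(num_queries):
        # Draw two distinct segment indices to form one query.
        i, j = rng.sample(range(len(segments)), 2)
        label = synthetic_preference(segments[i], segments[j])
        preferences.append((i, j, label))
    return preferences
```

Each resulting tuple `(i, j, label)` could then serve as one training example for a reward model.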

Build your own agents

  • Custom environments (OpenAI Gym compatible)
  • Custom reward models
  • Custom PbRL agents with almost no code
  • Easy integration of custom components

Code quality

  • High code coverage (> 90%)
  • PEP8 code style
  • Type hints
  • Learning performance benchmarked against state-of-the-art

Other features

  • Active, ensemble-based query selection
  • Advanced reward model pretraining with IRL, intrinsic motivation, ...
  • PEBBLE PbRL algorithm
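
One common way to realize ensemble-based query selection is to ask for feedback on the segment pairs the ensemble disagrees about most. The sketch below assumes each ensemble member is a callable that maps a segment to a predicted return, and that preference probabilities follow a Bradley-Terry model. The function names are hypothetical, and this is not necessarily how Pref-RL implements the feature.

```python
import math
from statistics import pvariance
from typing import Callable, List, Sequence, Tuple

Segment = Sequence[float]
# Assumption: a reward model maps a segment to a scalar predicted return.
RewardModel = Callable[[Segment], float]
Pair = Tuple[Segment, Segment]


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def preference_probability(model: RewardModel, seg_a: Segment, seg_b: Segment) -> float:
    """Bradley-Terry probability that seg_a is preferred over seg_b."""
    return _sigmoid(model(seg_a) - model(seg_b))


def select_queries(
    ensemble: List[RewardModel], candidate_pairs: List[Pair], num_queries: int
) -> List[Pair]:
    """Return the pairs with the highest variance in predicted preference."""
    scored = []
    for pair in candidate_pairs:
        probs = [preference_probability(model, *pair) for model in ensemble]
        scored.append((pvariance(probs), pair))
    # Highest disagreement first; sort on the score only.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:num_queries]]
```

Pairs on which all ensemble members agree get a variance of zero and are queried last.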

Installation

These instructions presume a *nix or OS X operating system.

Prerequisites

This framework requires Python 3.6+ and pip.

Install pip with these installation instructions.

Install using pip

Install the requirements using pip:

pip install -r requirements.txt

Windows

On Windows, you may encounter issues running OpenAI Gym Atari environments. This Stack Overflow answer could help.

Example

See teach.py for an example of how to instantiate and run an agent. When you're at the project's root, use the following to run a default agent from the command line:

python teach.py --env_id "CartPole-v1" --reward_model "Mlp" --num_rl_timesteps 200000 --num_pretraining_preferences 100
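
For orientation, the flags in the command above could be parsed with `argparse` roughly as follows. This is a hypothetical sketch, not the actual contents of teach.py: the flag names come from the command above, while the types, defaults, and help texts are assumptions.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command-line flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Hypothetical Pref-RL CLI sketch")
    # Defaults below mirror the example command and are assumptions.
    parser.add_argument("--env_id", type=str, default="CartPole-v1",
                        help="ID of the Gym environment to train on")
    parser.add_argument("--reward_model", type=str, default="Mlp",
                        help="Name of the reward model architecture")
    parser.add_argument("--num_rl_timesteps", type=int, default=200000,
                        help="Number of RL training timesteps")
    parser.add_argument("--num_pretraining_preferences", type=int, default=100,
                        help="Number of preferences collected before RL training")
    return parser.parse_args(argv)
```

Calling `parse_args()` without arguments reads from `sys.argv`, so the function works the same way from the command line and from tests that pass an explicit list.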

You can monitor agent training with TensorBoard. Start it with:

tensorboard --logdir=runs

View the output by navigating to http://localhost:6006.

Testing the implementation

All unit tests in the framework can be run using pytest. It is listed in the project's requirements, so it is already installed if you followed the installation steps above. Run the tests with:

pytest ./tests/