feat(robot_warehouse): full environment #140

Merged
merged 63 commits into from
Jun 1, 2023
Commits
9884f36
feat: initial env code
arnupretorius May 16, 2023
51aaf16
chore: update imports
arnupretorius May 18, 2023
07f0101
feat: random agent
arnupretorius May 19, 2023
b7a19f8
feat: a2c network
arnupretorius May 19, 2023
3aa8d91
fix: formatting
arnupretorius May 19, 2023
75fa605
fix: type checker
arnupretorius May 19, 2023
174f831
fix: spec test
arnupretorius May 19, 2023
8cd6c2c
feat: docs
arnupretorius May 19, 2023
c1efe73
fix: small typo
arnupretorius May 19, 2023
6c6278a
Merge branch 'main' into 136-rware-env
sash-a May 24, 2023
3bd904e
Merge branch 'main' into 136-rware-env
sash-a May 24, 2023
4014a7d
feat: remove truncation in favour of termination
sash-a May 26, 2023
d52a14f
feat: more layers in network
sash-a May 26, 2023
f590351
feat: transformers for rware
sash-a May 26, 2023
0471380
feat: sum embeddings in actor + larger final layer
sash-a May 29, 2023
c909ddb
fix: critic net
sash-a May 29, 2023
27d4e62
Update docs/environments/rware.md
arnupretorius May 29, 2023
8341506
Update jumanji/environments/routing/rware/env_test.py
arnupretorius May 29, 2023
63d9265
Update jumanji/environments/routing/rware/env_test.py
arnupretorius May 29, 2023
a409a4e
Update jumanji/training/configs/config.yaml
arnupretorius May 29, 2023
632a59d
Update jumanji/environments/routing/rware/utils.py
arnupretorius May 29, 2023
1510066
Update jumanji/environments/routing/rware/env_test.py
arnupretorius May 29, 2023
a29ab83
Update jumanji/environments/routing/rware/generator.py
arnupretorius May 29, 2023
20e388f
Update jumanji/environments/routing/rware/utils.py
arnupretorius May 29, 2023
605cfa5
Update jumanji/environments/routing/rware/utils.py
arnupretorius May 29, 2023
e915991
Update jumanji/training/configs/config.yaml
arnupretorius May 29, 2023
f0dec4a
Update jumanji/environments/routing/rware/utils.py
arnupretorius May 29, 2023
98f5cdb
Update jumanji/environments/routing/rware/generator.py
arnupretorius May 29, 2023
468c754
Update jumanji/environments/routing/rware/utils.py
arnupretorius May 29, 2023
90f659b
Update jumanji/environments/routing/rware/utils.py
arnupretorius May 29, 2023
5d38982
Merge branch 'main' into 136-rware-env
arnupretorius May 29, 2023
02aa09c
fix: pre commit checks
arnupretorius May 29, 2023
c7dde6c
fix: test for termination/truncation issue
arnupretorius May 29, 2023
5e6fb2d
fix: remove unused policy and value layers
sash-a May 30, 2023
1071ac7
fix: remove policy/value layers from setup_train and smaller lr
sash-a May 30, 2023
90c4095
feat: new default generator
sash-a May 30, 2023
243e058
chore: renames rware to robotic warehouse
arnupretorius May 30, 2023
924a60d
fix: update docs with new name
arnupretorius May 30, 2023
8187d7d
chore: refactor generator
arnupretorius May 30, 2023
201c694
chore: remove unnecessary key split
arnupretorius May 30, 2023
1966b58
feat: constants file
arnupretorius May 30, 2023
669c7e7
fix: linters
arnupretorius May 30, 2023
bd2f01e
refactor: split utils into separate files
arnupretorius May 30, 2023
abaef2b
fix: mypy
arnupretorius May 30, 2023
f85b305
Update jumanji/environments/routing/robot_warehouse/env.py
arnupretorius Jun 1, 2023
d449cb4
Update jumanji/environments/routing/robot_warehouse/env.py
arnupretorius Jun 1, 2023
8a3cd53
Update jumanji/environments/routing/robot_warehouse/env.py
arnupretorius Jun 1, 2023
02ea2b2
Update jumanji/__init__.py
arnupretorius Jun 1, 2023
0d378f1
Update docs/environments/robot_warehouse.md
arnupretorius Jun 1, 2023
c7502f2
Update jumanji/environments/routing/robot_warehouse/env.py
arnupretorius Jun 1, 2023
8688bb0
Update jumanji/environments/routing/robot_warehouse/generator.py
arnupretorius Jun 1, 2023
113a3f3
Update jumanji/environments/routing/robot_warehouse/generator.py
arnupretorius Jun 1, 2023
a27cc9f
Update jumanji/environments/routing/robot_warehouse/generator.py
arnupretorius Jun 1, 2023
399d204
Update jumanji/environments/routing/robot_warehouse/env.py
arnupretorius Jun 1, 2023
9811268
Update jumanji/environments/routing/robot_warehouse/env.py
arnupretorius Jun 1, 2023
5ae26e7
Update jumanji/environments/routing/robot_warehouse/env.py
arnupretorius Jun 1, 2023
2396f70
Update jumanji/environments/routing/robot_warehouse/env.py
arnupretorius Jun 1, 2023
55a7980
Update jumanji/environments/routing/robot_warehouse/generator.py
arnupretorius Jun 1, 2023
220deeb
fix: syntax
arnupretorius Jun 1, 2023
97d243e
Update jumanji/training/networks/robot_warehouse/actor_critic.py
arnupretorius Jun 1, 2023
7632aab
Update jumanji/training/networks/robot_warehouse/actor_critic.py
arnupretorius Jun 1, 2023
796c28a
fix: add abstract properties to generator
arnupretorius Jun 1, 2023
70175be
Merge branch 'main' into 136-rware-env
clement-bonnet Jun 1, 2023
16 changes: 6 additions & 10 deletions README.md
@@ -19,7 +19,6 @@
| [**Docs**](https://instadeepai.github.io/jumanji)
---


<p float="left" align="center">
<img src="docs/env_anim/connector.gif" alt="Connector" width="30%" />
<img src="docs/env_anim/snake.gif" alt="Snake" width="30%" />
@@ -32,8 +31,6 @@
<img src="docs/env_anim/minesweeper.gif" alt="Minesweeper" width="30%" />
</p>



## Welcome to the Jungle! 🌴

Jumanji is a suite of diverse and challenging reinforcement learning (RL) environments written in
@@ -70,7 +67,6 @@ JAX-based environments.
- 🏎️ **Training:** example agents that can be used as inspiration for the agents one may implement
in their research.


## Environments 🌍

Jumanji provides a diverse range of environments ranging from simple games to NP-hard combinatorial
@@ -88,20 +84,24 @@ problems.
| :link: Connector | Routing | `Connector-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/connector/) | [doc](https://instadeepai.github.io/jumanji/environments/connector/) |
| 🚚 CVRP (Capacitated Vehicle Routing Problem) | Routing | `CVRP-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/cvrp/) | [doc](https://instadeepai.github.io/jumanji/environments/cvrp/) |
| :mag: Maze | Routing | `Maze-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/maze/) | [doc](https://instadeepai.github.io/jumanji/environments/maze/) |
| :robot: RobotWarehouse | Routing | `RobotWarehouse-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/robot_warehouse/) | [doc](https://instadeepai.github.io/jumanji/environments/robot_warehouse/) |
| 🐍 Snake | Routing | `Snake-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/snake/) | [doc](https://instadeepai.github.io/jumanji/environments/snake/) |
| 📬 TSP (Travelling Salesman Problem) | Routing | `TSP-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/tsp/) | [doc](https://instadeepai.github.io/jumanji/environments/tsp/) |


## Installation 🎬

You can install the latest release of Jumanji from PyPI:

```bash
pip install jumanji
```

Alternatively, you can install the latest development version directly from GitHub:

```bash
pip install git+https://github.com/instadeepai/jumanji.git
```

Jumanji has been tested on Python 3.8 and 3.9.
Note that because the installation of JAX differs depending on your hardware accelerator,
we advise users to explicitly install the correct JAX version (see the
@@ -113,7 +113,6 @@ you will need a GUI backend. For example, on Linux, you can install Tk via:
[Matplotlib backends](https://matplotlib.org/stable/users/explain/backends.html) for a list of
backends you can use.


## Quickstart ⚡

RL practitioners will find Jumanji's interface familiar as it combines the widely adopted
@@ -170,7 +169,6 @@ the version number is incremented by one to prevent potential confusion.
For a full list of registered versions of each environment, check out
[the documentation](https://instadeepai.github.io/jumanji/environments/tsp/).


## Training 🏎️

To showcase how to train RL agents on Jumanji environments, we provide a random agent and a vanilla
@@ -191,18 +189,17 @@ actor-critic networks in
For more information on how to use the example agents, see the
[training guide](https://instadeepai.github.io/jumanji/guides/training/).


## Contributing 🤝

Contributions are welcome! See our issue tracker for
[good first issues](https://github.com/instadeepai/jumanji/labels/good%20first%20issue). Please read
our [contributing guidelines](https://github.com/instadeepai/jumanji/blob/main/CONTRIBUTING.md) for
details on how to submit pull requests, our Contributor License Agreement, and community guidelines.


## Citing Jumanji ✏️

If you use Jumanji in your work, please cite the library using:

```
@software{jumanji2023github,
author = {Clément Bonnet and Daniel Luo and Donal Byrne and Sasha Abramowitz
@@ -216,7 +213,6 @@ If you use Jumanji in your work, please cite the library using:
}
```


## See Also 🔎

Other works have embraced the approach of writing RL environments in JAX.
8 changes: 8 additions & 0 deletions docs/api/environments/rware.md
@@ -0,0 +1,8 @@
::: jumanji.environments.routing.robot_warehouse.env.RobotWarehouse
    selection:
      members:
        - __init__
        - reset
        - step
        - observation_spec
        - action_spec
Binary file added docs/env_anim/rware.gif
Binary file added docs/env_img/rware.png
46 changes: 46 additions & 0 deletions docs/environments/robot_warehouse.md
@@ -0,0 +1,46 @@
# RobotWarehouse Environment

<p align="center">
<img src="../env_anim/robot_warehouse.gif" width="600"/>
</p>

We provide a JAX jit-able implementation of the [Robotic Warehouse](https://github.com/semitable/robotic-warehouse/tree/master)
environment.

The Robot Warehouse (RWARE) environment simulates a warehouse in which robots move around and deliver requested goods. The simulator is inspired by real-world applications in which robots pick up shelves and deliver them to a workstation. Humans access the contents of a shelf, after which a robot can return it to an empty shelf location.

The goal is to successfully deliver as many requested shelves as possible within a given time budget.

Once a shelf has been delivered, a new shelf is requested at random. Agents start each episode at random locations within the warehouse.

## Observation

The **observation** seen by the agent is a `NamedTuple` containing the following:

- `agents_view`: jax array (int32) of shape `(num_agents, num_obs_features)`, array representing each agent's
    view of other agents and shelves.

- `action_mask`: jax array (bool) of shape `(num_agents, 5)`, array specifying, for each agent,
    which actions (noop, forward, left, right, toggle_load) are legal.

- `step_count`: jax array (int32) of shape `()`, number of steps elapsed in the current episode.
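
The `action_mask` entry can be read one row per agent. As a minimal sketch in plain Python (not the environment's JAX code), assuming a hypothetical helper `legal_actions` and made-up mask values, the legal action names for each agent could be recovered like this:

```python
from typing import List, Sequence

# Action order as listed above: index i of a mask row refers to ACTION_NAMES[i].
ACTION_NAMES = ["noop", "forward", "left", "right", "toggle_load"]


def legal_actions(action_mask: Sequence[Sequence[bool]]) -> List[List[str]]:
    """Return, for each agent, the names of the actions its mask row allows.

    `legal_actions` is a hypothetical helper for illustration only.
    """
    return [
        [name for name, allowed in zip(ACTION_NAMES, row) if allowed]
        for row in action_mask
    ]


# Made-up mask for 2 agents: agent 0 may not move forward, agent 1 may do anything.
mask = [
    [True, False, True, True, True],
    [True, True, True, True, True],
]
print(legal_actions(mask))
```

In the real environment the mask is a JAX array, so the same idea would typically be expressed with array operations rather than Python loops.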

## Action

The action space is a `MultiDiscreteArray` containing an integer value in `[0, 1, 2, 3, 4]` for each
agent. Each agent can take one of five actions: noop (`0`), forward (`1`), turn left (`2`), turn right (`3`), or toggle_load (`4`).
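
To make the integer encoding concrete, here is a small sketch that names the five action values and decodes a joint action (one integer per agent). The `Action` enum and `decode_joint_action` are hypothetical names used for illustration; only the integer-to-action mapping comes from the description above.

```python
from enum import IntEnum
from typing import List, Sequence


class Action(IntEnum):
    """Integer codes for the five per-agent actions (names are illustrative)."""

    NOOP = 0
    FORWARD = 1
    LEFT = 2
    RIGHT = 3
    TOGGLE_LOAD = 4


def decode_joint_action(joint_action: Sequence[int]) -> List[Action]:
    """Map one integer per agent to its Action; raises ValueError outside 0..4."""
    return [Action(value) for value in joint_action]


# Made-up joint action for 4 agents.
joint = [1, 4, 0, 3]
print([action.name for action in decode_joint_action(joint)])
```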

The episode terminates under the following conditions:

- An invalid action is taken, or

- An agent collides with another agent.
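
The two termination conditions can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the environment's implementation: it assumes that an "invalid action" is one whose entry in the action mask is `False`, and that a "collision" means two agents occupying the same cell. All function names are hypothetical.

```python
from typing import Sequence, Tuple


def has_collision(positions: Sequence[Tuple[int, int]]) -> bool:
    """True if at least two agents occupy the same (x, y) cell (assumed meaning)."""
    return len(set(positions)) < len(positions)


def has_invalid_action(
    actions: Sequence[int], action_mask: Sequence[Sequence[bool]]
) -> bool:
    """True if any agent chose an action that its mask row marks as illegal."""
    return any(not row[action] for action, row in zip(actions, action_mask))


def is_terminal(
    positions: Sequence[Tuple[int, int]],
    actions: Sequence[int],
    action_mask: Sequence[Sequence[bool]],
) -> bool:
    """Episode ends on an invalid action or on a collision between agents."""
    return has_invalid_action(actions, action_mask) or has_collision(positions)


# Made-up example: both agents act legally but end up in the same cell.
all_legal = [[True] * 5, [True] * 5]
print(is_terminal([(3, 4), (3, 4)], [1, 1], all_legal))
```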

## Reward

The reward is global and shared among the agents. It is equal to the number of shelves which were
delivered successfully during the time step (i.e., +1 for each shelf).
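
As a minimal sketch of this rule, assuming a hypothetical per-step list of booleans marking which requested shelves were delivered, the shared reward and the resulting episode return could be computed as:

```python
from typing import Sequence


def step_reward(delivered: Sequence[bool]) -> float:
    """Shared team reward for one time step: +1 for each shelf delivered."""
    return float(sum(delivered))


def episode_return(deliveries_per_step: Sequence[Sequence[bool]]) -> float:
    """Sum of the shared step rewards over an episode."""
    return sum(step_reward(step) for step in deliveries_per_step)


# Made-up episode of 3 steps with a request queue of 4 shelves.
episode = [
    [False, False, False, False],  # nothing delivered
    [True, False, True, False],    # two shelves delivered in the same step
    [False, True, False, False],   # one shelf delivered
]
print(episode_return(episode))
```

Because the reward is shared, every agent would receive the same value regardless of which agent made the delivery.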

## Registered Versions 📖

- `RobotWarehouse-v0`, a warehouse with 4 agents, each with a sensor range of 1, on a warehouse floor with 2 shelf rows, 3 shelf columns, a column height of 8, and a shelf request queue of size 8.
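
The layout parameters determine the size of the warehouse grid. The sketch below assumes the layout rule of the original Robotic Warehouse implementation: each shelf row takes `column_height` cells plus one corridor row, with two extra rows overall, and each shelf column is two cells wide plus one corridor column, with one extra column overall. This rule is an assumption rather than something stated here, although it agrees with the 5 x 10 grid in the test fixture of this pull request (1 shelf row, 3 shelf columns, column height 2). The function name `grid_shape` is hypothetical.

```python
from typing import Tuple


def grid_shape(shelf_rows: int, shelf_columns: int, column_height: int) -> Tuple[int, int]:
    """Return (rows, columns) of the warehouse grid under the assumed layout rule."""
    rows = (column_height + 1) * shelf_rows + 2
    columns = (2 + 1) * shelf_columns + 1
    return rows, columns


# Parameters of the small test fixture and of RobotWarehouse-v0.
print(grid_shape(shelf_rows=1, shelf_columns=3, column_height=2))
print(grid_shape(shelf_rows=2, shelf_columns=3, column_height=8))
```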
4 changes: 4 additions & 0 deletions jumanji/__init__.py
@@ -77,6 +77,10 @@
# Maze with 10 rows and 10 columns, a time limit of 100 and a random maze generator.
register(id="Maze-v0", entry_point="jumanji.environments:Maze")

# RobotWarehouse with a random generator with 2 shelf rows, 3 shelf columns, a column height of 8,
# 4 agents, a sensor range of 1, and a request queue of size 8.
register(id="RobotWarehouse-v0", entry_point="jumanji.environments:RobotWarehouse")

# Snake game on a board of size 12x12 with a time limit of 4000.
register(id="Snake-v1", entry_point="jumanji.environments:Snake")

11 changes: 10 additions & 1 deletion jumanji/environments/__init__.py
@@ -22,11 +22,20 @@
from jumanji.environments.packing.bin_pack.env import BinPack
from jumanji.environments.packing.job_shop.env import JobShop
from jumanji.environments.packing.knapsack.env import Knapsack
from jumanji.environments.routing import cleaner, connector, cvrp, maze, snake, tsp
from jumanji.environments.routing import (
    cleaner,
    connector,
    cvrp,
    maze,
    robot_warehouse,
    snake,
    tsp,
)
from jumanji.environments.routing.cleaner.env import Cleaner
from jumanji.environments.routing.connector.env import Connector
from jumanji.environments.routing.cvrp.env import CVRP
from jumanji.environments.routing.maze.env import Maze
from jumanji.environments.routing.robot_warehouse.env import RobotWarehouse
from jumanji.environments.routing.snake.env import Snake
from jumanji.environments.routing.tsp.env import TSP

16 changes: 16 additions & 0 deletions jumanji/environments/routing/robot_warehouse/__init__.py
@@ -0,0 +1,16 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from jumanji.environments.routing.robot_warehouse.env import RobotWarehouse
from jumanji.environments.routing.robot_warehouse.types import Observation, State
99 changes: 99 additions & 0 deletions jumanji/environments/routing/robot_warehouse/conftest.py
@@ -0,0 +1,99 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from typing import Tuple

import jax
import jax.numpy as jnp
import pytest

from jumanji.environments.routing.robot_warehouse import RobotWarehouse
from jumanji.environments.routing.robot_warehouse.generator import RandomGenerator
from jumanji.environments.routing.robot_warehouse.types import (
    Agent,
    Position,
    Shelf,
    State,
)
from jumanji.types import TimeStep


@pytest.fixture(scope="module")
def robot_warehouse_env() -> RobotWarehouse:
    """Instantiates a default RobotWarehouse environment with 2 agents, 1 shelf row, 3 shelf columns,
    a column height of 2, sensor range of 1 and a request queue size of 4."""
    generator = RandomGenerator(
        shelf_rows=1,
        shelf_columns=3,
        column_height=2,
        num_agents=2,
        sensor_range=1,
        request_queue_size=4,
    )

    env = RobotWarehouse(
        generator=generator,
        time_limit=5,
    )
    return env


@pytest.fixture
def deterministic_robot_warehouse_env(
    robot_warehouse_env: RobotWarehouse,
) -> Tuple[RobotWarehouse, State, TimeStep]:
    """Instantiates a RobotWarehouse environment with 2 agents and 8 shelves
    with a step limit of 5."""
    state, timestep = robot_warehouse_env.reset(jax.random.PRNGKey(42))

    # create agents, shelves and grid
    def make_agent(x: int, y: int, direction: int, is_carrying: int) -> Agent:
        return Agent(Position(x=x, y=y), direction=direction, is_carrying=is_carrying)

    def make_shelf(x: int, y: int, is_requested: int) -> Shelf:
        return Shelf(Position(x=x, y=y), is_requested=is_requested)

    # agent information
    xs = jnp.array([3, 1])
    ys = jnp.array([4, 7])
    dirs = jnp.array([2, 3])
    carries = jnp.array([0, 0])
    state.agents = jax.vmap(make_agent)(xs, ys, dirs, carries)

    # shelf information
    xs = jnp.array([1, 1, 1, 1, 2, 2, 2, 2])
    ys = jnp.array([1, 2, 7, 8, 1, 2, 7, 8])
    requested = jnp.array([0, 1, 1, 0, 0, 0, 1, 1])
    state.shelves = jax.vmap(make_shelf)(xs, ys, requested)

    # create grid
    state.grid = jnp.array(
        [
            [
                [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                [0, 1, 2, 0, 0, 0, 0, 3, 4, 0],
                [0, 5, 6, 0, 0, 0, 0, 7, 8, 0],
                [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
            ],
            [
                [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0, 0, 2, 0, 0],
                [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
            ],
        ]
    )
    return robot_warehouse_env, state, timestep
37 changes: 37 additions & 0 deletions jumanji/environments/routing/robot_warehouse/constants.py
@@ -0,0 +1,37 @@
# Copyright 2022 InstaDeep Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import jax.numpy as jnp

from jumanji.environments.routing.robot_warehouse.types import Direction

# grid channels
_SHELVES = 0
_AGENTS = 1

# agent directions
_POSSIBLE_DIRECTIONS = jnp.array([d.value for d in Direction])

# viewer constants
_FIGURE_SIZE = (5, 5)
_SHELF_PADDING = 2

# colors
_GRID_COLOR = (0, 0, 0) # black
_SHELF_COLOR = (72 / 255.0, 61 / 255.0, 139 / 255.0) # dark slate blue
_SHELF_REQ_COLOR = (0, 128 / 255.0, 128 / 255.0) # teal
_AGENT_COLOR = (1, 140 / 255.0, 0) # dark orange
_AGENT_LOADED_COLOR = (1, 0, 0) # red
_AGENT_DIR_COLOR = (0, 0, 0) # black
_GOAL_COLOR = (60 / 255.0, 60 / 255.0, 60 / 255.0)