Merge branch 'main' into feat/add-sudoku-environment
Showing 45 changed files with 4,825 additions and 12 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
::: jumanji.environments.logic.graph_coloring.env.GraphColoring | ||
selection: | ||
members: | ||
- __init__ | ||
- reset | ||
- step | ||
- observation_spec | ||
- action_spec |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
::: jumanji.environments.routing.robot_warehouse.env.RobotWarehouse | ||
selection: | ||
members: | ||
- __init__ | ||
- reset | ||
- step | ||
- observation_spec | ||
- action_spec |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
# Graph Coloring Environment | ||
|
||
<p align="center"> | ||
<img src="../env_img/graph_coloring.png" width="500"/> | ||
</p> | ||
|
||
We provide here a JAX JIT-able implementation of the Graph Coloring environment. | ||
|
||
Graph coloring is a combinatorial optimization problem where the objective is to assign a color to each vertex of a graph in such a way that no two adjacent vertices share the same color. The problem is usually formulated as minimizing the number of colors used. The `GraphColoring` environment is an episodic, single-agent setting that allows for the exploration of graph coloring algorithms and reinforcement learning methods. | ||
|
||
## Observation | ||
|
||
The observation in the `GraphColoring` environment includes information about the graph, the colors assigned to the vertices, the action mask, and the current node index. | ||
|
||
- `graph`: a JAX array (bool) of shape `(num_nodes, num_nodes)`, representing the adjacency matrix of the graph. | ||
- For example, the adjacency matrix of a graph with 4 nodes: | ||
|
||
```[[False, True, False, True], | ||
[ True, False, True, False], | ||
[False, True, False, True], | ||
[ True, False, True, False]]``` | ||
|
||
- `colors`: a JAX array (int32) of shape `(num_nodes,)`, representing the current color assignments for the vertices. Initially, all elements are set to -1, indicating that no colors have been assigned yet. | ||
- For example, an initial color assignment: | ||
```[-1, -1, -1, -1]``` | ||
|
||
- `action_mask`: a JAX array of boolean values, shaped `(num_colors,)`, which indicates the valid actions in the current state of the environment. Each position in the array corresponds to a color. True at a position signifies that the corresponding color can be used to color a node, while False indicates the opposite. | ||
- For example, with 4 colors available: | ||
```[True, False, True, False]``` | ||
|
||
- `current_node_index`: an integer representing the current node being colored. | ||
- For example, an initial current_node_index might be 0. | ||
|
||
## Action | ||
|
||
The action space is a `DiscreteArray` of integer values in `[0, 1, ..., num_colors - 1]`. Each action corresponds to assigning a color to the current node. | ||
|
||
## Reward | ||
|
||
The reward in the `GraphColoring` environment is given as follows: | ||
|
||
- `sparse reward`: a reward is provided at the end of the episode and equals the negative of the number of unique colors used to color all vertices in the graph. | ||
|
||
The agent's goal is to find a valid coloring using as few colors as possible while avoiding conflicts with adjacent nodes. | ||
|
||
## Episode Termination | ||
|
||
An episode in the graph coloring environment can terminate under two conditions: | ||
|
||
1. All nodes have been assigned a color: the environment iteratively assigns colors to nodes. When all nodes have a color assigned (i.e., there are no nodes with a color value of -1), the episode ends. This is the natural termination condition and ideally the one we'd like the agent to achieve. | ||
|
||
2. An invalid action is taken: an action is invalid if it assigns the current node a color that is not in that node's allowed color set at that time. The allowed color set for each node is updated after every action. If an invalid action is attempted, the episode terminates immediately and the agent receives a large negative reward. This penalty discourages the agent from selecting invalid actions. | ||
|
||
## Registered Versions 📖 | ||
|
||
- `GraphColoring-v0`: The default settings for the `GraphColoring` problem with a configurable number of nodes and edge_probability. The default number of nodes is 20, and the default edge probability is 0.8. |
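The action mask and sparse reward described in the Graph Coloring document above can be sketched in plain Python. This is an illustration only, not the environment's code: the helper names `valid_color_mask` and `sparse_reward` are hypothetical, the sketch assumes a color is allowed exactly when no adjacent node already uses it, and it assumes a reward of 0 before the final step.

```python
from typing import List

UNCOLORED = -1  # the document's marker for a node with no color assigned yet


def valid_color_mask(
    adjacency: List[List[bool]], colors: List[int], node: int, num_colors: int
) -> List[bool]:
    """Hypothetical helper: True where a color is unused by every neighbor of `node`."""
    used_by_neighbors = {
        colors[other]
        for other, connected in enumerate(adjacency[node])
        if connected and colors[other] != UNCOLORED
    }
    return [color not in used_by_neighbors for color in range(num_colors)]


def sparse_reward(colors: List[int]) -> int:
    """Hypothetical helper: minus the number of distinct colors once all nodes are colored.

    Returning 0 while nodes are still uncolored is an assumption of this sketch.
    """
    if UNCOLORED in colors:
        return 0
    return -len(set(colors))


# The 4-node adjacency matrix used as the example observation in the document.
adjacency = [
    [False, True, False, True],
    [True, False, True, False],
    [False, True, False, True],
    [True, False, True, False],
]

# Node 0 already has color 0, so color 0 is masked out for its neighbor, node 1.
mask = valid_color_mask(adjacency, [0, UNCOLORED, UNCOLORED, UNCOLORED], node=1, num_colors=4)
print(mask)

# A complete coloring that alternates two colors around the cycle.
print(sparse_reward([0, 1, 0, 1]))
```

The real environment operates on JAX arrays so that it can be JIT-compiled; the list-based version here only makes the masking rule easy to follow.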
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# RobotWarehouse Environment | ||
|
||
<p align="center"> | ||
<img src="../env_anim/robot_warehouse.gif" width="600"/> | ||
</p> | ||
|
||
We provide a JAX jit-able implementation of the [Robotic Warehouse](https://github.com/semitable/robotic-warehouse/tree/master) | ||
environment. | ||
|
||
The Robot Warehouse (RWARE) environment simulates a warehouse with robots moving and delivering requested goods. The simulator is inspired by real-world applications in which robots pick up shelves and deliver them to a workstation. Humans access the content of a shelf, and the robots can then return it to an empty shelf location. | ||
|
||
The goal is to successfully deliver as many requested shelves as possible within a given time budget. | ||
|
||
Once a shelf has been delivered, a new shelf is requested at random. Agents start each episode at random locations within the warehouse. | ||
|
||
## Observation | ||
|
||
The **observation** seen by the agent is a `NamedTuple` containing the following: | ||
|
||
- `agents_view`: JAX array (int32) of shape `(num_agents, num_obs_features)`, representing each agent's view of other agents | ||
and shelves. | ||
|
||
- `action_mask`: JAX array (bool) of shape `(num_agents, 5)`, specifying, for each agent, | ||
which actions (noop, forward, left, right, toggle_load) are legal. | ||
|
||
- `step_count`: JAX array (int32) of shape `()`, the number of steps elapsed in the current episode. | ||
|
||
## Action | ||
|
||
The action space is a `MultiDiscreteArray` containing an integer value in `[0, 1, 2, 3, 4]` for each | ||
agent. Each agent can take one of five actions: noop (`0`), forward (`1`), turn left (`2`), turn right (`3`), or toggle_load (`4`). | ||
|
||
The episode terminates under the following conditions: | ||
|
||
- An invalid action is taken, or | ||
|
||
- An agent collides with another agent. | ||
|
||
## Reward | ||
|
||
The reward is global and shared among the agents. It is equal to the number of shelves that were | ||
delivered successfully during the time step (i.e., +1 for each shelf). | ||
|
||
## Registered Versions 📖 | ||
|
||
- `RobotWarehouse-v0`, a warehouse with 4 agents each with a sensor range of 1, a warehouse floor with 2 shelf rows, 3 shelf columns, a column height of 8, and a shelf request queue of 8. |
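The per-agent action mask described in the RobotWarehouse document above can be read as follows. This is a minimal sketch in plain Python rather than the environment's own code: the helper `legal_action_names` is hypothetical and the mask values are made up for illustration. Only the action ordering (noop `0`, forward `1`, left `2`, right `3`, toggle_load `4`) comes from the document.

```python
from typing import List, Sequence

# Action indices as listed in the document: 0..4 in this order.
ACTION_NAMES = ("noop", "forward", "left", "right", "toggle_load")


def legal_action_names(action_mask: Sequence[Sequence[bool]]) -> List[List[str]]:
    """Hypothetical helper: list the names of the legal actions for each agent.

    `action_mask` has one row per agent and one boolean per action,
    mirroring the `(num_agents, 5)` shape given in the document.
    """
    return [
        [name for name, legal in zip(ACTION_NAMES, row) if legal]
        for row in action_mask
    ]


# Made-up mask for two agents: the first cannot toggle its load,
# the second cannot move forward.
example_mask = [
    [True, True, True, True, False],
    [True, False, True, True, True],
]

for agent_id, names in enumerate(legal_action_names(example_mask)):
    print(agent_id, names)
```

Choosing each agent's action only from the entries that are `True` in its row avoids the invalid actions that the document lists as a termination condition.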
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
# Copyright 2022 InstaDeep Ltd. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
from jumanji.environments.logic.graph_coloring.env import GraphColoring | ||
from jumanji.environments.logic.graph_coloring.types import Observation, State |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Copyright 2022 InstaDeep Ltd. All rights reserved. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
import pytest | ||
|
||
from jumanji.environments.logic.graph_coloring import GraphColoring | ||
|
||
|
||
@pytest.fixture | ||
def graph_coloring() -> GraphColoring: | ||
"""Instantiates a default GraphColoring environment.""" | ||
return GraphColoring() |