While the goal is simple (capture the enemy general), the gameplay combines multiple interacting mechanics,
challenging players to balance micro and macro-level decision-making.
The combination of these elements makes the game highly engaging and complex.

This repository is designed to make bot development for [Generals.io](https://generals.io) easier,
particularly for Machine Learning agents.

Highlights:
* πŸš€ **blazing-fast, lightweight simulator**: run thousands of steps per second with numpy-powered efficiency
* πŸš€ **blazing-fast simulator**: run thousands of steps per second with numpy-powered efficiency
* 🀝 **seamless integration**: fully compatible with RL standards 🀸[Gymnasium](https://gymnasium.farama.org/) and 🦁[PettingZoo](https://pettingzoo.farama.org/)
* πŸ”§ **effortless customization**: easily tailor environments to your specific needs
* πŸ”¬ **analysis tools**: leverage features like replays for deeper insights

> [!TIP]
> Try out the [original game](https://generals.io), it's a lot of fun!

## πŸ“¦ Installation
You can install the latest stable version via `pip`:
```bash
pip install generals
```
Alternatively, you can install the package from source:
```bash
git clone https://github.com/strakam/Generals-RL
cd Generals-RL
pip install -e .
```

## πŸš€ Getting Started
We create agents in two modes:
- **classic mode** - intended for agents based on classical algorithms and heuristics.
In this mode, you can start by subclassing the `Agent` class, just like [`RandomAgent`](./generals/agents/random_agent.py) or [`ExpanderAgent`](./generals/agents/expander_agent.py).
You can specify your agent's `id` (name) and `color`, and the only thing remaining is to implement the `act` function, whose signature is explained in the sections below; a minimal sketch is shown below.
- **learning mode** - this mode is more open-ended, and we expect that you have some experience with **Gymnasium** or **PettingZoo**.

When creating an environment, you can choose between two `render_mode` options: `None`, which omits rendering and is suitable for training, and `"human"`, where you can watch the game play out.
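For the classic mode, here is a minimal sketch of such an agent. The `Agent` import path and constructor signature are assumptions here; mirror whatever [`RandomAgent`](./generals/agents/random_agent.py) does in the actual source:

```python
import numpy as np

from generals.agents import Agent  # assumed import path


class PasserAgent(Agent):
    def __init__(self):
        # `id` and `color` as described above (signature assumed)
        super().__init__(id="passer", color=(67, 70, 86))

    def act(self, observation):
        # A real agent would inspect `observation`; this one always passes,
        # using the action format described in the Action section below
        return {
            "pass": 1,
            "cell": np.array([0, 0]),
            "direction": 0,
            "split": 0,
        }
```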

### Usage Example (🀸 Gymnasium)
In both modes, the example loop for running the game looks like this:
```python
import gymnasium as gym
from generals import AgentFactory

# Initialize opponent agent
npc = AgentFactory.make_agent("expander")

# Create environment
env = gym.make("gym-generals-v0", npc=npc, render_mode="human")

observation, info = env.reset()

terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # Here you put an action of your agent
    observation, reward, terminated, truncated, info = env.step(action)
    env.render()
```
You can also check out an example for 🦁[PettingZoo](./examples/pettingzoo_example.py) or
an example with commentary showcasing various features [here](./examples/complete_example.py).

## πŸš€ Getting Started
Creating your first agent is very simple.
- Start by subclassing an `Agent` class just like [`RandomAgent`](./generals/agents/random_agent.py) or [`ExpanderAgent`](./generals/agents/expander_agent.py).
- Every agent must have a name as it is his ID by which he is called for actions.
- Every agent must implement `play(observation)` function that takes in `observation` and returns an `action` (both defined below).
- You can start by copying the [Usage Example](#usage-example--gymnasium) and replacing `agent` with your implementation.
- When creating an environment, you can choose out of two `render_modes`:
- `None` that omits rendering and is suitable for training,
- `"human"` where you can see the game play out.

> [!TIP]
> Check out the [wiki](https://github.com/strakam/Generals-RL/wiki) for more commented examples to get a better idea of how to start 🤗.

## 🎨 Custom Grids
Grids on which the game is played are generated via `GridFactory`. You can instantiate the class with the desired grid properties, and it will generate
a grid with these properties for each run.
```python
import gymnasium as gym
from generals import GridFactory

# Configure the factory with the desired grid properties
grid_factory = GridFactory(...)
env = gym.make("gym-generals-v0", grid_factory=grid_factory)

# You can also define a concrete grid using the string format described below
# (this particular layout is just an illustration)
grid = """
A..#
.#3#
...#
##.B
"""
options = {"grid": grid}
# Pass the new grid to the environment (for the next game)
env.reset(options=options)
```
Grids are created using a string format where:
- `.` represents passable terrain
- `#` indicates impassable mountains
- `A`, `B` mark the positions of generals
- digits `0-9` represent cities, where the digit sets the city `cost`, calculated as `40 + digit` (for example, a cell marked `3` is a city with cost `43`)

## πŸ”¬ Interactive Replays
We can store replays and then analyze them in an interactive fashion. `Replay` class handles replay related functionality.
### Storing a replay
```python
import gymnasium as gym

# Create the environment so that the game gets recorded into "my_replay"
env = gym.make("gym-generals-v0", ...)
```

### Loading a replay
```python
from generals import Replay

# Load a stored replay and play it back
replay = Replay.load("my_replay")
replay.play()
```
### πŸ•ΉοΈ Replay controls
You can control your replays to your liking! Currently we support these controls:
- `q` — quit/close the replay
- `r` — restart replay from the beginning
- `←/→` — increase/decrease the replay speed
### 🔭 Observation
The `observation` is a `Dict`. Values are either `numpy` matrices with shape `(N,M)`, or simple `int` constants:

| Key | Shape | Description |
| --- | --- | --- |
| `is_winner` | — | Indicates whether the agent won |
| `timestep` | — | Current timestep of the game |
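
For instance, scalar entries can be read straight out of the observation dictionary (only the keys visible above are shown; the full observation contains more):

```python
observation, info = env.reset()
print(observation["timestep"])   # int: current timestep of the game
print(observation["is_winner"])  # whether the agent has won
```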

The `action_mask` is a 3D array with shape `(N, M, 4)`, where each element corresponds to whether a move is valid from cell
`[i, j]` in one of four directions: `0 (up)`, `1 (down)`, `2 (left)`, or `3 (right)`.
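
As a sketch, assuming the mask is exposed under an `action_mask` key of the observation (the key name is an assumption), checking and enumerating valid moves could look like this:

```python
import numpy as np

mask = observation["action_mask"]  # shape (N, M, 4); key name assumed
# Can we move from cell (2, 5) upwards (direction 0)?
can_move_up = bool(mask[2, 5, 0])
# All valid (i, j, direction) triples, e.g. for random move selection
valid_moves = np.argwhere(mask)
```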

### ⚑ Action
Actions are `dict`s with the following keys:
- `pass` indicates whether you want to `1 (pass)` or `0 (play)`.
- `cell` is an `np.array([i,j])`, where `i,j` are the indices of the cell you want to move from.
- `direction` indicates whether you want to move `0 (up)`, `1 (down)`, `2 (left)`, or `3 (right)`.
- `split` indicates whether you want to `1 (split)` your units and send only half of them, or `0 (no split)`, which sends all possible units to the next cell.
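
For example, a move that sends all units down from the (hypothetical) cell `(3, 4)` looks like this:

```python
import numpy as np

action = {
    "pass": 0,                 # 0 = play a move
    "cell": np.array([3, 4]),  # move from cell (3, 4)
    "direction": 1,            # 1 = down
    "split": 0,                # 0 = send all units
}
observation, reward, terminated, truncated, info = env.step(action)
```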

> [!TIP]
> You can see what actions and observations look like by printing a sample from the environment:
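> For example, using the standard Gymnasium sampling API:
> ```python
> print(env.action_space.sample())
> print(env.observation_space.sample())
> ```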

You can also define a custom reward function and pass it to the environment:
```python
def custom_reward_fn(observation, action, done, info):
    # Compute your reward from the latest transition (placeholder logic)
    return 0

env = gym.make(..., reward_fn=custom_reward_fn)
observations, info = env.reset()
```
## 🌱 Contributing
You can contribute to this project in multiple ways:
- 🤖 If you implement ANY non-trivial agent, send it to us! We will publish it so others can play against it.
- 💡 If you have an idea on how to improve the game, submit an issue or create a PR; we are happy to improve!

We also have some ideas of our own (see [issues](https://github.com/strakam/Generals-RL/issues)), so you can check out what we plan to work on.

> [!TIP]
> Check out the [wiki](https://github.com/strakam/Generals-RL/wiki) to learn more about how to contribute.
