[gym_jiminy/rllib] Full refactoring to support ray-rllib 2.38. (#832)
* [python/viewer] Use firefox instead of chromium for offscreen rendering with meshcat to fix rendering on MacOS VM.
* [gym_jiminy/common] Do not permanently alter original simulation options with enabling debug and/or evaluation modes.
* [gym_jiminy/common] Disallow switch between evaluation and training mode when a simulation is running.
* [gym_jiminy/common] Rewrite binary log file automatically when calling 'BaseJiminyEnv.stop' in debug or evaluation mode.
* [gym_jiminy/common] Fix replay if no simulation is running.
* [gym_jiminy/common] Add previous action as input argument for evaluation policy callback.
* [gym_jiminy/common] Automatic environment pipeline update.
* [gym_jiminy/common] Fix composed reward computation.
* [gym_jiminy/common] Use metaclass instead of inheritance for abstract classes.
* [gym_jiminy/common] Enable typing of the obs and action spaces for 'gym.Env'.
* [gym_jiminy/common] Fix nested gym space helpers.
* [gym_jiminy/common] Update documentation.
* [gym_jiminy/toolbox] Add support of arbitrarily nested task-settable env.
* [gym_jiminy/envs] Add mirror mat to obs/action spaces.
* [gym_jiminy/rllib] Full refactoring to support ray-rllib 2.38.
* [misc] Move to macos-14 on GitHub Actions (forcing panda3d tinydisplay driver).
duburcqa authored Nov 18, 2024
1 parent 112595f commit e5c4897
Showing 39 changed files with 2,672 additions and 1,996 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/linux.yml
@@ -132,7 +132,7 @@ jobs:
  "${PYTHON_EXECUTABLE}" -m unittest discover -s "${RootDir}/python/gym_jiminy/unit_py" -v
  - name: Run examples for gym jiminy add-on modules
- if: matrix.BUILD_TYPE != 'Debug' && matrix.PYTHON_VERSION != '3.12'
+ if: matrix.BUILD_TYPE != 'Debug'
  run: |
  cd "${RootDir}/python/gym_jiminy/examples/rllib"
  "${PYTHON_EXECUTABLE}" acrobot_ppo.py
23 changes: 18 additions & 5 deletions .github/workflows/macos.yml
@@ -15,8 +15,7 @@ jobs:
  strategy:
  matrix:
- # FIXME: Panda3d software rendering is partially broken on Apple Silicon
- OS: ['macos-13'] # 'macos-13': Intel (x86), 'macos-14+': Apple Silicon (arm64)
+ OS: ['macos-14'] # 'macos-13': Intel (x86), 'macos-14+': Apple Silicon (arm64)
  PYTHON_VERSION: ['3.10', '3.11', '3.12'] # `setup-python` does not support Python<3.10 on Apple Silicon
  BUILD_TYPE: ['Release']
  include:
@@ -128,8 +127,10 @@ jobs:
  # (see https://github.com/python/mypy/issues/17396)
  "${PYTHON_EXECUTABLE}" -m pip install "numpy<2.0"
  stubgen -p jiminy_py -o ${RootDir}/build/pypi/jiminy_py/src
- # FIXME: Python 3.10 on Intel x86-64 crashes when generating stubs without any backtrace...
- if [[ ("${{ matrix.OS }}" != 'macos-13') || ("${{ matrix.PYTHON_VERSION }}" != '3.10') ]] ; then
+ # FIXME: Python 3.10 crashes when generating stubs without any backtrace...
+ if [[ "${{ matrix.PYTHON_VERSION }}" != '3.10' ]] ; then
+ # lldb --batch -o "settings set target.process.stop-on-exec false" \
+ # -o "break set -n main" -o "run" -k "bt" -k "quit" -- \
  "${PYTHON_EXECUTABLE}" "${RootDir}/build_tools/stubgen.py" \
  -o ${RootDir}/build/stubs --ignore-invalid=all jiminy_py
  cp ${RootDir}/build/stubs/jiminy_py-stubs/core/__init__.pyi \
@@ -183,12 +184,20 @@ jobs:
  ctest --output-on-failure --test-dir "${RootDir}/build/core/unit"
  cd "${RootDir}/python/jiminy_py/unit_py"
+ # FIXME: Panda3d software rendering is partially broken on Apple Silicon
+ if [[ "${{ matrix.OS }}" != 'macos-13' ]] ; then
+ export JIMINY_PANDA3D_FORCE_TINYDISPLAY=
+ fi
  "${PYTHON_EXECUTABLE}" -m unittest discover -v
  - name: Run unit tests for gym jiminy base module
  run: |
  export LD_LIBRARY_PATH="${InstallDir}/lib/:/usr/local/lib"
+ # FIXME: Panda3d software rendering is partially broken on Apple Silicon
+ if [[ "${{ matrix.OS }}" != 'macos-13' ]] ; then
+ export JIMINY_PANDA3D_FORCE_TINYDISPLAY=
+ fi
  # FIXME: Disabling `test_pipeline_control.py` on MacOS because `test_pid_standing` is
  # failing for 'panda3d-sync' backend due to meshes still loading at screenshot time.
  if [[ "${{ matrix.BUILD_TYPE }}" == 'Debug' ]] ; then
@@ -197,11 +206,15 @@ jobs:
  "${PYTHON_EXECUTABLE}" -m unittest discover -s "${RootDir}/python/gym_jiminy/unit_py" -v
  - name: Run examples for gym jiminy add-on modules
- if: matrix.BUILD_TYPE != 'Debug' && matrix.PYTHON_VERSION != '3.12'
+ if: matrix.BUILD_TYPE != 'Debug'
  run: |
  export LD_LIBRARY_PATH="${InstallDir}/lib/:/usr/local/lib"
  cd "${RootDir}/python/gym_jiminy/examples/rllib"
+ # FIXME: Panda3d software rendering is partially broken on Apple Silicon
+ if [[ "${{ matrix.OS }}" != 'macos-13' ]] ; then
+ export JIMINY_PANDA3D_FORCE_TINYDISPLAY=
+ fi
  "${PYTHON_EXECUTABLE}" acrobot_ppo.py
  #########################################################################################
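
Note on the macOS hunks above: the workflow exports `JIMINY_PANDA3D_FORCE_TINYDISPLAY` with an empty value. How jiminy reads this variable is not shown in this diff. The sketch below only assumes that the variable's presence, rather than its value, is what matters, and illustrates why a plain truthiness test would miss an empty export. The helper name `is_flag_set` is hypothetical.

```python
import os


def is_flag_set(name: str) -> bool:
    """Hypothetical helper: a flag counts as enabled as soon as it is defined."""
    return name in os.environ


# Mimic `export JIMINY_PANDA3D_FORCE_TINYDISPLAY=` from the workflow
os.environ["JIMINY_PANDA3D_FORCE_TINYDISPLAY"] = ""

# Presence check: the variable is defined, even though it is empty
presence = is_flag_set("JIMINY_PANDA3D_FORCE_TINYDISPLAY")

# Truthiness check: an empty string is falsy, so the flag would be ignored
truthiness = bool(os.environ.get("JIMINY_PANDA3D_FORCE_TINYDISPLAY"))
```

If the consumer tested the value instead of the presence, the empty export used in these jobs would have no effect.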
2 changes: 1 addition & 1 deletion .github/workflows/manylinux.yml
@@ -87,7 +87,7 @@ jobs:
  -DBoost_NO_SYSTEM_PATHS=TRUE -DBoost_NO_BOOST_CMAKE=TRUE \
  -DBoost_USE_STATIC_LIBS=ON -DPYTHON_EXECUTABLE="${PYTHON_EXECUTABLE}" \
  -DBUILD_TESTING=ON -DBUILD_EXAMPLES=ON -DBUILD_PYTHON_INTERFACE=ON \
- -DINSTALL_GYM_JIMINY=${{ (matrix.PYTHON_VERSION == 'cp312' && 'OFF') || 'ON' }} \
+ -DINSTALL_GYM_JIMINY=${{ (matrix.PYTHON_VERSION == 'cp313' && 'OFF') || 'ON' }} \
  -DCMAKE_CXX_FLAGS="${CXX_FLAGS}" -DCMAKE_BUILD_TYPE="$BUILD_TYPE"
  make -j4
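
Note on the `-DINSTALL_GYM_JIMINY` line above: it uses the `(condition && 'OFF') || 'ON'` idiom to select a value inside a GitHub Actions expression. As a rough analogy only, Python's `and`/`or` operators short-circuit the same way. The function name below is hypothetical and is not part of the workflow.

```python
def install_gym_jiminy(python_version: str) -> str:
    """Hypothetical Python analogue of the workflow expression."""
    # `and` yields 'OFF' only when the comparison is true. Otherwise the
    # falsy comparison result falls through to `or`, which yields 'ON'.
    return (python_version == "cp313" and "OFF") or "ON"


flags = {version: install_gym_jiminy(version) for version in ("cp312", "cp313")}
```

The idiom is only safe because the first branch value (`'OFF'`) is truthy. With a falsy first value, the expression would always fall through to the second one.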
13 changes: 6 additions & 7 deletions .github/workflows/ubuntu.yml
@@ -150,21 +150,20 @@ jobs:
  fi
  "${PYTHON_EXECUTABLE}" -m unittest discover -s "${RootDir}/python/gym_jiminy/unit_py" -v
- # - name: Run examples for gym jiminy add-on modules
- # if: matrix.BUILD_TYPE != 'Debug'
- # run: |
- # cd "${RootDir}/python/gym_jiminy/examples/rllib"
- # "${PYTHON_EXECUTABLE}" acrobot_ppo.py
+ - name: Run examples for gym jiminy add-on modules
+ if: matrix.BUILD_TYPE != 'Debug'
+ run: |
+ cd "${RootDir}/python/gym_jiminy/examples/rllib"
+ "${PYTHON_EXECUTABLE}" acrobot_ppo.py
  #####################################################################################

  - name: Python linter
  if: matrix.OS == 'ubuntu-24.04' && matrix.BUILD_TYPE != 'Debug' && matrix.COMPILER == 'gcc'
  run: |
- # FIXME: Add back "rllib"
  cd "${RootDir}/python/jiminy_py/"
  pylint --rcfile="${RootDir}/.pylintrc" "src/"
- for name in "common" "toolbox" ; do
+ for name in "common" "toolbox" "rllib" ; do
  cd "${RootDir}/python/gym_jiminy/$name"
  pylint --rcfile="${RootDir}/.pylintrc" "gym_jiminy/"
  done
5 changes: 1 addition & 4 deletions .github/workflows/win.yml
@@ -201,13 +201,10 @@ jobs:
  python -m unittest discover -s "$RootDir/python/gym_jiminy/unit_py" -v
  - name: Run examples for gym jiminy add-on modules
- if: matrix.BUILD_TYPE != 'Debug' && matrix.PYTHON_VERSION != '3.11' && matrix.PYTHON_VERSION != '3.12'
+ if: matrix.BUILD_TYPE != 'Debug'
  run: |
  $RootDir = "${env:GITHUB_WORKSPACE}/workspace" -replace '\\', '/'
- # FIXME: Python 3.11 was not supported by ray on Windows until very recently.
- # It has been fixed on master but not on the latest available release (2.93).
- # See: https://github.com/ray-project/ray/pull/42097
  Set-Location -Path "$RootDir/python/gym_jiminy/examples/rllib"
  python acrobot_ppo.py
1 change: 1 addition & 0 deletions .pylintrc
@@ -72,6 +72,7 @@ disable =
  abstract-method,
  protected-access,
  useless-parent-delegation,
+ unbalanced-tuple-unpacking,
  use-dict-literal,
  unspecified-encoding,
  undefined-loop-variable,
4 changes: 2 additions & 2 deletions python/gym_jiminy/common/gym_jiminy/common/bases/blocks.py
@@ -7,7 +7,7 @@
  - the base controller block
  - the base observer block
  """
- from abc import abstractmethod, ABC
+ from abc import abstractmethod, ABCMeta
  from typing import Any, Union, Generic, TypeVar, cast

  import gymnasium as gym
@@ -26,7 +26,7 @@
  BlockState = TypeVar('BlockState', bound=Union[DataNested, None])


- class InterfaceBlock(ABC, Generic[BlockState, BaseObs, BaseAct]):
+ class InterfaceBlock(Generic[BlockState, BaseObs, BaseAct], metaclass=ABCMeta):
  """Base class for blocks used for pipeline control design. Blocks can be
  either observers and controllers.
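
Note on the `ABC` to `metaclass=ABCMeta` change above: the commit does not state its motivation, so the following is only a minimal sketch of what the two spellings have in common and where they differ. The class names `Interface` and `Concrete` are hypothetical stand-ins, not classes from gym_jiminy.

```python
from abc import ABC, ABCMeta, abstractmethod
from typing import Generic, TypeVar

T = TypeVar("T")


class Interface(Generic[T], metaclass=ABCMeta):
    """Hypothetical abstract generic class declared via the metaclass."""

    @abstractmethod
    def compute(self) -> T:
        """Must be implemented by concrete subclasses."""


class Concrete(Interface[float]):
    def compute(self) -> float:
        return 1.0


# Abstract methods are still enforced: direct instantiation fails
try:
    Interface()
except TypeError:
    abstract_is_blocked = True
else:
    abstract_is_blocked = False

# Observable difference: `ABC` is no longer a base class in the MRO
abc_in_mro = ABC in Concrete.__mro__
result = Concrete().compute()
```

Both spellings give the class the `ABCMeta` metaclass. The metaclass form simply keeps `ABC` out of the list of bases.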
10 changes: 5 additions & 5 deletions python/gym_jiminy/common/gym_jiminy/common/bases/compositions.py
@@ -6,7 +6,7 @@
  This modular approach allows for standardization of usual metrics. Overall, it
  greatly reduces code duplication and bugs.
  """
- from abc import ABC, abstractmethod
+ from abc import abstractmethod, ABCMeta
  from enum import IntEnum
  from typing import Tuple, Sequence, Callable, Union, Optional, Generic, TypeVar

@@ -23,7 +23,7 @@
  ArrayLikeOrScalar = Union[ArrayOrScalar, Sequence[Union[Number, np.number]]]


- class AbstractReward(ABC):
+ class AbstractReward(metaclass=ABCMeta):
  """Abstract class from which all reward component must derived.
  This goal of the agent is to maximize the expectation of the cumulative sum
@@ -32,7 +32,7 @@ class AbstractReward(ABC):
  indefinite (aka. objective).
  Defining cost is allowed by not recommended. Although it encourages the
- agent to achieve the task at hands as quickly as possible if success is the
+ agent to achieve the task at hand as quickly as possible if success is the
  only termination condition, it has the side-effect to give the opportunity
  to the agent to maximize the return by killing itself whenever this is an
  option, which is rarely the desired behavior. No restriction is enforced as
@@ -400,7 +400,7 @@ class EpisodeState(IntEnum):
  """


- class AbstractTerminationCondition(ABC):
+ class AbstractTerminationCondition(metaclass=ABCMeta):
  """Abstract class from which all termination conditions must derived.
  Request the ongoing episode to stop immediately as soon as a termination
@@ -470,7 +470,7 @@ def name(self) -> str:

  @abstractmethod
  def compute(self, info: InfoType) -> bool:
- """Evaluate the termination condition at hands.
+ """Evaluate the termination condition at hand.
  :param info: Dictionary of extra information for monitoring. It will be
  updated in-place for storing terminated and truncated
47 changes: 37 additions & 10 deletions python/gym_jiminy/common/gym_jiminy/common/bases/interfaces.py
@@ -2,10 +2,10 @@
  specifically design for Jiminy engine, and defined as mixin classes. Any
  observer/controller block must inherit and implement those interfaces.
  """
- from abc import abstractmethod, ABC
+ from abc import abstractmethod, ABCMeta
  from collections import OrderedDict
  from typing import (
- Dict, Any, Tuple, TypeVar, Generic, TypedDict, no_type_check,
+ Dict, Any, Tuple, TypeVar, Generic, TypedDict, Optional, no_type_check,
  TYPE_CHECKING)

  import numpy as np
@@ -53,11 +53,11 @@ class EngineObsType(TypedDict):
  """


- class InterfaceObserver(ABC, Generic[Obs, BaseObs]):
+ class InterfaceObserver(Generic[Obs, BaseObs], metaclass=ABCMeta):
  """Observer interface for both observers and environments.
  """
  observe_dt: float = -1
- observation_space: gym.Space # [Obs]
+ observation_space: gym.Space[Obs]
  observation: Obs

  def __init__(self, *args: Any, **kwargs: Any) -> None:
@@ -96,11 +96,11 @@ def refresh_observation(self, measurement: BaseObs) -> None:
  """


- class InterfaceController(ABC, Generic[Act, BaseAct]):
+ class InterfaceController(Generic[Act, BaseAct], metaclass=ABCMeta):
  """Controller interface for both controllers and environments.
  """
  control_dt: float = -1
- action_space: gym.Space # [Act]
+ action_space: gym.Space[Act]

  def __init__(self, *args: Any, **kwargs: Any) -> None:
  """Initialize the controller interface.
@@ -164,9 +164,9 @@ def compute_reward(self,
  return 0.0


- # Note that `InterfaceJiminyEnv` must inherit from `InterfaceObserver`
- # before `InterfaceController` to initialize the action space before the
- # observation space since the action itself may be part of the observation.
+ # Note that `InterfaceJiminyEnv` must inherit from `InterfaceObserver` before
+ # `InterfaceController` to initialize the action space before the observation
+ # space since the action itself may be part of the observation.
  # Similarly, `gym.Env` must be last to make sure all the other initialization
  # methods are called first.
  class InterfaceJiminyEnv(
@@ -183,6 +183,11 @@ class InterfaceJiminyEnv(
  ['rgb_array'] + (['human'] if is_display_available() else []))
  }

+ # FIXME: Re-definition in derived class to stop mypy from complaining about
+ # incompatible types between the multiple base classes.
+ action_space: gym.Space[Act]
+ observation_space: gym.Space[Obs]
+
  simulator: Simulator
  robot: jiminy.Robot
  stepper_state: jiminy.StepperState
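
Note on the base-class ordering comment that precedes `InterfaceJiminyEnv` in this file: listing the observer before the controller can still initialize the action space first when every mixin forwards to `super().__init__()` before doing its own setup. The bodies of the real `__init__` methods are not visible in this diff, so that forwarding order is an assumption, and all class names below are hypothetical stand-ins.

```python
from typing import Any, List

# Records the order in which each class finishes its own setup
init_order: List[str] = []


class EnvStub:
    """Stand-in for the last base class in the MRO."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        init_order.append("env")


class ObserverStub:
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        # Assumption: forward to the next class in the MRO first...
        super().__init__(*args, **kwargs)
        # ...and only then set up the observation space
        init_order.append("observation_space")


class ControllerStub:
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        super().__init__(*args, **kwargs)
        init_order.append("action_space")


class EnvInterfaceStub(ObserverStub, ControllerStub, EnvStub):
    """Observer is listed first, yet its setup runs last."""


EnvInterfaceStub()
```

Under this assumption the setup code runs in reverse MRO order, which is why the observer must come before the controller in the list of bases.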
@@ -341,7 +346,7 @@ def _controller_handle(self,
  self.__is_observation_refreshed = False

  def stop(self) -> None:
- """Stop the episode immediately without waiting for a termination or
+ """Stop the episode immediately, without waiting for a termination or
  truncation condition to be satisfied.
  .. note::
@@ -351,9 +356,31 @@ def stop(self) -> None:
  data will not be available during replay using object-oriented
  method `replay`. Helper method `play_logs_data` must be preferred
  to replay an episode that cannot be stopped at the time being.
+ .. warning:
+ This method is never called internally by the engine.
  """
  self.simulator.stop()

+ @abstractmethod
+ def update_pipeline(self, derived: Optional["InterfaceJiminyEnv"]) -> None:
+ """Dynamically update which blocks are declared as part of the
+ environment pipeline.
+ Internally, this method first unregister all blocks of the old
+ pipeline, then register all blocks of the new pipeline, and finally
+ notify the base environment that the top-most block of the pipeline as
+ changed and must be updated accordingly.
+ .. warning::
+ This method is not supposed to be called manually nor overloaded.
+ :param derived: Either the top-most block of the pipeline or None.
+ If None, unregister all blocks of the old pipeline. If
+ not None, first unregister all blocks of the old
+ pipeline, then register all blocks of the new pipeline.
+ """

  @abstractmethod
  def has_terminated(self, info: InfoType) -> Tuple[bool, bool]:
  """Determine whether the episode is over, because a terminal state of
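
Note on the `update_pipeline` docstring above: it describes three steps, namely unregistering the old blocks, registering the new ones, and notifying the base environment of the new top-most block. The abstract method has no body in this diff, so the following is only a sketch of that contract under stated assumptions. `BlockStub`, `BaseEnvStub`, and the attributes `wrapped`, `registered`, and `derived` are all hypothetical and do not come from gym_jiminy.

```python
from typing import List, Optional


class BlockStub:
    """Hypothetical pipeline layer wrapping the next layer toward the base env."""

    def __init__(self, name: str, wrapped: Optional["BlockStub"] = None) -> None:
        self.name = name
        self.wrapped = wrapped


class BaseEnvStub:
    """Hypothetical base environment keeping track of its pipeline."""

    def __init__(self) -> None:
        self.registered: List[str] = []
        self.derived: Optional[BlockStub] = None

    def update_pipeline(self, derived: Optional[BlockStub]) -> None:
        # Step 1: unregister all blocks of the old pipeline
        self.registered.clear()
        # Step 2: register all blocks of the new pipeline, if any
        block = derived
        while block is not None:
            self.registered.append(block.name)
            block = block.wrapped
        # Step 3: record the new top-most block of the pipeline
        self.derived = derived


env = BaseEnvStub()
top = BlockStub("controller", wrapped=BlockStub("observer"))
env.update_pipeline(top)
registered_after_update = list(env.registered)

# Passing None only unregisters the old pipeline
env.update_pipeline(None)
registered_after_reset = list(env.registered)
```

Clearing before registering keeps the method idempotent: calling it twice with the same top-most block leaves the same registered set.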