Commit cd2c4f1d authored by Mayank Mittal, committed by GitHub

Upgrades environments from Gym 0.21 to Gymnasium 0.29 (#234)

# Description

Currently, we downgrade several libraries so that Gym 0.21.0 can be
installed. This causes problems when installing new Python packages, as
highlighted in #204, and the problem becomes more significant with
Python 3.10 in Isaac Sim 2023.1.

This MR upgrades the repository to use the Gymnasium Environment class.
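
For readers less familiar with the API change, here is a minimal sketch of the calling convention that the upgrade implies. Gym 0.21 returned `(obs, reward, done, info)` from `step()` and only the observation from `reset()`, whereas Gymnasium returns `(obs, reward, terminated, truncated, info)` from `step()` and `(obs, info)` from `reset()`. `ToyCounterEnv` is a hypothetical class written only for illustration: it is not part of Orbit, it does not import Gymnasium, and its `goal` and `max_episode_length` parameters are assumptions.

```python
from typing import Any, Dict, Tuple


class ToyCounterEnv:
    """Hypothetical toy environment that mimics the Gymnasium-style return values."""

    def __init__(self, goal: int = 3, max_episode_length: int = 5) -> None:
        self.goal = goal
        self.max_episode_length = max_episode_length
        self.state = 0
        self.step_count = 0

    def reset(self) -> Tuple[int, Dict[str, Any]]:
        # Gymnasium-style reset returns (observation, info), not just the observation.
        self.state = 0
        self.step_count = 0
        return self.state, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self.state += action
        self.step_count += 1
        # terminated: a terminal state of the task itself was reached
        terminated = self.state >= self.goal
        # truncated: the episode was cut off by an external limit (time-out)
        truncated = self.step_count >= self.max_episode_length
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, truncated, {}


env = ToyCounterEnv()
obs, info = env.reset()
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(1)
    # the single legacy "done" flag is the logical OR of the two new signals
    done = terminated or truncated
```

The same split appears in this commit's termination manager, where `dones` is computed as the OR of the `time_outs` and `terminated` buffers.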

## Type of Change

- Bug fix (non-breaking change which fixes an issue)
- Breaking change (fix or feature that would cause existing
functionality to not work as expected)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./orbit.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------
Signed-off-by: Mayank Mittal <12863862+Mayankm96@users.noreply.github.com>
Co-authored-by: David Hoeller <dhoeller@ethz.ch>
parent e5b43e96
@@ -4,7 +4,7 @@ omni.isaac.orbit_tasks.isaac_env
 We use OpenAI Gym registry to register the environment and their default configuration file.
 The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
 The string is parsed into respective configuration container which needs to be passed to the environment
-class. This is done using the function :meth:`load_default_env_cfg` in the sub-module
+class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
 :mod:`omni.isaac.orbit.utils.parse_cfg`.
@@ -17,12 +17,12 @@ class. This is done using the function :meth:`load_default_env_cfg` in the sub-m
 .. code-block:: python
-    import gym
+    import gymnasium as gym
     import omni.isaac.orbit_tasks
-    from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg
+    from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
     task_name = "Isaac-Cartpole-v0"
-    cfg = load_default_env_cfg(task_name)
+    cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
     env = gym.make(task_name, cfg=cfg)
......
 Known issues
 ============
-Installation errors due to gym==0.21.0
---------------------------------------
-When installing the gym package, you may encounter the following error:
-.. code-block::
-    error in gym setup command: 'extras_require' must be a dictionary whose values are strings or lists of
-    strings containing valid project/version requirement specifiers.
-    ----------------------------------------
-    ERROR: Could not find a version that satisfies the requirement gym==0.21.0 (from omni-isaac-orbit-envs[all])
-    (from versions: 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.1.0, 0.1.1, 0.1.2, 0.1.3, 0.1.4, 0.1.5, 0.1.6,
-    ...
-    0.15.7, 0.16.0, 0.17.0, 0.17.1, 0.17.2, 0.17.3, 0.18.0, 0.18.3, 0.19.0, 0.20.0, 0.21.0, 0.22.0, 0.23.0,
-    0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.25.1, 0.25.2, 0.26.0, 0.26.1, 0.26.2)
-    ERROR: No matching distribution found for gym==0.21.0
-This issue arises since the ``setuptools`` package from version 67.0 onwards does not support malformed version strings.
-Since the OpenAI Gym package that is no longer being maintained (`issue link <https://github.com/openai/gym/issues/3200>`_),
-the current workaround is to install the ``setuptools`` package version 66.0.0. You can do this by running the following
-command:
-.. code-block:: bash
-    ./orbit.sh -p -m pip install -U setuptools==66
 Regression in Isaac Sim 2022.2.1
 --------------------------------
......
@@ -157,7 +157,7 @@ utilities to manage extensions:
 optional arguments:
     -h, --help Display the help content.
-    -i, --install Install the extensions inside Isaac Orbit.
+    -i, --install Install the extensions inside Orbit.
     -e, --extra Install extra dependencies such as the learning frameworks.
     -f, --format Run pre-commit to format the code and check lints.
     -p, --python Run the python executable (python.sh) provided by Isaac Sim.
......
@@ -141,7 +141,7 @@ format.
 .. code:: bash
     # install python module (for robomimic)
-    ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[robomimic]'
+    ./orbit.sh -e robomimic
     # split data
     ./orbit.sh -p source/standalone//workflows/robomimic/tools/split_train_val.py logs/robomimic/Isaac-Lift-Franka-v0/hdf_dataset.hdf5 --ratio 0.2
@@ -171,7 +171,7 @@ from the environments into the respective libraries function argument and return
 .. code:: bash
     # install python module (for stable-baselines3)
-    ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[sb3]'
+    ./orbit.sh -e sb3
     # run script for training
     # note: we enable cpu flag since SB3 doesn't optimize for GPU anyway
     ./orbit.sh -p source/standalone/workflows/sb3/train.py --task Isaac-Cartpole-v0 --headless --cpu
@@ -184,7 +184,7 @@ from the environments into the respective libraries function argument and return
 .. code:: bash
     # install python module (for skrl)
-    ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[skrl]'
+    ./orbit.sh -e skrl
     # run script for training
     ./orbit.sh -p source/standalone/workflows/skrl/train.py --task Isaac-Reach-Franka-v0 --headless
     # run script for playing with 32 environments
@@ -196,7 +196,7 @@ from the environments into the respective libraries function argument and return
 .. code:: bash
     # install python module (for rl-games)
-    ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[rl_games]'
+    ./orbit.sh -e rl_games
     # run script for training
     ./orbit.sh -p source/standalone/workflows/rl_games/train.py --task Isaac-Ant-v0 --headless
     # run script for playing with 32 environments
@@ -208,7 +208,7 @@ from the environments into the respective libraries function argument and return
 .. code:: bash
     # install python module (for rsl-rl)
-    ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[rsl_rl]'
+    ./orbit.sh -e rsl_rl
     # run script for training
     ./orbit.sh -p source/standalone/workflows/rsl_rl/train.py --task Isaac-Reach-Franka-v0 --headless
     # run script for playing with 32 environments
......
@@ -39,11 +39,12 @@ an environment by calling ``gym.make``. The environments are registered in the `
 gym.register(
     id="Isaac-Cartpole-v0",
     entry_point="omni.isaac.orbit_tasks.classic.cartpole:CartpoleEnv",
-    kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.classic.cartpole:cartpole_cfg.yaml"},
+    disable_env_checker=True,
+    kwargs={"env_cfg_entry_point": "omni.isaac.orbit_tasks.classic.cartpole:cartpole_cfg.yaml"},
 )
-The ``cfg_entry_point`` argument is used to load the default configuration for the environment. The default
-configuration is loaded using the :meth:`omni.isaac.orbit_tasks.utils.parse_cfg.load_default_env_cfg` function.
+The ``env_cfg_entry_point`` argument is used to load the default configuration for the environment. The default
+configuration is loaded using the :meth:`omni.isaac.orbit_tasks.utils.parse_cfg.load_cfg_from_registry` function.
 The configuration entry point can correspond to both a YAML file or a python configuration
 class. The default configuration can be overridden by passing a custom configuration instance to the ``gym.make``
 function as shown later in the tutorial.
......
@@ -26,13 +26,13 @@ For example, here is how you would wrap an environment to enforce that reset is
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import omni.isaac.orbit_tasks # noqa: F401
-from omni.isaac.orbit_tasks.utils import load_default_env_cfg
+from omni.isaac.orbit_tasks.utils import load_cfg_from_registry
 # create base environment
-cfg = load_default_env_cfg("Isaac-Reach-Franka-v0")
+cfg = load_cfg_from_registry("Isaac-Reach-Franka-v0", "env_cfg_entry_point")
 env = gym.make("Isaac-Reach-Franka-v0", cfg=cfg)
 # wrap environment to enforce that reset is called before step
 env = gym.wrappers.OrderEnforcing(env)
@@ -105,7 +105,7 @@ for 200 steps, and saves it in the ``videos`` folder at a step interval of 1500
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 # adjust camera resolution and pose
 env_cfg.viewer.resolution = (640, 480)
......
@@ -185,7 +185,7 @@ print_help () {
 echo -e "\nusage: $(basename "$0") [-h] [-i] [-e] [-f] [-p] [-s] [-o] [-v] [-d] [-c] -- Utility to manage extensions in Orbit."
 echo -e "\noptional arguments:"
 echo -e "\t-h, --help Display the help content."
-echo -e "\t-i, --install Install the extensions inside Isaac Orbit."
+echo -e "\t-i, --install Install the extensions inside Orbit."
 echo -e "\t-e, --extra Install extra dependencies such as the learning frameworks."
 echo -e "\t-f, --format Run pre-commit to format the code and check lints."
 echo -e "\t-p, --python Run the python executable (python.sh) provided by Isaac Sim."
@@ -220,9 +220,6 @@ while [[ $# -gt 0 ]]; do
 # this does not check dependencies between extensions
 export -f extract_python_exe
 export -f install_orbit_extension
-# downgrade setuptools to avoid issues with OpenAI Gym
-# Check the `Known Issues` section in the documentation
-$(extract_python_exe) -m pip install --upgrade setuptools==66
 # source directory
 find -L "${ORBIT_PATH}/source/extensions" -mindepth 1 -maxdepth 1 -type d -exec bash -c 'install_orbit_extension "{}"' \;
 # unset local variables
@@ -235,8 +232,17 @@ while [[ $# -gt 0 ]]; do
 # install the python packages for supported reinforcement learning frameworks
 echo "[INFO] Installing extra requirements such as learning frameworks..."
 python_exe=$(extract_python_exe)
+# check if specified which rl-framework to install
+if [ -z "$2" ]; then
+    echo "[INFO] Installing all rl-frameworks..."
+    framework_name="all"
+else
+    echo "[INFO] Installing rl-framework: $2"
+    framework_name=$2
+    shift # past argument
+fi
 # install the rl-frameworks specified
-${python_exe} -m pip install -e ${ORBIT_PATH}/source/extensions/omni.isaac.orbit_tasks[all]
+${python_exe} -m pip install -e ${ORBIT_PATH}/source/extensions/omni.isaac.orbit_tasks["${framework_name}"]
 shift # past argument
 ;;
 -c|--conda)
-c|--conda) -c|--conda)
......
@@ -27,7 +27,7 @@ extra_standard_library = [
     "tensordict",
     "bpy",
     "matplotlib",
-    "gym",
+    "gymnasium",
     "scipy",
     "hid",
     "yaml",
......
@@ -18,9 +18,12 @@ itself. However, its various instances should be included in directories within
 The environments should then be registered in the `omni/isaac/contrib_tasks/__init__.py`:
 ```python
+import gymnasium as gym
 gym.register(
     id="Isaac-Contrib-<my-awesome-env>-v0",
     entry_point="omni.isaac.contrib_tasks.<your-env-package>:<your-env-class>",
+    disable_env_checker=True,
     kwargs={"cfg_entry_point": "omni.isaac.contrib_tasks.<your-env-package-cfg>:<your-env-class-cfg>"},
 )
 ```
@@ -9,7 +9,7 @@
 We use OpenAI Gym registry to register the environment and their default configuration file.
 The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
 The string is parsed into respective configuration container which needs to be passed to the environment
-class. This is done using the function :meth:`load_default_env_cfg` in the sub-module
+class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
 :mod:`omni.isaac.orbit.utils.parse_cfg`.
 Note:
@@ -18,18 +18,18 @@ Note:
     the kwarg argument :obj:`cfg` while creating the environment.
 Usage:
-    >>> import gym
+    >>> import gymnasium as gym
     >>> import omni.isaac.contrib_tasks
-    >>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg
+    >>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
     >>>
     >>> task_name = "Isaac-Contrib-<my-registered-env-name>-v0"
-    >>> cfg = load_default_env_cfg(task_name)
+    >>> cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
     >>> env = gym.make(task_name, cfg=cfg)
 """
 from __future__ import annotations
-import gym # noqa: F401
+import gymnasium as gym # noqa: F401
 import os
 import toml
......
@@ -28,6 +28,10 @@ setup(
     include_package_data=True,
     python_requires=">=3.7",
     packages=["omni.isaac.contrib_tasks"],
-    classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"],
+    classifiers=[
+        "Natural Language :: English",
+        "Programming Language :: Python :: 3.10",
+        "Isaac Sim :: 2023.1.0-hotfix.1",
+    ],
     zip_safe=False,
 )
 [package]
 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.9.37"
+version = "0.9.38"
 # Description
 title = "ORBIT framework for Robot Learning"
......
 Changelog
 ---------
+0.9.38 (2023-11-07)
+~~~~~~~~~~~~~~~~~~~
+Changed
+^^^^^^^
+* Upgraded the :class:`omni.isaac.orbit.envs.RLTaskEnv` class to support Gym 0.29.0 environment definition.
+Added
+^^^^^
+* Added computation of ``time_outs`` and ``terminated`` signals inside the termination manager. These follow the
+  definition mentioned in `Gym 0.29.0 <https://gymnasium.farama.org/tutorials/gymnasium_basics/handling_time_limits/>`_.
+* Added proper handling of observation and action spaces in the :class:`omni.isaac.orbit.envs.RLTaskEnv` class.
+  These now follow closely to how Gym VecEnv handles the spaces.
 0.9.37 (2023-11-06)
 ~~~~~~~~~~~~~~~~~~~
......
@@ -91,6 +91,11 @@ class ObservationManager(ManagerBase):
         """Shape of observation tensor for each term in each group."""
         return self._group_obs_term_dim
+    @property
+    def group_obs_concatenate(self) -> dict[str, bool]:
+        """Whether the observation terms are concatenated in each group."""
+        return self._group_obs_concatenate
     """
     Operations.
     """
......
@@ -26,8 +26,20 @@ class TerminationManager(ManagerBase):
     argument and returns a boolean tensor of shape ``(num_envs,)``. The termination manager
     computes the termination signal as the union (logical or) of all the termination terms.
+    Following the `Gymnasium API <https://gymnasium.farama.org/tutorials/gymnasium_basics/handling_time_limits/>`_,
+    the termination signal is computed as the logical OR of the following signals:
+    * **Time-out**: This signal is set to true if the environment has ended after an externally defined condition
+      (that is outside the scope of a MDP). For example, the environment may be terminated if the episode has
+      timed out (i.e. reached max episode length).
+    * **Terminated**: This signal is set to true if the environment has reached a terminal state defined by the
+      environment. This state may correspond to task success, task failure, robot falling, etc.
+    These signals can be individually accessed using the :attr:`time_outs` and :attr:`terminated` properties.
     The termination terms are parsed from a config class containing the manager's settings and each term's
-    parameters. Each termination term should instantiate the :class:`TerminationTermCfg` class.
+    parameters. Each termination term should instantiate the :class:`TerminationTermCfg` class. The term's
+    configuration :attr:`TerminationTermCfg.time_out` decides whether the term is a timeout or a termination term.
     """
     _env: RLTaskEnv
@@ -46,8 +58,8 @@ class TerminationManager(ManagerBase):
         for term_name in self._term_names:
             self._episode_dones[term_name] = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
         # create buffer for managing termination per environment
-        self._done_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
-        self._time_out_buf = torch.zeros_like(self._done_buf)
+        self._truncated_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
+        self._terminated_buf = torch.zeros_like(self._truncated_buf)
     def __str__(self) -> str:
         """Returns: A string representation for termination manager."""
@@ -79,12 +91,26 @@ class TerminationManager(ManagerBase):
     @property
     def dones(self) -> torch.Tensor:
         """The net termination signal. Shape is ``(num_envs,)``."""
-        return self._done_buf
+        return self._truncated_buf | self._terminated_buf
     @property
     def time_outs(self) -> torch.Tensor:
-        """The timeout signal. Shape is ``(num_envs,)``."""
-        return self._time_out_buf
+        """The timeout signal (reaching max episode length). Shape is ``(num_envs,)``.
+        This signal is set to true if the environment has ended after an externally defined condition
+        (that is outside the scope of a MDP). For example, the environment may be terminated if the episode has
+        timed out (i.e. reached max episode length).
+        """
+        return self._truncated_buf
+    @property
+    def terminated(self) -> torch.Tensor:
+        """The terminated signal (reaching a terminal state). Shape is ``(num_envs,)``.
+        This signal is set to true if the environment has reached a terminal state defined by the environment.
+        This state may correspond to task success, task failure, robot falling, etc.
+        """
+        return self._terminated_buf
     """
     Operations.
@@ -122,20 +148,20 @@ class TerminationManager(ManagerBase):
             The combined termination signal of shape ``(num_envs,)``.
         """
         # reset computation
-        self._done_buf[:] = False
-        self._time_out_buf[:] = False
+        self._truncated_buf[:] = False
+        self._terminated_buf[:] = False
         # iterate over all the termination terms
         for name, term_cfg in zip(self._term_names, self._term_cfgs):
             value = term_cfg.func(self._env, **term_cfg.params)
-            # update total termination
-            self._done_buf |= value
             # store timeout signal separately
             if term_cfg.time_out:
-                self._time_out_buf |= value
+                self._truncated_buf |= value
+            else:
+                self._terminated_buf |= value
             # add to episode dones
             self._episode_dones[name] |= value
-        # return termination signal
-        return self._done_buf
+        # return combined termination signal
+        return self._truncated_buf | self._terminated_buf
     """
     Operations - Term settings.
......
@@ -292,13 +292,13 @@ class SimulationContext(_SimulationContext):
                 # hide the viewport and disable updates
                 self._viewport_context.updates_enabled = False # pyright: ignore [reportOptionalMemberAccess]
                 self._viewport_window.visible = False # pyright: ignore [reportOptionalMemberAccess]
-                # reset the throttle counter
-                self._render_throttle_counter = 0
             elif mode == self.RenderMode.NO_RENDERING:
                 # hide the viewport and disable updates
                 if self._viewport_context is not None:
                     self._viewport_context.updates_enabled = False # pyright: ignore [reportOptionalMemberAccess]
                     self._viewport_window.visible = False # pyright: ignore [reportOptionalMemberAccess]
+                # reset the throttle counter
+                self._render_throttle_counter = 0
             else:
                 raise ValueError(f"Unsupported render mode: {mode}! Please check `RenderMode` for details.")
             # update render mode
@@ -403,14 +403,21 @@ class SimulationContext(_SimulationContext):
             self._render_throttle_counter += 1
             if self._render_throttle_counter % self._render_throttle_period == 0:
                 self._render_throttle_counter = 0
-                # here we don't render viewport so don't need to flush flatcache
-                super().render()
+                # here we don't render viewport so don't need to flush fabric data
+                # note: we don't call super().render() anymore because they do flush the fabric data
+                self.set_setting("/app/player/playSimulations", False)
+                self._app.update()
+                self.set_setting("/app/player/playSimulations", True)
         else:
-            # manually flush the flatcache data to update Hydra textures
+            # manually flush the fabric data to update Hydra textures
             if self._fabric_iface is not None:
                 self._fabric_iface.update(0.0, 0.0)
             # render the simulation
-            super().render()
+            # note: we don't call super().render() anymore because they do above operation inside
+            # and we don't want to do it twice. We may remove it once we drop support for Isaac Sim 2022.2.
+            self.set_setting("/app/player/playSimulations", False)
+            self._app.update()
+            self.set_setting("/app/player/playSimulations", True)
     """
     Operations - Override (extension)
......
@@ -25,18 +25,17 @@ INSTALL_REQUIRES = [
     # devices
     "hidapi",
     # gym
-    "gym==0.21.0",
-    "importlib-metadata~=4.13.0",
-    "setuptools<=66", # setuptools 67.0 breaks gym
+    "gymnasium==0.29.0",
     # procedural-generation
     "trimesh",
-    "pyglet==1.5.27", # pyglet 2.0 requires python 3.8
+    "pyglet==1.5.27; python_version < '3.8'", # pyglet 2.0 requires python 3.8
+    "pyglet; python_version >= '3.8'",
 ]
 # Installation operation
 setup(
     name="omni-isaac-orbit",
-    author="NVIDIA, ETH Zurich, and University of Toronto",
+    author="ORBIT Project Developers",
     maintainer="Mayank Mittal",
     maintainer_email="mittalma@ethz.ch",
     url=EXTENSION_TOML_DATA["package"]["repository"],
@@ -48,6 +47,10 @@ setup(
     python_requires=">=3.7",
     install_requires=INSTALL_REQUIRES,
     packages=["omni.isaac.orbit"],
-    classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"],
+    classifiers=[
+        "Natural Language :: English",
+        "Programming Language :: Python :: 3.10",
+        "Isaac Sim :: 2023.1.0-hotfix.1",
+    ],
     zip_safe=False,
 )
@@ -6,6 +6,7 @@
 from __future__ import annotations
 import torch
+import torch.utils.benchmark as benchmark
 import unittest
@@ -124,6 +125,30 @@ class TestTorchOperations(unittest.TestCase):
         my_slice = my_tensor[torch.tensor([0, 1]), ...]
         self.assertNotEqual(my_slice.untyped_storage().data_ptr(), my_tensor.untyped_storage().data_ptr())
+    def test_logical_or(self):
+        """Test bitwise or operation."""
+        size = (400, 300, 5)
+        my_tensor_1 = torch.rand(size, device="cuda:0") > 0.5
+        my_tensor_2 = torch.rand(size, device="cuda:0") < 0.5
+        # check the speed of logical or
+        timer_logical_or = benchmark.Timer(
+            stmt="torch.logical_or(my_tensor_1, my_tensor_2)",
+            globals={"my_tensor_1": my_tensor_1, "my_tensor_2": my_tensor_2},
+        )
+        timer_bitwise_or = benchmark.Timer(
+            stmt="my_tensor_1 | my_tensor_2", globals={"my_tensor_1": my_tensor_1, "my_tensor_2": my_tensor_2}
+        )
+        print("Time for logical or:", timer_logical_or.timeit(number=1000))
+        print("Time for bitwise or:", timer_bitwise_or.timeit(number=1000))
+        # check that logical or works as expected
+        output_logical_or = torch.logical_or(my_tensor_1, my_tensor_2)
+        output_bitwise_or = my_tensor_1 | my_tensor_2
+        self.assertTrue(torch.allclose(output_logical_or, output_bitwise_or))
 if __name__ == "__main__":
     unittest.main()
 [package]
 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.5.0"
+version = "0.5.1"
 # Description
 title = "ORBIT Environments"
......
Changelog Changelog
--------- ---------
0.5.1 (2023-11-04)
~~~~~~~~~~~~~~~~~~
Fixed
^^^^^
* Fixed the wrappers to different learning frameworks to use the new :class:`omni.isaac.orbit_tasks.RLTaskEnv` class.
The :class:`RLTaskEnv` class inherits from the :class:`gymnasium.Env` class (Gym 0.29.0).
* Fixed the registration of tasks in the Gym registry based on Gym 0.29.0 API.
Changed
^^^^^^^
* Removed the inheritance of all the RL-framework specific wrappers from the :class:`gymnasium.Wrapper` class.
This is because the wrappers don't comply with the new Gym 0.29.0 API. The wrappers now only inherit
from their respective RL-framework specific base classes.
0.5.0 (2023-10-30) 0.5.0 (2023-10-30)
~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
......
...@@ -17,28 +17,31 @@ This looks like as follows: ...@@ -17,28 +17,31 @@ This looks like as follows:
omni/isaac/orbit_tasks/locomotion/ omni/isaac/orbit_tasks/locomotion/
├── __init__.py ├── __init__.py
└── velocity └── velocity
├── a1 ├── config
│ └── flat_terrain_cfg.py │ └── anymal_c
├── anymal_c │ ├── agent # <- this is where we store the learning agent configurations
│ └── flat_terrain_cfg.py │ ├── __init__.py # <- this is where we register the environment and configurations to gym registry
│ ├── flat_env_cfg.py
│ └── rough_env_cfg.py
├── __init__.py ├── __init__.py
├── velocity_cfg.py └── velocity_env_cfg.py # <- this is the base task configuration
└── velocity_env.py
``` ```
The environments are then registered in the `omni/isaac/orbit_tasks/__init__.py`: The environments are then registered in the `omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_c/__init__.py`:
```python ```python
gym.register( gym.register(
id="Isaac-Velocity-Anymal-C-v0", id="Isaac-Velocity-Rough-Anymal-C-v0",
entry_point="omni.isaac.orbit_tasks.locomotion.velocity:LocomotionEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.locomotion.velocity.anymal_c.flat_terrain_cfg:FlatTerrainCfg"}, disable_env_checker=True,
kwargs={"env_cfg_entry_point": f"{__name__}.rough_env_cfg:AnymalCRoughEnvCfg"},
) )
gym.register( gym.register(
id="Isaac-Velocity-A1-v0", id="Isaac-Velocity-Flat-Anymal-C-v0",
entry_point="omni.isaac.orbit_tasks.locomotion.velocity:LocomotionEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.locomotion.velocity.a1.flat_terrain_cfg:FlatTerrainCfg"}, disable_env_checker=True,
kwargs={"env_cfg_entry_point": f"{__name__}.flat_env_cfg:AnymalCFlatEnvCfg"},
) )
``` ```
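The registrations above pass the configuration as a `"module:attribute"` string rather than as an imported object. As a rough sketch of how such a string can be turned back into an object, the helper below (`resolve_entry_point` is a hypothetical name, not a function from the repository) splits the string and imports the module with `importlib`. A standard-library class stands in for the real configuration class so that the snippet runs without the simulator.

```python
import importlib


def resolve_entry_point(entry_point: str):
    """Resolve a "package.module:attribute" string to the object it names.

    Hypothetical helper for illustration; the repository's own parsing may differ.
    """
    module_name, _, attr_name = entry_point.partition(":")
    if not module_name or not attr_name:
        raise ValueError(f"Expected 'module:attribute', got: {entry_point!r}")
    # import the module lazily, only when the entry point is actually needed
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Standard-library stand-in for a string such as
# f"{__name__}.rough_env_cfg:AnymalCRoughEnvCfg".
cfg_cls = resolve_entry_point("collections:OrderedDict")
cfg = cfg_cls(num_envs=4)
print(type(cfg).__name__, dict(cfg))
```

Keeping the entry point as a string defers the import until the task is created, which is presumably why the README registers strings built from `__name__`.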
......
...@@ -9,7 +9,7 @@ ...@@ -9,7 +9,7 @@
We use OpenAI Gym registry to register the environment and their default configuration file. We use OpenAI Gym registry to register the environment and their default configuration file.
The default configuration file is passed to the argument "kwargs" in the Gym specification registry. The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
The string is parsed into respective configuration container which needs to be passed to the environment The string is parsed into respective configuration container which needs to be passed to the environment
class. This is done using the function :meth:`load_default_env_cfg` in the sub-module class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
:mod:`omni.isaac.orbit.utils.parse_cfg`. :mod:`omni.isaac.orbit.utils.parse_cfg`.
Note: Note:
...@@ -18,12 +18,12 @@ Note: ...@@ -18,12 +18,12 @@ Note:
the kwarg argument :obj:`cfg` while creating the environment. the kwarg argument :obj:`cfg` while creating the environment.
Usage: Usage:
>>> import gym >>> import gymnasium as gym
>>> import omni.isaac.orbit_tasks >>> import omni.isaac.orbit_tasks
>>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg >>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
>>> >>>
>>> task_name = "Isaac-Cartpole-v0" >>> task_name = "Isaac-Cartpole-v0"
>>> cfg = load_default_env_cfg(task_name) >>> cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
>>> env = gym.make(task_name, cfg=cfg) >>> env = gym.make(task_name, cfg=cfg)
""" """
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
Ant locomotion environment (similar to OpenAI Gym Ant-v2). Ant locomotion environment (similar to OpenAI Gym Ant-v2).
""" """
import gym import gymnasium as gym
from . import agents from . import agents
......
...@@ -5,7 +5,7 @@ ...@@ -5,7 +5,7 @@
from __future__ import annotations from __future__ import annotations
import gym.spaces import gymnasium as gym
import math import math
import torch import torch
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
Cartpole balancing environment. Cartpole balancing environment.
""" """
import gym import gymnasium as gym
from . import agents from . import agents
......
...@@ -5,7 +5,7 @@ ...@@ -5,7 +5,7 @@
from __future__ import annotations from __future__ import annotations
import gym.spaces import gymnasium as gym
import math import math
import torch import torch
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
Humanoid locomotion environment (similar to OpenAI Gym Humanoid-v2). Humanoid locomotion environment (similar to OpenAI Gym Humanoid-v2).
""" """
import gym import gymnasium as gym
from . import agents from . import agents
......
...@@ -5,7 +5,7 @@ ...@@ -5,7 +5,7 @@
from __future__ import annotations from __future__ import annotations
import gym.spaces import gymnasium as gym
import math import math
import torch import torch
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
# #
# SPDX-License-Identifier: BSD-3-Clause # SPDX-License-Identifier: BSD-3-Clause
import gym import gymnasium as gym
from . import agents, flat_env_cfg, rough_env_cfg from . import agents, flat_env_cfg, rough_env_cfg
...@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg ...@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
gym.register( gym.register(
id="Isaac-Velocity-Flat-Anymal-B-v0", id="Isaac-Velocity-Flat-Anymal-B-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalBFlatEnvCfg, "env_cfg_entry_point": flat_env_cfg.AnymalBFlatEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBFlatPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBFlatPPORunnerCfg,
...@@ -23,6 +24,7 @@ gym.register( ...@@ -23,6 +24,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Flat-Anymal-B-Play-v0", id="Isaac-Velocity-Flat-Anymal-B-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalBFlatEnvCfg_PLAY, "env_cfg_entry_point": flat_env_cfg.AnymalBFlatEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBFlatPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBFlatPPORunnerCfg,
...@@ -32,6 +34,7 @@ gym.register( ...@@ -32,6 +34,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Rough-Anymal-B-v0", id="Isaac-Velocity-Rough-Anymal-B-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalBRoughEnvCfg, "env_cfg_entry_point": rough_env_cfg.AnymalBRoughEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBRoughPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBRoughPPORunnerCfg,
...@@ -41,6 +44,7 @@ gym.register( ...@@ -41,6 +44,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Rough-Anymal-B-Play-v0", id="Isaac-Velocity-Rough-Anymal-B-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalBRoughEnvCfg_PLAY, "env_cfg_entry_point": rough_env_cfg.AnymalBRoughEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBRoughPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBRoughPPORunnerCfg,
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
# #
# SPDX-License-Identifier: BSD-3-Clause # SPDX-License-Identifier: BSD-3-Clause
import gym import gymnasium as gym
from . import agents, flat_env_cfg, rough_env_cfg from . import agents, flat_env_cfg, rough_env_cfg
...@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg ...@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
gym.register( gym.register(
id="Isaac-Velocity-Flat-Anymal-C-v0", id="Isaac-Velocity-Flat-Anymal-C-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalCFlatEnvCfg, "env_cfg_entry_point": flat_env_cfg.AnymalCFlatEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCFlatPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCFlatPPORunnerCfg,
...@@ -24,6 +25,7 @@ gym.register( ...@@ -24,6 +25,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Flat-Anymal-C-Play-v0", id="Isaac-Velocity-Flat-Anymal-C-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalCFlatEnvCfg_PLAY, "env_cfg_entry_point": flat_env_cfg.AnymalCFlatEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCFlatPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCFlatPPORunnerCfg,
...@@ -33,6 +35,7 @@ gym.register( ...@@ -33,6 +35,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Rough-Anymal-C-v0", id="Isaac-Velocity-Rough-Anymal-C-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalCRoughEnvCfg, "env_cfg_entry_point": rough_env_cfg.AnymalCRoughEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCRoughPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCRoughPPORunnerCfg,
...@@ -42,6 +45,7 @@ gym.register( ...@@ -42,6 +45,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Rough-Anymal-C-Play-v0", id="Isaac-Velocity-Rough-Anymal-C-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalCRoughEnvCfg_PLAY, "env_cfg_entry_point": rough_env_cfg.AnymalCRoughEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCRoughPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCRoughPPORunnerCfg,
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
# #
# SPDX-License-Identifier: BSD-3-Clause # SPDX-License-Identifier: BSD-3-Clause
import gym import gymnasium as gym
from . import agents, flat_env_cfg, rough_env_cfg from . import agents, flat_env_cfg, rough_env_cfg
...@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg ...@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
gym.register( gym.register(
id="Isaac-Velocity-Flat-Anymal-D-v0", id="Isaac-Velocity-Flat-Anymal-D-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalDFlatEnvCfg, "env_cfg_entry_point": flat_env_cfg.AnymalDFlatEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDFlatPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDFlatPPORunnerCfg,
...@@ -23,6 +24,7 @@ gym.register( ...@@ -23,6 +24,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Flat-Anymal-D-Play-v0", id="Isaac-Velocity-Flat-Anymal-D-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalDFlatEnvCfg_PLAY, "env_cfg_entry_point": flat_env_cfg.AnymalDFlatEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDFlatPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDFlatPPORunnerCfg,
...@@ -32,6 +34,7 @@ gym.register( ...@@ -32,6 +34,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Rough-Anymal-D-v0", id="Isaac-Velocity-Rough-Anymal-D-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalDRoughEnvCfg, "env_cfg_entry_point": rough_env_cfg.AnymalDRoughEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDRoughPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDRoughPPORunnerCfg,
...@@ -41,6 +44,7 @@ gym.register( ...@@ -41,6 +44,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Rough-Anymal-D-Play-v0", id="Isaac-Velocity-Rough-Anymal-D-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalDRoughEnvCfg_PLAY, "env_cfg_entry_point": rough_env_cfg.AnymalDRoughEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDRoughPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDRoughPPORunnerCfg,
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
# #
# SPDX-License-Identifier: BSD-3-Clause # SPDX-License-Identifier: BSD-3-Clause
import gym import gymnasium as gym
from . import agents, flat_env_cfg, rough_env_cfg from . import agents, flat_env_cfg, rough_env_cfg
...@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg ...@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
gym.register( gym.register(
id="Isaac-Velocity-Flat-Unitree-A1-v0", id="Isaac-Velocity-Flat-Unitree-A1-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": flat_env_cfg.UnitreeA1FlatEnvCfg, "env_cfg_entry_point": flat_env_cfg.UnitreeA1FlatEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1FlatPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1FlatPPORunnerCfg,
...@@ -23,6 +24,7 @@ gym.register( ...@@ -23,6 +24,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Flat-Unitree-A1-Play-v0", id="Isaac-Velocity-Flat-Unitree-A1-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": flat_env_cfg.UnitreeA1FlatEnvCfg_PLAY, "env_cfg_entry_point": flat_env_cfg.UnitreeA1FlatEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1FlatPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1FlatPPORunnerCfg,
...@@ -32,6 +34,7 @@ gym.register( ...@@ -32,6 +34,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Rough-Unitree-A1-v0", id="Isaac-Velocity-Rough-Unitree-A1-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": rough_env_cfg.UnitreeA1RoughEnvCfg, "env_cfg_entry_point": rough_env_cfg.UnitreeA1RoughEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1RoughPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1RoughPPORunnerCfg,
...@@ -41,6 +44,7 @@ gym.register( ...@@ -41,6 +44,7 @@ gym.register(
gym.register( gym.register(
id="Isaac-Velocity-Rough-Unitree-A1-Play-v0", id="Isaac-Velocity-Rough-Unitree-A1-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": rough_env_cfg.UnitreeA1RoughEnvCfg_PLAY, "env_cfg_entry_point": rough_env_cfg.UnitreeA1RoughEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1RoughPPORunnerCfg, "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1RoughPPORunnerCfg,
......
...@@ -65,7 +65,7 @@ class MySceneCfg(InteractiveSceneCfg): ...@@ -65,7 +65,7 @@ class MySceneCfg(InteractiveSceneCfg):
offset=RayCasterCfg.OffsetCfg(pos=(0.0, 0.0, 20.0)), offset=RayCasterCfg.OffsetCfg(pos=(0.0, 0.0, 20.0)),
attach_yaw_only=True, attach_yaw_only=True,
pattern_cfg=patterns.GridPatternCfg(resolution=0.1, size=[1.6, 1.0]), pattern_cfg=patterns.GridPatternCfg(resolution=0.1, size=[1.6, 1.0]),
debug_vis=True, debug_vis=False,
mesh_prim_paths=["/World/ground"], mesh_prim_paths=["/World/ground"],
) )
contact_forces = ContactSensorCfg(prim_path="{ENV_REGEX_NS}/Robot/.*", history_length=3, track_air_time=True) contact_forces = ContactSensorCfg(prim_path="{ENV_REGEX_NS}/Robot/.*", history_length=3, track_air_time=True)
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
Environment for lifting an object with fixed-base robot. Environment for lifting an object with fixed-base robot.
""" """
import gym import gymnasium as gym
from . import agents from . import agents
...@@ -18,6 +18,7 @@ from . import agents ...@@ -18,6 +18,7 @@ from . import agents
gym.register( gym.register(
id="Isaac-Lift-Franka-v0", id="Isaac-Lift-Franka-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": f"{__name__}.lift_env_cfg:LiftEnvCfg", "env_cfg_entry_point": f"{__name__}.lift_env_cfg:LiftEnvCfg",
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml", "rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
......
...@@ -5,7 +5,7 @@ ...@@ -5,7 +5,7 @@
from __future__ import annotations from __future__ import annotations
import gym.spaces import gymnasium as gym
import math import math
import torch import torch
......
...@@ -5,7 +5,7 @@ ...@@ -5,7 +5,7 @@
"""Environment for end-effector pose tracking task for fixed-arm robots.""" """Environment for end-effector pose tracking task for fixed-arm robots."""
import gym import gymnasium as gym
from . import agents from . import agents
...@@ -16,6 +16,7 @@ from . import agents ...@@ -16,6 +16,7 @@ from . import agents
gym.register( gym.register(
id="Isaac-Reach-Franka-v0", id="Isaac-Reach-Franka-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv", entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={ kwargs={
"env_cfg_entry_point": f"{__name__}.reach_env_cfg:ReachEnvCfg", "env_cfg_entry_point": f"{__name__}.reach_env_cfg:ReachEnvCfg",
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml", "rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
......
...@@ -5,7 +5,7 @@ ...@@ -5,7 +5,7 @@
from __future__ import annotations from __future__ import annotations
import gym.spaces import gymnasium as gym
import math import math
import torch import torch
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
from __future__ import annotations from __future__ import annotations
import gym import gymnasium as gym
import importlib import importlib
import inspect import inspect
import os import os
...@@ -52,7 +52,7 @@ def load_cfg_from_registry(task_name: str, entry_point_key: str) -> dict | Any: ...@@ -52,7 +52,7 @@ def load_cfg_from_registry(task_name: str, entry_point_key: str) -> dict | Any:
ValueError: If the entry point key is not available in the gym registry for the task. ValueError: If the entry point key is not available in the gym registry for the task.
""" """
# obtain the configuration entry point # obtain the configuration entry point
cfg_entry_point = gym.spec(task_name)._kwargs.pop(entry_point_key) cfg_entry_point = gym.spec(task_name).kwargs.pop(entry_point_key)
# check if entry point exists # check if entry point exists
if cfg_entry_point is None: if cfg_entry_point is None:
raise ValueError( raise ValueError(
......
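One detail of the lookup in the hunk above may be worth checking. On a plain Python `dict`, `pop(key)` without a default raises `KeyError` for a missing key and removes the entry when it is present, so a following `is None` check is only reached if the stored value is itself `None`. The sketch below assumes the spec's kwargs behave like a plain `dict`; `lookup_entry_point` and the package path are hypothetical names used only for illustration.

```python
def lookup_entry_point(spec_kwargs: dict, entry_point_key: str):
    """Return the entry point stored under the key without mutating the dict.

    Hypothetical helper; it uses ``get`` so that a missing key yields ``None``.
    """
    cfg_entry_point = spec_kwargs.get(entry_point_key)
    if cfg_entry_point is None:
        raise ValueError(f"Could not find configuration for key: {entry_point_key!r}")
    return cfg_entry_point


# hypothetical registry kwargs for a single task
kwargs = {"env_cfg_entry_point": "my_pkg.my_task.env_cfg:MyEnvCfg"}

# a present key is returned and stays in the dict, so repeated lookups work
entry = lookup_entry_point(kwargs, "env_cfg_entry_point")
entry_again = lookup_entry_point(kwargs, "env_cfg_entry_point")

# a missing key reaches the explicit ValueError
try:
    lookup_entry_point(kwargs, "rsl_rl_cfg_entry_point")
    message = ""
except ValueError as err:
    message = str(err)

# by contrast, dict.pop without a default raises KeyError for a missing key
try:
    dict(kwargs).pop("rsl_rl_cfg_entry_point")
    pop_raised_key_error = False
except KeyError:
    pop_raised_key_error = True

print(entry, "|", message, "|", pop_raised_key_error)
```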
...@@ -33,7 +33,7 @@ for RL-Games :class:`Runner` class: ...@@ -33,7 +33,7 @@ for RL-Games :class:`Runner` class:
from __future__ import annotations from __future__ import annotations
import gym import gymnasium as gym
import torch import torch
from rl_games.common import env_configurations from rl_games.common import env_configurations
...@@ -49,10 +49,10 @@ Vectorized environment wrapper. ...@@ -49,10 +49,10 @@ Vectorized environment wrapper.
""" """
class RlGamesVecEnvWrapper(gym.Wrapper): class RlGamesVecEnvWrapper(IVecEnv):
"""Wraps around Isaac Orbit environment for RL-Games. """Wraps around Orbit environment for RL-Games.
This class wraps around the Isaac Orbit environment. Since RL-Games works directly on This class wraps around the Orbit environment. Since RL-Games works directly on
GPU buffers, the wrapper handles moving of buffers from the simulation environment GPU buffers, the wrapper handles moving of buffers from the simulation environment
to the same device as the learning agent. Additionally, it performs clipping of to the same device as the learning agent. Additionally, it performs clipping of
observations and actions. observations and actions.
...@@ -69,6 +69,13 @@ class RlGamesVecEnvWrapper(gym.Wrapper): ...@@ -69,6 +69,13 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
checks if these attributes exist. If they don't then the wrapper defaults to zero as number checks if these attributes exist. If they don't then the wrapper defaults to zero as number
of privileged observations. of privileged observations.
.. caution::
This class must be the last wrapper in the wrapper chain. This is because the wrapper does not follow
the :class:`gym.Wrapper` interface. Any subsequent wrappers will need to be modified to work with this
wrapper.
Reference: Reference:
https://github.com/Denys88/rl_games/blob/master/rl_games/common/ivecenv.py https://github.com/Denys88/rl_games/blob/master/rl_games/common/ivecenv.py
https://github.com/NVIDIA-Omniverse/IsaacGymEnvs https://github.com/NVIDIA-Omniverse/IsaacGymEnvs
...@@ -85,30 +92,77 @@ class RlGamesVecEnvWrapper(gym.Wrapper): ...@@ -85,30 +92,77 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
Raises: Raises:
ValueError: The environment is not inherited from :class:`RLTaskEnv`. ValueError: The environment is not inherited from :class:`RLTaskEnv`.
ValueError: If specified, the privileged observations (critic) are not of type :obj:`gym.spaces.Box`.
""" """
# check that input is valid # check that input is valid
if not isinstance(env.unwrapped, RLTaskEnv): if not isinstance(env.unwrapped, RLTaskEnv):
raise ValueError(f"The environment must be inherited from RLTaskEnv. Environment type: {type(env)}") raise ValueError(f"The environment must be inherited from RLTaskEnv. Environment type: {type(env)}")
# initialize gym wrapper # initialize the wrapper
gym.Wrapper.__init__(self, env) self.env = env
# initialize rl-games vec-env
IVecEnv.__init__(self)
# store provided arguments # store provided arguments
self._rl_device = rl_device self._rl_device = rl_device
self._clip_obs = clip_obs self._clip_obs = clip_obs
self._clip_actions = clip_actions self._clip_actions = clip_actions
self._sim_device = env.unwrapped.device
# information about spaces for the wrapper # information about spaces for the wrapper
self.observation_space = self.env.observation_space # note: rl-games only wants single observation and action spaces
self.action_space = self.env.action_space self.rlg_observation_space = self.unwrapped.single_observation_space["policy"]
self.rlg_action_space = self.unwrapped.single_action_space
# information for privileged observations # information for privileged observations
self.state_space = getattr(self.env, "state_space", None) self.rlg_state_space = self.unwrapped.single_observation_space.get("critic")
self.num_states = getattr(self.env, "num_states", 0) if self.rlg_state_space is not None:
# print information about wrapper if not isinstance(self.rlg_state_space, gym.spaces.Box):
print("[INFO]: RL-Games Environment Wrapper:") raise ValueError(f"Privileged observations must be of type Box. Type: {type(self.rlg_state_space)}")
print(f"\t\t Observations clipping: {clip_obs}") self.rlg_num_states = self.rlg_state_space.shape[0]
print(f"\t\t Actions clipping : {clip_actions}") else:
print(f"\t\t Agent device : {rl_device}") self.rlg_num_states = 0
print(f"\t\t Asymmetric-learning : {self.num_states != 0}")
def __str__(self):
"""Returns the wrapper name and the :attr:`env` representation string."""
return (
f"<{type(self).__name__}{self.env}>"
f"\n\tObservations clipping: {self._clip_obs}"
f"\n\tActions clipping : {self._clip_actions}"
f"\n\tAgent device : {self._rl_device}"
f"\n\tAsymmetric-learning : {self.rlg_num_states != 0}"
)
def __repr__(self):
"""Returns the string representation of the wrapper."""
return str(self)
"""
Properties -- Gym.Wrapper
"""
@property
def render_mode(self) -> str | None:
"""Returns the :attr:`Env` :attr:`render_mode`."""
return self.env.render_mode
@property
def observation_space(self) -> gym.Space:
"""Returns the :attr:`Env` :attr:`observation_space`."""
return self.env.observation_space
@property
def action_space(self) -> gym.Space:
"""Returns the :attr:`Env` :attr:`action_space`."""
return self.env.action_space
@classmethod
def class_name(cls) -> str:
"""Returns the class name of the wrapper."""
return cls.__name__
@property
def unwrapped(self) -> RLTaskEnv:
"""Returns the base environment of the wrapper.
This will be the bare :class:`gymnasium.Env` environment, underneath all layers of wrappers.
"""
return self.env.unwrapped
""" """
Properties Properties
...@@ -120,40 +174,46 @@ class RlGamesVecEnvWrapper(gym.Wrapper): ...@@ -120,40 +174,46 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
def get_env_info(self) -> dict: def get_env_info(self) -> dict:
"""Returns the Gym spaces for the environment.""" """Returns the Gym spaces for the environment."""
# fill the env info dict return {
env_info = {"observation_space": self.observation_space, "action_space": self.action_space} "observation_space": self.rlg_observation_space,
# add information about privileged observations space "action_space": self.rlg_action_space,
if self.num_states > 0: "state_space": self.rlg_state_space,
env_info["state_space"] = self.state_space }
return env_info
""" """
Operations - MDP Operations - MDP
""" """
def seed(self, seed: int = -1) -> int: # noqa: D102
return self.unwrapped.seed(seed)
def reset(self): # noqa: D102 def reset(self): # noqa: D102
obs_dict = self.env.reset() obs_dict, _ = self.env.reset()
# process observations and states # process observations and states
return self._process_obs(obs_dict) return self._process_obs(obs_dict)
def step(self, actions): # noqa: D102 def step(self, actions): # noqa: D102
# move actions to sim-device
actions = actions.detach().clone().to(device=self._sim_device)
# clip the actions # clip the actions
actions = torch.clamp(actions.clone(), -self._clip_actions, self._clip_actions) actions = torch.clamp(actions, -self._clip_actions, self._clip_actions)
# perform environment step # perform environment step
obs_dict, rew, dones, extras = self.env.step(actions) obs_dict, rew, terminated, truncated, extras = self.env.step(actions)
# process observations and states # process observations and states
obs_and_states = self._process_obs(obs_dict) obs_and_states = self._process_obs(obs_dict)
# move buffers to rl-device # move buffers to rl-device
# note: we perform clone to prevent issues when rl-device and sim-device are the same. # note: we perform clone to prevent issues when rl-device and sim-device are the same.
rew = rew.to(self._rl_device) rew = rew.to(device=self._rl_device)
dones = dones.to(self._rl_device) dones = (terminated | truncated).to(device=self._rl_device)
extras = { extras = {
k: v.to(device=self._rl_device, non_blocking=True) if hasattr(v, "to") else v for k, v in extras.items() k: v.to(device=self._rl_device, non_blocking=True) if hasattr(v, "to") else v for k, v in extras.items()
} }
return obs_and_states, rew, dones, extras return obs_and_states, rew, dones, extras
def close(self): # noqa: D102
return self.env.close()
""" """
Helper functions Helper functions
""" """
...@@ -163,34 +223,29 @@ class RlGamesVecEnvWrapper(gym.Wrapper): ...@@ -163,34 +223,29 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
Note: Note:
States typically refers to privileged observations for the critic function. It is typically used in States typically refers to privileged observations for the critic function. It is typically used in
asymmetric actor-critic algorithms [1]. asymmetric actor-critic algorithms.
Args: Args:
obs: The current observations from environment. obs_dict: The current observations from environment.
Returns: Returns:
If environment provides states, then a dictionary If environment provides states, then a dictionary containing the observations and states is returned.
containing the observations and states is returned. Otherwise just the observations tensor Otherwise just the observations tensor is returned.
is returned.
Reference:
1. Pinto, Lerrel, et al. "Asymmetric actor critic for image-based robot learning."
arXiv preprint arXiv:1710.06542 (2017).
""" """
# process policy obs # process policy obs
obs = obs_dict["policy"] obs = obs_dict["policy"]
# clip the observations # clip the observations
obs = torch.clamp(obs, -self._clip_obs, self._clip_obs) obs = torch.clamp(obs, -self._clip_obs, self._clip_obs)
# move the buffer to rl-device # move the buffer to rl-device
obs = obs.to(self._rl_device).clone() obs = obs.to(device=self._rl_device).clone()
# check if asymmetric actor-critic or not # check if asymmetric actor-critic or not
if self.num_states > 0: if self.rlg_num_states > 0:
# acquire states from the environment if it exists # acquire states from the environment if it exists
try: try:
states = obs_dict["critic"] states = obs_dict["critic"]
except KeyError: except KeyError:
raise NotImplementedError("Environment does not define key `critic` for privileged observations.") raise NotImplementedError("Environment does not define key 'critic' for privileged observations.")
# clip the states # clip the states
states = torch.clamp(states, -self._clip_obs, self._clip_obs) states = torch.clamp(states, -self._clip_obs, self._clip_obs)
# move buffers to rl-device # move buffers to rl-device
......
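The wrapper changes above follow one pattern: hold the environment in `self.env` instead of inheriting from `gym.Wrapper`, unpack the `(obs, info)` pair from `reset()`, and fold `terminated` and `truncated` into a single `dones` signal for frameworks that expect a 4-tuple from `step()`. The sketch below reproduces that pattern with plain Python lists so that it runs without the simulator or PyTorch. `DummyVecEnv` and `FourTupleAdapter` are hypothetical names, not classes from the repository.

```python
class DummyVecEnv:
    """Hypothetical stand-in for a Gymnasium-style vectorized environment."""

    def __init__(self, num_envs: int = 3):
        self.num_envs = num_envs
        self._step_count = 0

    @property
    def unwrapped(self):
        return self

    def reset(self):
        # Gymnasium-style reset returns (observations, info)
        self._step_count = 0
        return {"policy": [0.0] * self.num_envs}, {}

    def step(self, actions):
        # Gymnasium-style step returns a 5-tuple
        self._step_count += 1
        obs = {"policy": [float(a) for a in actions]}
        rew = [1.0] * self.num_envs
        terminated = [a > 1.0 for a in actions]  # task-specific failure
        truncated = [self._step_count >= 2] * self.num_envs  # time limit
        return obs, rew, terminated, truncated, {}


class FourTupleAdapter:
    """Adapter that exposes a 4-tuple step() without subclassing gym.Wrapper."""

    def __init__(self, env):
        self.env = env

    @property
    def unwrapped(self):
        # delegate to the wrapped environment, as the wrappers in the diff do
        return self.env.unwrapped

    def reset(self):
        obs_dict, _ = self.env.reset()  # drop the info dict
        return obs_dict["policy"]

    def step(self, actions):
        obs_dict, rew, terminated, truncated, extras = self.env.step(actions)
        # an episode is done if it either terminated or was truncated
        dones = [t or u for t, u in zip(terminated, truncated)]
        return obs_dict["policy"], rew, dones, extras


env = FourTupleAdapter(DummyVecEnv(num_envs=3))
first_obs = env.reset()
obs, rew, dones, extras = env.step([0.5, 2.0, -1.0])
print(dones)  # [False, True, False]
```

Because the adapter no longer exposes the `gym.Wrapper` interface, it has to sit last in any wrapper chain, which matches the caution added to the docstrings in this commit.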
...@@ -17,22 +17,28 @@ The following example shows how to wrap an environment for RSL-RL: ...@@ -17,22 +17,28 @@ The following example shows how to wrap an environment for RSL-RL:
from __future__ import annotations from __future__ import annotations
import gym import gymnasium as gym
import gym.spaces
import torch import torch
from rsl_rl.env import VecEnv
from omni.isaac.orbit.envs import RLTaskEnv from omni.isaac.orbit.envs import RLTaskEnv
class RslRlVecEnvWrapper(gym.Wrapper): class RslRlVecEnvWrapper(VecEnv):
"""Wraps around Isaac Orbit environment for RSL-RL library """Wraps around Orbit environment for RSL-RL library
To use asymmetric actor-critic, the environment instance must have the attribute :attr:`num_privileged_obs` (int).
This is used by the learning agent to allocate buffers in the trajectory memory. Additionally, the returned
observations should have the key "critic" which corresponds to the privileged observations. Since this is
optional for some environments, the wrapper checks whether the "critic" observation group exists. If it doesn't,
the wrapper defaults to zero as the number of privileged observations.
.. caution::
To use asymmetric actor-critic, the environment instance must have the attributes :attr:`num_states` (int) This class must be the last wrapper in the wrapper chain. This is because the wrapper does not follow
and :attr:`state_space` (:obj:`gym.spaces.Box`). These are used by the learning agent to allocate buffers in the :class:`gym.Wrapper` interface. Any subsequent wrappers will need to be modified to work with this
the trajectory memory. Additionally, the method :meth:`_get_observations()` should have the key "critic" wrapper.
which corresponds to the privileged observations. Since this is optional for some environments, the wrapper
checks if these attributes exist. If they don't then the wrapper defaults to zero as number of privileged
observations.
Reference: Reference:
https://github.com/leggedrobotics/rsl_rl/blob/master/rsl_rl/env/vec_env.py https://github.com/leggedrobotics/rsl_rl/blob/master/rsl_rl/env/vec_env.py
...@@ -41,6 +47,9 @@ class RslRlVecEnvWrapper(gym.Wrapper): ...@@ -41,6 +47,9 @@ class RslRlVecEnvWrapper(gym.Wrapper):
def __init__(self, env: RLTaskEnv): def __init__(self, env: RLTaskEnv):
"""Initializes the wrapper. """Initializes the wrapper.
Note:
The wrapper calls :meth:`reset` at the start since the RSL-RL runner does not call reset.
Args: Args:
env: The environment to wrap around. env: The environment to wrap around.
...@@ -51,28 +60,74 @@ class RslRlVecEnvWrapper(gym.Wrapper): ...@@ -51,28 +60,74 @@ class RslRlVecEnvWrapper(gym.Wrapper):
if not isinstance(env.unwrapped, RLTaskEnv): if not isinstance(env.unwrapped, RLTaskEnv):
raise ValueError(f"The environment must be inherited from RLTaskEnv. Environment type: {type(env)}") raise ValueError(f"The environment must be inherited from RLTaskEnv. Environment type: {type(env)}")
# initialize the wrapper # initialize the wrapper
gym.Wrapper.__init__(self, env) self.env = env
# store information required by wrapper # store information required by wrapper
orbit_env: RLTaskEnv = self.env.unwrapped self.num_envs = self.unwrapped.num_envs
self.num_envs = orbit_env.num_envs self.device = self.unwrapped.device
self.num_actions = orbit_env.action_manager.total_action_dim self.max_episode_length = self.unwrapped.max_episode_length
self.num_obs = orbit_env.observation_manager.group_obs_dim["policy"][0] self.num_actions = self.unwrapped.action_manager.total_action_dim
self.num_obs = self.unwrapped.observation_manager.group_obs_dim["policy"][0]
# -- privileged observations
if "critic" in self.unwrapped.observation_manager.group_obs_dim:
self.num_privileged_obs = self.unwrapped.observation_manager.group_obs_dim["critic"][0]
else:
self.num_privileged_obs = 0
# reset at the start since the RSL-RL runner does not call reset # reset at the start since the RSL-RL runner does not call reset
self.env.reset() self.env.reset()
def __str__(self):
"""Returns the wrapper name and the :attr:`env` representation string."""
return f"<{type(self).__name__}{self.env}>"
def __repr__(self):
"""Returns the string representation of the wrapper."""
return str(self)
"""
Properties -- Gym.Wrapper
"""
@property
def render_mode(self) -> str | None:
"""Returns the :attr:`Env` :attr:`render_mode`."""
return self.env.render_mode
@property
def observation_space(self) -> gym.Space:
"""Returns the :attr:`Env` :attr:`observation_space`."""
return self.env.observation_space
@property
def action_space(self) -> gym.Space:
"""Returns the :attr:`Env` :attr:`action_space`."""
return self.env.action_space
@classmethod
def class_name(cls) -> str:
"""Returns the class name of the wrapper."""
return cls.__name__
@property
def unwrapped(self) -> RLTaskEnv:
"""Returns the base environment of the wrapper.
This will be the bare :class:`gymnasium.Env` environment, underneath all layers of wrappers.
"""
return self.env.unwrapped
""" """
Properties Properties
""" """
def get_observations(self) -> torch.Tensor: def get_observations(self) -> tuple[torch.Tensor, dict]:
"""Returns the current observations of the environment.""" """Returns the current observations of the environment."""
obs_dict = self.env.unwrapped.observation_manager.compute() obs_dict = self.unwrapped.observation_manager.compute()
return obs_dict["policy"], {"observations": obs_dict} return obs_dict["policy"], {"observations": obs_dict}
@property @property
def episode_length_buf(self) -> torch.Tensor: def episode_length_buf(self) -> torch.Tensor:
"""The episode length buffer.""" """The episode length buffer."""
return self.env.unwrapped.episode_length_buf return self.unwrapped.episode_length_buf
@episode_length_buf.setter @episode_length_buf.setter
def episode_length_buf(self, value: torch.Tensor): def episode_length_buf(self, value: torch.Tensor):
...@@ -80,22 +135,34 @@ class RslRlVecEnvWrapper(gym.Wrapper): ...@@ -80,22 +135,34 @@ class RslRlVecEnvWrapper(gym.Wrapper):
Note: This is needed to perform random initialization of episode lengths in RSL-RL. Note: This is needed to perform random initialization of episode lengths in RSL-RL.
""" """
self.env.unwrapped.episode_length_buf = value self.unwrapped.episode_length_buf = value
""" """
Operations - MDP Operations - MDP
""" """
def reset(self) -> tuple[torch.Tensor, dict]: def seed(self, seed: int = -1) -> int: # noqa: D102
return self.unwrapped.seed(seed)
def reset(self) -> tuple[torch.Tensor, dict]: # noqa: D102
# reset the environment # reset the environment
obs_dict = self.env.reset() obs_dict, _ = self.env.reset()
# return observations # return observations
return obs_dict["policy"], {"observations": obs_dict} return obs_dict["policy"], {"observations": obs_dict}
def step(self, actions: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, dict]: def step(self, actions: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, dict]:
# record step information # record step information
obs_dict, rew, dones, extras = self.env.step(actions) obs_dict, rew, terminated, truncated, extras = self.env.step(actions)
# return step information # compute dones for compatibility with RSL-RL
dones = (terminated | truncated).to(dtype=torch.long)
# move extra observations to the extras dict
obs = obs_dict["policy"] obs = obs_dict["policy"]
extras["observations"] = obs_dict extras["observations"] = obs_dict
# move time out information to the extras dict
extras["time_outs"] = truncated
# return the step information
return obs, rew, dones, extras return obs, rew, dones, extras
def close(self): # noqa: D102
return self.env.close()
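The `step` change above adapts Gymnasium's five-value return to the four-value tuple that RSL-RL expects. A minimal sketch of that conversion follows, using plain Python lists in place of `torch` tensors so it runs without a simulator. The helper name `to_rsl_rl_step` is hypothetical and not part of the wrapper.

```python
from __future__ import annotations


def to_rsl_rl_step(obs_dict: dict, rew: list, terminated: list, truncated: list, extras: dict):
    """Convert a Gymnasium-style step result into an RSL-RL-style 4-tuple.

    Hypothetical helper for illustration; lists stand in for batched tensors.
    """
    # an environment is "done" if it either terminated or was truncated (timed out)
    dones = [int(term or trunc) for term, trunc in zip(terminated, truncated)]
    # copy the extras so the caller's dictionary is not mutated
    extras = dict(extras)
    # expose the full observation dictionary and the time-out flags to the learner
    extras["observations"] = obs_dict
    extras["time_outs"] = truncated
    return obs_dict["policy"], rew, dones, extras


obs, rew, dones, extras = to_rsl_rl_step(
    {"policy": [[0.1], [0.2], [0.3]]},
    [1.0, 0.5, 0.0],
    terminated=[False, True, False],
    truncated=[False, False, True],
    extras={},
)
print(dones)  # [0, 1, 1]
```

Keeping `time_outs` separate from `dones` lets the learner tell a time-limit truncation apart from a true termination, which the merged flag alone cannot express.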
...@@ -93,9 +93,9 @@ Vectorized environment wrapper. ...@@ -93,9 +93,9 @@ Vectorized environment wrapper.
def SkrlVecEnvWrapper(env: RLTaskEnv): def SkrlVecEnvWrapper(env: RLTaskEnv):
"""Wraps around Isaac Orbit environment for skrl. """Wraps around Orbit environment for skrl.
This function wraps around the Isaac Orbit environment. Since the :class:`RLTaskEnv` environment This function wraps around the Orbit environment. Since the :class:`RLTaskEnv` environment
wrapping functionality is defined within the skrl library itself, this implementation wrapping functionality is defined within the skrl library itself, this implementation
is maintained for compatibility with the structure of the extension that contains it. is maintained for compatibility with the structure of the extension that contains it.
Internally it calls the :func:`wrap_env` from the skrl library API. Internally it calls the :func:`wrap_env` from the skrl library API.
......
...@@ -22,18 +22,21 @@ INSTALL_REQUIRES = [ ...@@ -22,18 +22,21 @@ INSTALL_REQUIRES = [
"numpy", "numpy",
"torch", "torch",
"torchvision>=0.14.1", # ensure compatibility with torch 1.13.1 "torchvision>=0.14.1", # ensure compatibility with torch 1.13.1
"protobuf==3.20.2", "protobuf>=3.20.2",
# data collection # data collection
"h5py", "h5py",
# basic logger
"tensorboard",
# video recording
"moviepy",
] ]
# Extra dependencies for RL agents # Extra dependencies for RL agents
EXTRAS_REQUIRE = { EXTRAS_REQUIRE = {
"sb3": ["stable-baselines3>=1.5,<=1.8", "tensorboard"], "sb3": ["stable-baselines3>=2.0"],
"skrl": ["skrl>=0.10.0"], "skrl": ["skrl>=0.10.0"],
"rl_games": ["rl-games==1.5.2"], "rl_games": ["rl-games==1.6.1"],
# TODO: Uncomment when rsl_rl is updated to public. "rsl_rl": ["rsl_rl@git+https://github.com/leggedrobotics/rsl_rl.git"],
# "rsl_rl": ["rsl_rl@git+https://github.com/leggedrobotics/rsl_rl.git"],
"robomimic": ["robomimic@git+https://github.com/ARISE-Initiative/robomimic.git"], "robomimic": ["robomimic@git+https://github.com/ARISE-Initiative/robomimic.git"],
} }
# cumulation of all extra-requires # cumulation of all extra-requires
...@@ -43,7 +46,7 @@ EXTRAS_REQUIRE["all"] = list(itertools.chain.from_iterable(EXTRAS_REQUIRE.values ...@@ -43,7 +46,7 @@ EXTRAS_REQUIRE["all"] = list(itertools.chain.from_iterable(EXTRAS_REQUIRE.values
# Installation operation # Installation operation
setup( setup(
name="omni-isaac-orbit_tasks", name="omni-isaac-orbit_tasks",
author="NVIDIA, ETH Zurich, and University of Toronto", author="ORBIT Project Developers",
maintainer="Mayank Mittal", maintainer="Mayank Mittal",
maintainer_email="mittalma@ethz.ch", maintainer_email="mittalma@ethz.ch",
url=EXTENSION_TOML_DATA["package"]["repository"], url=EXTENSION_TOML_DATA["package"]["repository"],
...@@ -55,6 +58,10 @@ setup( ...@@ -55,6 +58,10 @@ setup(
install_requires=INSTALL_REQUIRES, install_requires=INSTALL_REQUIRES,
extras_require=EXTRAS_REQUIRE, extras_require=EXTRAS_REQUIRE,
packages=["omni.isaac.orbit_tasks"], packages=["omni.isaac.orbit_tasks"],
classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"], classifiers=[
"Natural Language :: English",
"Programming Language :: Python :: 3.10",
"Isaac Sim :: 2023.1.0-hotfix.1",
],
zip_safe=False, zip_safe=False,
) )
...@@ -20,8 +20,7 @@ simulation_app = app_launcher.app ...@@ -20,8 +20,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import gym.envs
import torch import torch
import traceback import traceback
import unittest import unittest
...@@ -42,7 +41,7 @@ class TestEnvironments(unittest.TestCase): ...@@ -42,7 +41,7 @@ class TestEnvironments(unittest.TestCase):
def setUpClass(cls): def setUpClass(cls):
# acquire all Isaac environments names # acquire all Isaac environments names
cls.registered_tasks = list() cls.registered_tasks = list()
for task_spec in gym.envs.registry.all(): for task_spec in gym.registry.values():
if "Isaac" in task_spec.id: if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id) cls.registered_tasks.append(task_spec.id)
# sort environments by name # sort environments by name
...@@ -70,19 +69,20 @@ class TestEnvironments(unittest.TestCase): ...@@ -70,19 +69,20 @@ class TestEnvironments(unittest.TestCase):
env: RLTaskEnv = gym.make(task_name, cfg=env_cfg) env: RLTaskEnv = gym.make(task_name, cfg=env_cfg)
# reset environment # reset environment
obs = env.reset() obs, _ = env.reset()
# check signal # check signal
self.assertTrue(self._check_valid_tensor(obs)) self.assertTrue(self._check_valid_tensor(obs))
# simulate environment for 1000 steps # simulate environment for 1000 steps
for _ in range(1000): with torch.inference_mode():
# sample actions from -1 to 1 for _ in range(1000):
actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1 # sample actions from -1 to 1
# apply actions actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
transition = env.step(actions) # apply actions
# check signals transition = env.step(actions)
for data in transition: # check signals
self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}") for data in transition:
self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
# close the environment # close the environment
print(f">>> Closing environment: {task_name}") print(f">>> Closing environment: {task_name}")
...@@ -108,9 +108,9 @@ class TestEnvironments(unittest.TestCase): ...@@ -108,9 +108,9 @@ class TestEnvironments(unittest.TestCase):
valid_tensor = True valid_tensor = True
for value in data.values(): for value in data.values():
if isinstance(value, dict): if isinstance(value, dict):
return TestEnvironments._check_valid_tensor(value) valid_tensor &= TestEnvironments._check_valid_tensor(value)
elif isinstance(value, torch.Tensor): elif isinstance(value, torch.Tensor):
valid_tensor = valid_tensor and not torch.any(torch.isnan(value)) valid_tensor &= not torch.any(torch.isnan(value))
return valid_tensor return valid_tensor
else: else:
raise ValueError(f"Input data of invalid type: {type(data)}.") raise ValueError(f"Input data of invalid type: {type(data)}.")
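The change from `return` to `&=` matters when a dictionary holds more than one entry: returning on the first nested dictionary skips every value that comes after it. A minimal sketch of the accumulate-then-return pattern follows, using floats and `math.isnan` in place of tensors (an assumption made so the example is self-contained). The name `check_valid` is hypothetical.

```python
from __future__ import annotations

import math


def check_valid(data: float | dict) -> bool:
    """Return True if no value in a (possibly nested) dictionary is NaN."""
    if isinstance(data, float):
        return not math.isnan(data)
    elif isinstance(data, dict):
        valid = True
        for value in data.values():
            if isinstance(value, dict):
                # accumulate instead of returning, so later entries are still checked
                valid &= check_valid(value)
            elif isinstance(value, float):
                valid &= not math.isnan(value)
        return valid
    else:
        raise ValueError(f"Input data of invalid type: {type(data)}.")


# the nested dictionary is clean, but the entry after it is corrupted
sample = {"nested": {"x": 1.0}, "reward": float("nan")}
print(check_valid(sample))  # False
```

With an early `return` inside the loop, the same input would be reported as valid, because the check would stop after the first nested dictionary.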
......
...@@ -19,7 +19,7 @@ simulation_app = app_launcher.app ...@@ -19,7 +19,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import os import os
import torch import torch
import traceback import traceback
...@@ -42,7 +42,7 @@ class TestRecordVideoWrapper(unittest.TestCase): ...@@ -42,7 +42,7 @@ class TestRecordVideoWrapper(unittest.TestCase):
def setUpClass(cls): def setUpClass(cls):
# acquire all Isaac environments names # acquire all Isaac environments names
cls.registered_tasks = list() cls.registered_tasks = list()
for task_spec in gym.envs.registry.all(): for task_spec in gym.registry.values():
if "Isaac" in task_spec.id: if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id) cls.registered_tasks.append(task_spec.id)
# sort environments by name # sort environments by name
...@@ -73,25 +73,24 @@ class TestRecordVideoWrapper(unittest.TestCase): ...@@ -73,25 +73,24 @@ class TestRecordVideoWrapper(unittest.TestCase):
env_cfg.sim.shutdown_app_on_stop = False env_cfg.sim.shutdown_app_on_stop = False
# create environment # create environment
env: RLTaskEnv = gym.make(task_name, cfg=env_cfg) env: RLTaskEnv = gym.make(task_name, cfg=env_cfg, render_mode="rgb_array")
# directory to save videos # directory to save videos
videos_dir = os.path.join(self.videos_dir, task_name) videos_dir = os.path.join(self.videos_dir, task_name)
# wrap environment to record videos # wrap environment to record videos
env = gym.wrappers.RecordVideo( env = gym.wrappers.RecordVideo(
env, videos_dir, step_trigger=self.step_trigger, video_length=self.video_length env, videos_dir, step_trigger=self.step_trigger, video_length=self.video_length, disable_logger=True
) )
# reset environment # reset environment
env.reset() env.reset()
# simulate environment # simulate environment
for _ in range(500): with torch.inference_mode():
# compute zero actions for _ in range(500):
actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1 # sample actions from -1 to 1
# apply actions actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
_ = env.step(actions) # apply actions
# render environment _ = env.step(actions)
env.render(mode="human")
# close the simulator # close the simulator
env.close() env.close()
......
# Copyright (c) 2022-2023, The ORBIT Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
"""Launch Isaac Sim Simulator first."""
import os
from omni.isaac.orbit.app import AppLauncher
# launch the simulator
app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
app_launcher = AppLauncher(headless=True, experience=app_experience)
simulation_app = app_launcher.app
"""Rest everything follows."""
import gymnasium as gym
import torch
import traceback
import unittest
import carb
import omni.usd
from omni.isaac.orbit.envs import RLTaskEnvCfg
import omni.isaac.orbit_tasks # noqa: F401
from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
from omni.isaac.orbit_tasks.utils.wrappers.rl_games import RlGamesVecEnvWrapper
class TestRlGamesVecEnvWrapper(unittest.TestCase):
"""Test that RL-Games VecEnv wrapper works as expected."""
@classmethod
def setUpClass(cls):
# acquire all Isaac environments names
cls.registered_tasks = list()
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id)
# sort environments by name
cls.registered_tasks.sort()
# only pick the first three environments to test
cls.registered_tasks = cls.registered_tasks[:3]
# print all existing task names
print(">>> All registered environments:", cls.registered_tasks)
def setUp(self) -> None:
# common parameters
self.num_envs = 512
self.use_gpu = True
def test_random_actions(self):
"""Run random actions and check environments return valid signals."""
for task_name in self.registered_tasks:
print(f">>> Running test for environment: {task_name}")
# create a new stage
omni.usd.get_context().new_stage()
# parse configuration
env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
# note: we don't want to shutdown the app on stop during the tests since we reload the stage
env_cfg.sim.shutdown_app_on_stop = False
# create environment
env = gym.make(task_name, cfg=env_cfg)
# wrap environment
env = RlGamesVecEnvWrapper(env, "cuda:0", 100, 100)
# reset environment
obs = env.reset()
# check signal
self.assertTrue(self._check_valid_tensor(obs))
# simulate environment for 100 steps
with torch.inference_mode():
for _ in range(100):
# sample actions from -1 to 1
actions = 2 * torch.rand(env.action_space.shape, device=env.device) - 1
# apply actions
transition = env.step(actions)
# check signals
for data in transition:
self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
# close the environment
print(f">>> Closing environment: {task_name}")
env.close()
"""
Helper functions.
"""
@staticmethod
def _check_valid_tensor(data: torch.Tensor | dict) -> bool:
"""Checks if given data does not have corrupted values.
Args:
data: Data buffer.
Returns:
True if the data is valid.
"""
if isinstance(data, torch.Tensor):
return not torch.any(torch.isnan(data))
elif isinstance(data, dict):
valid_tensor = True
for value in data.values():
if isinstance(value, dict):
valid_tensor &= TestRlGamesVecEnvWrapper._check_valid_tensor(value)
elif isinstance(value, torch.Tensor):
valid_tensor &= not torch.any(torch.isnan(value))
return valid_tensor
else:
raise ValueError(f"Input data of invalid type: {type(data)}.")
if __name__ == "__main__":
try:
unittest.main()
except Exception as err:
carb.log_error(err)
carb.log_error(traceback.format_exc())
raise
finally:
# close sim app
simulation_app.close()
# Copyright (c) 2022-2023, The ORBIT Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
"""Launch Isaac Sim Simulator first."""
import os
from omni.isaac.orbit.app import AppLauncher
# launch the simulator
app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
app_launcher = AppLauncher(headless=True, experience=app_experience)
simulation_app = app_launcher.app
"""Rest everything follows."""
import gymnasium as gym
import torch
import traceback
import unittest
import carb
import omni.usd
from omni.isaac.orbit.envs import RLTaskEnvCfg
import omni.isaac.orbit_tasks # noqa: F401
from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
from omni.isaac.orbit_tasks.utils.wrappers.rsl_rl import RslRlVecEnvWrapper
class TestRslRlVecEnvWrapper(unittest.TestCase):
"""Test that RSL-RL VecEnv wrapper works as expected."""
@classmethod
def setUpClass(cls):
# acquire all Isaac environments names
cls.registered_tasks = list()
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id)
# sort environments by name
cls.registered_tasks.sort()
# only pick the first three environments to test
cls.registered_tasks = cls.registered_tasks[:3]
# print all existing task names
print(">>> All registered environments:", cls.registered_tasks)
def setUp(self) -> None:
# common parameters
self.num_envs = 512
self.use_gpu = True
def test_random_actions(self):
"""Run random actions and check environments return valid signals."""
for task_name in self.registered_tasks:
print(f">>> Running test for environment: {task_name}")
# create a new stage
omni.usd.get_context().new_stage()
# parse configuration
env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
# note: we don't want to shutdown the app on stop during the tests since we reload the stage
env_cfg.sim.shutdown_app_on_stop = False
# create environment
env = gym.make(task_name, cfg=env_cfg)
# wrap environment
env = RslRlVecEnvWrapper(env)
# reset environment
obs, extras = env.reset()
# check signal
self.assertTrue(self._check_valid_tensor(obs))
self.assertTrue(self._check_valid_tensor(extras))
# simulate environment for 1000 steps
with torch.inference_mode():
for _ in range(1000):
# sample actions from -1 to 1
actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
# apply actions
transition = env.step(actions)
# check signals
for data in transition:
self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
# close the environment
print(f">>> Closing environment: {task_name}")
env.close()
"""
Helper functions.
"""
@staticmethod
def _check_valid_tensor(data: torch.Tensor | dict) -> bool:
"""Checks if given data does not have corrupted values.
Args:
data: Data buffer.
Returns:
True if the data is valid.
"""
if isinstance(data, torch.Tensor):
return not torch.any(torch.isnan(data))
elif isinstance(data, dict):
valid_tensor = True
for value in data.values():
if isinstance(value, dict):
valid_tensor &= TestRslRlVecEnvWrapper._check_valid_tensor(value)
elif isinstance(value, torch.Tensor):
valid_tensor &= not torch.any(torch.isnan(value))
return valid_tensor
else:
raise ValueError(f"Input data of invalid type: {type(data)}.")
if __name__ == "__main__":
try:
unittest.main()
except Exception as err:
carb.log_error(err)
carb.log_error(traceback.format_exc())
raise
finally:
# close sim app
simulation_app.close()
# Copyright (c) 2022-2023, The ORBIT Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
"""Launch Isaac Sim Simulator first."""
import os
from omni.isaac.orbit.app import AppLauncher
# launch the simulator
app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
app_launcher = AppLauncher(headless=True, experience=app_experience)
simulation_app = app_launcher.app
"""Rest everything follows."""
import gymnasium as gym
import numpy as np
import torch
import traceback
import unittest
import carb
import omni.usd
from omni.isaac.orbit.envs import RLTaskEnvCfg
import omni.isaac.orbit_tasks # noqa: F401
from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
from omni.isaac.orbit_tasks.utils.wrappers.sb3 import Sb3VecEnvWrapper
class TestStableBaselines3VecEnvWrapper(unittest.TestCase):
"""Test that Stable-Baselines3 VecEnv wrapper works as expected."""
@classmethod
def setUpClass(cls):
# acquire all Isaac environments names
cls.registered_tasks = list()
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id)
# sort environments by name
cls.registered_tasks.sort()
# only pick the first three environments to test
cls.registered_tasks = cls.registered_tasks[:3]
# print all existing task names
print(">>> All registered environments:", cls.registered_tasks)
def setUp(self) -> None:
# common parameters
self.num_envs = 512
self.use_gpu = True
def test_random_actions(self):
"""Run random actions and check environments return valid signals."""
for task_name in self.registered_tasks:
print(f">>> Running test for environment: {task_name}")
# create a new stage
omni.usd.get_context().new_stage()
# parse configuration
env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
# note: we don't want to shutdown the app on stop during the tests since we reload the stage
env_cfg.sim.shutdown_app_on_stop = False
# create environment
env = gym.make(task_name, cfg=env_cfg)
# wrap environment
env = Sb3VecEnvWrapper(env)
# reset environment
obs = env.reset()
# check signal
self.assertTrue(self._check_valid_array(obs))
# simulate environment for 1000 steps
with torch.inference_mode():
for _ in range(1000):
# sample actions from -1 to 1
actions = 2 * np.random.rand(env.num_envs, *env.action_space.shape) - 1
# apply actions
transition = env.step(actions)
# check signals
for data in transition:
self.assertTrue(self._check_valid_array(data), msg=f"Invalid data: {data}")
# close the environment
print(f">>> Closing environment: {task_name}")
env.close()
"""
Helper functions.
"""
@staticmethod
def _check_valid_array(data: np.ndarray | dict | list) -> bool:
"""Checks if given data does not have corrupted values.
Args:
data: Data buffer.
Returns:
True if the data is valid.
"""
if isinstance(data, np.ndarray):
return not np.any(np.isnan(data))
elif isinstance(data, dict):
valid_array = True
for value in data.values():
if isinstance(value, dict):
valid_array &= TestStableBaselines3VecEnvWrapper._check_valid_array(value)
elif isinstance(value, np.ndarray):
valid_array &= not np.any(np.isnan(value))
return valid_array
elif isinstance(data, list):
valid_array = True
for value in data:
valid_array &= TestStableBaselines3VecEnvWrapper._check_valid_array(value)
return valid_array
else:
raise ValueError(f"Input data of invalid type: {type(data)}.")
if __name__ == "__main__":
try:
unittest.main()
except Exception as err:
carb.log_error(err)
carb.log_error(traceback.format_exc())
raise
finally:
# close sim app
simulation_app.close()
...@@ -27,7 +27,7 @@ simulation_app = app_launcher.app ...@@ -27,7 +27,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
from prettytable import PrettyTable from prettytable import PrettyTable
import omni.isaac.contrib_tasks # noqa: F401 import omni.isaac.contrib_tasks # noqa: F401
...@@ -47,10 +47,10 @@ def main(): ...@@ -47,10 +47,10 @@ def main():
# count of environments # count of environments
index = 0 index = 0
# acquire all Isaac environments names # acquire all Isaac environments names
for task_spec in gym.envs.registry.all(): for task_spec in gym.registry.values():
if "Isaac" in task_spec.id: if "Isaac" in task_spec.id:
# add details to table # add details to table
table.add_row([index + 1, task_spec.id, task_spec.entry_point, task_spec._kwargs["env_cfg_entry_point"]]) table.add_row([index + 1, task_spec.id, task_spec.entry_point, task_spec.kwargs["env_cfg_entry_point"]])
# increment count # increment count
index += 1 index += 1
...@@ -61,6 +61,8 @@ if __name__ == "__main__": ...@@ -61,6 +61,8 @@ if __name__ == "__main__":
try: try:
# run the main function # run the main function
main() main()
except Exception as e:
raise e
finally: finally:
# close the app # close the app
simulation_app.close() simulation_app.close()
...@@ -15,7 +15,7 @@ import argparse ...@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher from omni.isaac.orbit.app import AppLauncher
# add argparse arguments # add argparse arguments
parser = argparse.ArgumentParser(description="Random agent for Isaac Orbit environments.") parser = argparse.ArgumentParser(description="Random agent for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.") parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.") parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
parser.add_argument("--task", type=str, default=None, help="Name of the task.") parser.add_argument("--task", type=str, default=None, help="Name of the task.")
...@@ -31,7 +31,7 @@ simulation_app = app_launcher.app ...@@ -31,7 +31,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import torch import torch
import traceback import traceback
...@@ -43,12 +43,15 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg ...@@ -43,12 +43,15 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
def main(): def main():
"""Random actions agent with Isaac Orbit environment.""" """Random actions agent with Orbit environment."""
# parse configuration # parse configuration
env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=args_cli.num_envs) env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=args_cli.num_envs)
# create environment # create environment
env = gym.make(args_cli.task, cfg=env_cfg) env = gym.make(args_cli.task, cfg=env_cfg)
# print info (this is vectorized environment)
print(f"[INFO]: Gym observation space: {env.observation_space}")
print(f"[INFO]: Gym action space: {env.action_space}")
# reset environment # reset environment
env.reset() env.reset()
# simulate environment # simulate environment
...@@ -56,9 +59,9 @@ def main(): ...@@ -56,9 +59,9 @@ def main():
# run everything in inference mode # run everything in inference mode
with torch.inference_mode(): with torch.inference_mode():
# sample actions from -1 to 1 # sample actions from -1 to 1
actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1 actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
# apply actions # apply actions
_, _, _, _ = env.step(actions) env.step(actions)
# close the simulator # close the simulator
env.close() env.close()
......
...@@ -36,7 +36,7 @@ simulation_app = app_launcher.app ...@@ -36,7 +36,7 @@ simulation_app = app_launcher.app
"""Rest everything else.""" """Rest everything else."""
import gym import gymnasium as gym
import torch import torch
import traceback import traceback
from enum import Enum from enum import Enum
......
...@@ -15,7 +15,7 @@ import argparse ...@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher from omni.isaac.orbit.app import AppLauncher
# add argparse arguments # add argparse arguments
parser = argparse.ArgumentParser(description="Keyboard teleoperation for Isaac Orbit environments.") parser = argparse.ArgumentParser(description="Keyboard teleoperation for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.") parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.") parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
parser.add_argument("--device", type=str, default="keyboard", help="Device for interacting with environment") parser.add_argument("--device", type=str, default="keyboard", help="Device for interacting with environment")
...@@ -33,7 +33,7 @@ simulation_app = app_launcher.app ...@@ -33,7 +33,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import torch import torch
import traceback import traceback
......
...@@ -15,7 +15,7 @@ import argparse ...@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher from omni.isaac.orbit.app import AppLauncher
# add argparse arguments # add argparse arguments
parser = argparse.ArgumentParser(description="Zero agent for Isaac Orbit environments.") parser = argparse.ArgumentParser(description="Zero agent for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.") parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.") parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
parser.add_argument("--task", type=str, default=None, help="Name of the task.") parser.add_argument("--task", type=str, default=None, help="Name of the task.")
...@@ -30,7 +30,7 @@ simulation_app = app_launcher.app ...@@ -30,7 +30,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import torch import torch
import traceback import traceback
...@@ -42,12 +42,15 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg ...@@ -42,12 +42,15 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
def main(): def main():
"""Zero actions agent with Isaac Orbit environment.""" """Zero actions agent with Orbit environment."""
# parse configuration # parse configuration
env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=args_cli.num_envs) env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=args_cli.num_envs)
# create environment # create environment
env = gym.make(args_cli.task, cfg=env_cfg) env = gym.make(args_cli.task, cfg=env_cfg)
# print info (this is vectorized environment)
print(f"[INFO]: Gym observation space: {env.observation_space}")
print(f"[INFO]: Gym action space: {env.action_space}")
# reset environment # reset environment
env.reset() env.reset()
# simulate environment # simulate environment
...@@ -55,9 +58,9 @@ def main(): ...@@ -55,9 +58,9 @@ def main():
# run everything in inference mode # run everything in inference mode
with torch.inference_mode(): with torch.inference_mode():
# compute zero actions # compute zero actions
actions = torch.zeros((env.num_envs, env.action_space.shape[0]), device=env.device) actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
# apply actions # apply actions
_, _, _, _ = env.step(actions) env.step(actions)
# close the simulator # close the simulator
env.close() env.close()
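
Note on the hunk above: the old call unpacked four values from `env.step`, while Gymnasium's `step` returns five (`obs, reward, terminated, truncated, info`), which is presumably why the unpacking was dropped. For callers that still hold old-style results, a minimal sketch of the conversion might look like the following. The helper name `to_five_tuple` is hypothetical and not part of Orbit or Gymnasium. It assumes scalar flags and the old `TimeLimit.truncated` info key, whereas Orbit's vectorized environments return batched tensors.

```python
def to_five_tuple(step_result):
    """Convert an old-style step result to the five-value form.

    Hypothetical helper for illustration only; assumes scalar (non-batched) flags.
    """
    # Results that already have five values pass through unchanged.
    if len(step_result) == 5:
        return tuple(step_result)
    obs, reward, done, info = step_result
    # Copy the info dict so the caller's dictionary is not mutated.
    info = dict(info)
    # Old Gym time-limit wrappers flagged truncation through this info key.
    truncated = bool(info.pop("TimeLimit.truncated", False))
    # An episode that ended without truncation counts as terminated.
    terminated = bool(done) and not truncated
    return obs, reward, terminated, truncated, info


# An episode that hit its time limit is reported as truncated, not terminated.
print(to_five_tuple((0, 1.0, True, {"TimeLimit.truncated": True})))
```

Keeping `terminated` and `truncated` separate matters for value bootstrapping: a truncated episode did not reach a terminal state, so its final value estimate should not be zeroed.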
......
...@@ -37,7 +37,7 @@ simulation_app = app_launcher.app ...@@ -37,7 +37,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import math import math
import os import os
import torch import torch
......
...@@ -41,7 +41,7 @@ simulation_app = app_launcher.app ...@@ -41,7 +41,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import math import math
import os import os
import traceback import traceback
...@@ -96,13 +96,14 @@ def main(): ...@@ -96,13 +96,14 @@ def main():
clip_actions = agent_cfg["params"]["env"].get("clip_actions", math.inf) clip_actions = agent_cfg["params"]["env"].get("clip_actions", math.inf)
# create isaac environment # create isaac environment
env = gym.make(args_cli.task, cfg=env_cfg) env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
# wrap for video recording # wrap for video recording
if args_cli.video: if args_cli.video:
video_kwargs = { video_kwargs = {
"video_folder": os.path.join(log_dir, "videos"), "video_folder": os.path.join(log_dir, "videos"),
"step_trigger": lambda step: step % args_cli.video_interval == 0, "step_trigger": lambda step: step % args_cli.video_interval == 0,
"video_length": args_cli.video_length, "video_length": args_cli.video_length,
"disable_logger": True,
} }
print("[INFO] Recording videos during training.") print("[INFO] Recording videos during training.")
print_dict(video_kwargs, nesting=4) print_dict(video_kwargs, nesting=4)
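
Note on the hunk above: the `step_trigger` entry is a predicate over the global step count, and a recording starts whenever it returns `True`. The sketch below shows which steps would start a recording for a given interval. The factory name `make_step_trigger` and the interval of 2000 are illustrative assumptions, not values taken from this commit.

```python
def make_step_trigger(video_interval):
    """Return a predicate that is true on every multiple of video_interval."""
    return lambda step: step % video_interval == 0


# Assumed interval for illustration; the scripts read it from args_cli.video_interval.
trigger = make_step_trigger(2000)

# Sample the step counter every 500 steps and keep the steps that start a recording.
recorded = [step for step in range(0, 6001, 500) if trigger(step)]
print(recorded)  # [0, 2000, 4000, 6000]
```

Because step 0 satisfies the predicate, a recording starts at the very first step.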
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
# #
# SPDX-License-Identifier: BSD-3-Clause # SPDX-License-Identifier: BSD-3-Clause
"""Script to collect demonstrations with Isaac Orbit environments.""" """Script to collect demonstrations with Orbit environments."""
from __future__ import annotations from __future__ import annotations
...@@ -15,7 +15,7 @@ import argparse ...@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher from omni.isaac.orbit.app import AppLauncher
# add argparse arguments # add argparse arguments
parser = argparse.ArgumentParser(description="Collect demonstrations for Isaac Orbit environments.") parser = argparse.ArgumentParser(description="Collect demonstrations for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.") parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.") parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
parser.add_argument("--task", type=str, default=None, help="Name of the task.") parser.add_argument("--task", type=str, default=None, help="Name of the task.")
...@@ -35,7 +35,7 @@ simulation_app = app_launcher.app ...@@ -35,7 +35,7 @@ simulation_app = app_launcher.app
import contextlib import contextlib
import gym import gymnasium as gym
import os import os
import torch import torch
import traceback import traceback
......
...@@ -15,7 +15,7 @@ import argparse ...@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher from omni.isaac.orbit.app import AppLauncher
# add argparse arguments # add argparse arguments
parser = argparse.ArgumentParser(description="Play policy trained using robomimic for Isaac Orbit environments.") parser = argparse.ArgumentParser(description="Play policy trained using robomimic for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.") parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--task", type=str, default=None, help="Name of the task.") parser.add_argument("--task", type=str, default=None, help="Name of the task.")
parser.add_argument("--checkpoint", type=str, default=None, help="Pytorch model checkpoint to load.") parser.add_argument("--checkpoint", type=str, default=None, help="Pytorch model checkpoint to load.")
...@@ -31,7 +31,7 @@ simulation_app = app_launcher.app ...@@ -31,7 +31,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import torch import torch
import traceback import traceback
...@@ -46,7 +46,7 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg ...@@ -46,7 +46,7 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
def main(): def main():
"""Run a trained policy from robomimic with Isaac Orbit environment.""" """Run a trained policy from robomimic with Orbit environment."""
# parse configuration # parse configuration
env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=1) env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=1)
# modify configuration # modify configuration
......
...@@ -54,7 +54,7 @@ simulation_app = app_launcher.app ...@@ -54,7 +54,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import argparse import argparse
import gym import gymnasium as gym
import json import json
import numpy as np import numpy as np
import os import os
......
...@@ -36,7 +36,7 @@ simulation_app = app_launcher.app ...@@ -36,7 +36,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import os import os
import torch import torch
import traceback import traceback
......
...@@ -47,7 +47,7 @@ simulation_app = app_launcher.app ...@@ -47,7 +47,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import os import os
import torch import torch
import traceback import traceback
...@@ -88,13 +88,14 @@ def main(): ...@@ -88,13 +88,14 @@ def main():
log_dir = os.path.join(log_root_path, log_dir) log_dir = os.path.join(log_root_path, log_dir)
# create isaac environment # create isaac environment
env = gym.make(args_cli.task, cfg=env_cfg) env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
# wrap for video recording # wrap for video recording
if args_cli.video: if args_cli.video:
video_kwargs = { video_kwargs = {
"video_folder": os.path.join(log_dir, "videos"), "video_folder": os.path.join(log_dir, "videos"),
"step_trigger": lambda step: step % args_cli.video_interval == 0, "step_trigger": lambda step: step % args_cli.video_interval == 0,
"video_length": args_cli.video_length, "video_length": args_cli.video_length,
"disable_logger": True,
} }
print("[INFO] Recording videos during training.") print("[INFO] Recording videos during training.")
print_dict(video_kwargs, nesting=4) print_dict(video_kwargs, nesting=4)
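
Note on the hunk above: Gymnasium fixes the render mode when the environment is constructed, so the scripts now request `"rgb_array"` only when video recording is enabled. A minimal, standard-library-only sketch of that selection follows. The flag definitions and their defaults are assumptions, since only their use through `args_cli` appears in this diff, and `select_render_mode` is a hypothetical helper.

```python
import argparse


def build_parser():
    """Build a parser with the video-related flags (assumed definitions and defaults)."""
    parser = argparse.ArgumentParser(description="Sketch of the video-related flags.")
    parser.add_argument("--video", action="store_true", default=False, help="Record videos during training.")
    parser.add_argument("--video_length", type=int, default=200, help="Length of each recorded video (in steps).")
    parser.add_argument("--video_interval", type=int, default=2000, help="Interval between recordings (in steps).")
    return parser


def select_render_mode(args_cli):
    """Mirror the expression passed to gym.make in the diff."""
    return "rgb_array" if args_cli.video else None


args_cli = build_parser().parse_args(["--video"])
print(select_render_mode(args_cli))
```

Passing `None` when no video is requested avoids asking the environment for frame rendering that nothing will consume.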
......
...@@ -33,7 +33,7 @@ simulation_app = app_launcher.app ...@@ -33,7 +33,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import torch import torch
import traceback import traceback
......
...@@ -43,7 +43,7 @@ simulation_app = app_launcher.app ...@@ -43,7 +43,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import os import os
import traceback import traceback
from datetime import datetime from datetime import datetime
...@@ -95,6 +95,7 @@ def main(): ...@@ -95,6 +95,7 @@ def main():
"video_folder": os.path.join(log_dir, "videos"), "video_folder": os.path.join(log_dir, "videos"),
"step_trigger": lambda step: step % args_cli.video_interval == 0, "step_trigger": lambda step: step % args_cli.video_interval == 0,
"video_length": args_cli.video_length, "video_length": args_cli.video_length,
"disable_logger": True,
} }
print("[INFO] Recording videos during training.") print("[INFO] Recording videos during training.")
print_dict(video_kwargs, nesting=4) print_dict(video_kwargs, nesting=4)
......
...@@ -38,7 +38,7 @@ simulation_app = app_launcher.app ...@@ -38,7 +38,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import torch import torch
import traceback import traceback
......
...@@ -48,7 +48,7 @@ simulation_app = app_launcher.app ...@@ -48,7 +48,7 @@ simulation_app = app_launcher.app
"""Rest everything follows.""" """Rest everything follows."""
import gym import gymnasium as gym
import traceback import traceback
from datetime import datetime from datetime import datetime
...@@ -97,13 +97,14 @@ def main(): ...@@ -97,13 +97,14 @@ def main():
dump_pickle(os.path.join(log_dir, "params", "agent.pkl"), experiment_cfg) dump_pickle(os.path.join(log_dir, "params", "agent.pkl"), experiment_cfg)
# create isaac environment # create isaac environment
env = gym.make(args_cli.task, cfg=env_cfg) env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
# wrap for video recording # wrap for video recording
if args_cli.video: if args_cli.video:
video_kwargs = { video_kwargs = {
"video_folder": os.path.join(log_dir, "videos"), "video_folder": os.path.join(log_dir, "videos"),
"step_trigger": lambda step: step % args_cli.video_interval == 0, "step_trigger": lambda step: step % args_cli.video_interval == 0,
"video_length": args_cli.video_length, "video_length": args_cli.video_length,
"disable_logger": True,
} }
print("[INFO] Recording videos during training.") print("[INFO] Recording videos during training.")
print_dict(video_kwargs, nesting=4) print_dict(video_kwargs, nesting=4)
......