Commit cd2c4f1d authored by Mayank Mittal, committed by GitHub

Upgrades environments from Gym 0.21 to Gymnasium 0.29 (#234)

# Description

Currently, we downgrade many libraries in order to use Gym 0.21.0.
This causes problems when installing new Python packages, as
highlighted in #204, and it becomes a more significant issue with
Python 3.10 in Isaac Sim 2023.1.

This MR upgrades the repository to use the Gymnasium Environment class.
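
For readers unfamiliar with the API change, here is a minimal sketch of the main difference in `step`: Gym 0.21 returns `(obs, reward, done, info)`, while Gymnasium returns `(obs, reward, terminated, truncated, info)`. The helper name `to_gymnasium_step` is hypothetical and is not part of this MR. It assumes that time-limit endings are flagged with the `TimeLimit.truncated` info key, as Gym's `TimeLimit` wrapper does.

```python
from typing import Any, Dict, Tuple


# Hypothetical helper for illustration; it is not part of Orbit or Gymnasium.
def to_gymnasium_step(
    old_step: Tuple[Any, float, bool, Dict[str, Any]]
) -> Tuple[Any, float, bool, bool, Dict[str, Any]]:
    """Convert a Gym 0.21 style step result into the Gymnasium 5-tuple."""
    obs, reward, done, info = old_step
    info = dict(info)  # copy so the caller's dict is not mutated
    # Assumption: time-limit endings are marked with this info key.
    truncated = bool(info.pop("TimeLimit.truncated", False))
    # A "done" that was not caused by the time limit is a true termination.
    terminated = bool(done) and not truncated
    return obs, reward, terminated, truncated, info


# An episode that ended only because the time limit was reached.
timed_out = to_gymnasium_step(([0.0], 1.0, True, {"TimeLimit.truncated": True}))
# An episode that ended in a terminal state (for example, the pole fell).
failed = to_gymnasium_step(([0.0], 0.0, True, {}))
print(timed_out[2:4], failed[2:4])
```

Keeping the two flags separate lets a learning algorithm bootstrap the value estimate on time-outs while treating real terminal states as final.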

## Type of Change

- Bug fix (non-breaking change which fixes an issue)
- Breaking change (fix or feature that would cause existing
functionality to not work as expected)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./orbit.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------
Signed-off-by: Mayank Mittal <12863862+Mayankm96@users.noreply.github.com>
Co-authored-by: David Hoeller <dhoeller@ethz.ch>
parent e5b43e96
......@@ -4,7 +4,7 @@ omni.isaac.orbit_tasks.isaac_env
We use OpenAI Gym registry to register the environment and their default configuration file.
The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
The string is parsed into respective configuration container which needs to be passed to the environment
class. This is done using the function :meth:`load_default_env_cfg` in the sub-module
class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
:mod:`omni.isaac.orbit.utils.parse_cfg`.
......@@ -17,12 +17,12 @@ class. This is done using the function :meth:`load_default_env_cfg` in the sub-m
.. code-block:: python
import gym
import gymnasium as gym
import omni.isaac.orbit_tasks
from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg
from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
task_name = "Isaac-Cartpole-v0"
cfg = load_default_env_cfg(task_name)
cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
env = gym.make(task_name, cfg=cfg)
......
Known issues
============
Installation errors due to gym==0.21.0
--------------------------------------
When installing the gym package, you may encounter the following error:
.. code-block::
error in gym setup command: 'extras_require' must be a dictionary whose values are strings or lists of
strings containing valid project/version requirement specifiers.
----------------------------------------
ERROR: Could not find a version that satisfies the requirement gym==0.21.0 (from omni-isaac-orbit-envs[all])
(from versions: 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.1.0, 0.1.1, 0.1.2, 0.1.3, 0.1.4, 0.1.5, 0.1.6,
...
0.15.7, 0.16.0, 0.17.0, 0.17.1, 0.17.2, 0.17.3, 0.18.0, 0.18.3, 0.19.0, 0.20.0, 0.21.0, 0.22.0, 0.23.0,
0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.25.1, 0.25.2, 0.26.0, 0.26.1, 0.26.2)
ERROR: No matching distribution found for gym==0.21.0
This issue arises since the ``setuptools`` package from version 67.0 onwards does not support malformed version strings.
Since the OpenAI Gym package that is no longer being maintained (`issue link <https://github.com/openai/gym/issues/3200>`_),
the current workaround is to install the ``setuptools`` package version 66.0.0. You can do this by running the following
command:
.. code-block:: bash
./orbit.sh -p -m pip install -U setuptools==66
Regression in Isaac Sim 2022.2.1
--------------------------------
......
......@@ -157,7 +157,7 @@ utilities to manage extensions:
optional arguments:
-h, --help Display the help content.
-i, --install Install the extensions inside Isaac Orbit.
-i, --install Install the extensions inside Orbit.
-e, --extra Install extra dependencies such as the learning frameworks.
-f, --format Run pre-commit to format the code and check lints.
-p, --python Run the python executable (python.sh) provided by Isaac Sim.
......
......@@ -141,7 +141,7 @@ format.
.. code:: bash
# install python module (for robomimic)
./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[robomimic]'
./orbit.sh -e robomimic
# split data
./orbit.sh -p source/standalone/workflows/robomimic/tools/split_train_val.py logs/robomimic/Isaac-Lift-Franka-v0/hdf_dataset.hdf5 --ratio 0.2
......@@ -171,7 +171,7 @@ from the environments into the respective libraries function argument and return
.. code:: bash
# install python module (for stable-baselines3)
./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[sb3]'
./orbit.sh -e sb3
# run script for training
# note: we enable cpu flag since SB3 doesn't optimize for GPU anyway
./orbit.sh -p source/standalone/workflows/sb3/train.py --task Isaac-Cartpole-v0 --headless --cpu
......@@ -184,7 +184,7 @@ from the environments into the respective libraries function argument and return
.. code:: bash
# install python module (for skrl)
./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[skrl]'
./orbit.sh -e skrl
# run script for training
./orbit.sh -p source/standalone/workflows/skrl/train.py --task Isaac-Reach-Franka-v0 --headless
# run script for playing with 32 environments
......@@ -196,7 +196,7 @@ from the environments into the respective libraries function argument and return
.. code:: bash
# install python module (for rl-games)
./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[rl_games]'
./orbit.sh -e rl_games
# run script for training
./orbit.sh -p source/standalone/workflows/rl_games/train.py --task Isaac-Ant-v0 --headless
# run script for playing with 32 environments
......@@ -208,7 +208,7 @@ from the environments into the respective libraries function argument and return
.. code:: bash
# install python module (for rsl-rl)
./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[rsl_rl]'
./orbit.sh -e rsl_rl
# run script for training
./orbit.sh -p source/standalone/workflows/rsl_rl/train.py --task Isaac-Reach-Franka-v0 --headless
# run script for playing with 32 environments
......
......@@ -39,11 +39,12 @@ an environment by calling ``gym.make``. The environments are registered in the `
gym.register(
id="Isaac-Cartpole-v0",
entry_point="omni.isaac.orbit_tasks.classic.cartpole:CartpoleEnv",
kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.classic.cartpole:cartpole_cfg.yaml"},
disable_env_checker=True,
kwargs={"env_cfg_entry_point": "omni.isaac.orbit_tasks.classic.cartpole:cartpole_cfg.yaml"},
)
The ``cfg_entry_point`` argument is used to load the default configuration for the environment. The default
configuration is loaded using the :meth:`omni.isaac.orbit_tasks.utils.parse_cfg.load_default_env_cfg` function.
The ``env_cfg_entry_point`` argument is used to load the default configuration for the environment. The default
configuration is loaded using the :meth:`omni.isaac.orbit_tasks.utils.parse_cfg.load_cfg_from_registry` function.
The configuration entry point can correspond to either a YAML file or a Python configuration
class. The default configuration can be overridden by passing a custom configuration instance to the ``gym.make``
function as shown later in the tutorial.
......
......@@ -26,13 +26,13 @@ For example, here is how you would wrap an environment to enforce that reset is
"""Rest everything follows."""
import gym
import gymnasium as gym
import omni.isaac.orbit_tasks # noqa: F401
from omni.isaac.orbit_tasks.utils import load_default_env_cfg
from omni.isaac.orbit_tasks.utils import load_cfg_from_registry
# create base environment
cfg = load_default_env_cfg("Isaac-Reach-Franka-v0")
cfg = load_cfg_from_registry("Isaac-Reach-Franka-v0", "env_cfg_entry_point")
env = gym.make("Isaac-Reach-Franka-v0", cfg=cfg)
# wrap environment to enforce that reset is called before step
env = gym.wrappers.OrderEnforcing(env)
......@@ -105,7 +105,7 @@ for 200 steps, and saves it in the ``videos`` folder at a step interval of 1500
"""Rest everything follows."""
import gym
import gymnasium as gym
# adjust camera resolution and pose
env_cfg.viewer.resolution = (640, 480)
......
......@@ -185,7 +185,7 @@ print_help () {
echo -e "\nusage: $(basename "$0") [-h] [-i] [-e] [-f] [-p] [-s] [-o] [-v] [-d] [-c] -- Utility to manage extensions in Orbit."
echo -e "\noptional arguments:"
echo -e "\t-h, --help Display the help content."
echo -e "\t-i, --install Install the extensions inside Isaac Orbit."
echo -e "\t-i, --install Install the extensions inside Orbit."
echo -e "\t-e, --extra Install extra dependencies such as the learning frameworks."
echo -e "\t-f, --format Run pre-commit to format the code and check lints."
echo -e "\t-p, --python Run the python executable (python.sh) provided by Isaac Sim."
......@@ -220,9 +220,6 @@ while [[ $# -gt 0 ]]; do
# this does not check dependencies between extensions
export -f extract_python_exe
export -f install_orbit_extension
# downgrade setuptools to avoid issues with OpenAI Gym
# Check the `Known Issues` section in the documentation
$(extract_python_exe) -m pip install --upgrade setuptools==66
# source directory
find -L "${ORBIT_PATH}/source/extensions" -mindepth 1 -maxdepth 1 -type d -exec bash -c 'install_orbit_extension "{}"' \;
# unset local variables
......@@ -235,8 +232,17 @@ while [[ $# -gt 0 ]]; do
# install the python packages for supported reinforcement learning frameworks
echo "[INFO] Installing extra requirements such as learning frameworks..."
python_exe=$(extract_python_exe)
# check if specified which rl-framework to install
if [ -z "$2" ]; then
echo "[INFO] Installing all rl-frameworks..."
framework_name="all"
else
echo "[INFO] Installing rl-framework: $2"
framework_name=$2
shift # past argument
fi
# install the rl-frameworks specified
${python_exe} -m pip install -e ${ORBIT_PATH}/source/extensions/omni.isaac.orbit_tasks[all]
${python_exe} -m pip install -e ${ORBIT_PATH}/source/extensions/omni.isaac.orbit_tasks["${framework_name}"]
shift # past argument
;;
-c|--conda)
......
......@@ -27,7 +27,7 @@ extra_standard_library = [
"tensordict",
"bpy",
"matplotlib",
"gym",
"gymnasium",
"scipy",
"hid",
"yaml",
......
......@@ -18,9 +18,12 @@ itself. However, its various instances should be included in directories within
The environments should then be registered in the `omni/isaac/contrib_tasks/__init__.py`:
```python
import gymnasium as gym
gym.register(
id="Isaac-Contrib-<my-awesome-env>-v0",
entry_point="omni.isaac.contrib_tasks.<your-env-package>:<your-env-class>",
disable_env_checker=True,
kwargs={"cfg_entry_point": "omni.isaac.contrib_tasks.<your-env-package-cfg>:<your-env-class-cfg>"},
)
```
......@@ -9,7 +9,7 @@
We use OpenAI Gym registry to register the environment and their default configuration file.
The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
The string is parsed into respective configuration container which needs to be passed to the environment
class. This is done using the function :meth:`load_default_env_cfg` in the sub-module
class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
:mod:`omni.isaac.orbit.utils.parse_cfg`.
Note:
......@@ -18,18 +18,18 @@ Note:
the kwarg argument :obj:`cfg` while creating the environment.
Usage:
>>> import gym
>>> import gymnasium as gym
>>> import omni.isaac.contrib_tasks
>>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg
>>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
>>>
>>> task_name = "Isaac-Contrib-<my-registered-env-name>-v0"
>>> cfg = load_default_env_cfg(task_name)
>>> cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
>>> env = gym.make(task_name, cfg=cfg)
"""
from __future__ import annotations
import gym # noqa: F401
import gymnasium as gym # noqa: F401
import os
import toml
......
......@@ -28,6 +28,10 @@ setup(
include_package_data=True,
python_requires=">=3.7",
packages=["omni.isaac.contrib_tasks"],
classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"],
classifiers=[
"Natural Language :: English",
"Programming Language :: Python :: 3.10",
"Isaac Sim :: 2023.1.0-hotfix.1",
],
zip_safe=False,
)
[package]
# Note: Semantic Versioning is used: https://semver.org/
version = "0.9.37"
version = "0.9.38"
# Description
title = "ORBIT framework for Robot Learning"
......
Changelog
---------
0.9.38 (2023-11-07)
~~~~~~~~~~~~~~~~~~~
Changed
^^^^^^^
* Upgraded the :class:`omni.isaac.orbit.envs.RLTaskEnv` class to support Gym 0.29.0 environment definition.
Added
^^^^^
* Added computation of ``time_outs`` and ``terminated`` signals inside the termination manager. These follow the
definition mentioned in `Gym 0.29.0 <https://gymnasium.farama.org/tutorials/gymnasium_basics/handling_time_limits/>`_.
* Added proper handling of observation and action spaces in the :class:`omni.isaac.orbit.envs.RLTaskEnv` class.
These now closely follow how Gym VecEnv handles the spaces.
0.9.37 (2023-11-06)
~~~~~~~~~~~~~~~~~~~
......
......@@ -91,6 +91,11 @@ class ObservationManager(ManagerBase):
"""Shape of observation tensor for each term in each group."""
return self._group_obs_term_dim
@property
def group_obs_concatenate(self) -> dict[str, bool]:
"""Whether the observation terms are concatenated in each group."""
return self._group_obs_concatenate
"""
Operations.
"""
......
......@@ -26,8 +26,20 @@ class TerminationManager(ManagerBase):
argument and returns a boolean tensor of shape ``(num_envs,)``. The termination manager
computes the termination signal as the union (logical or) of all the termination terms.
Following the `Gymnasium API <https://gymnasium.farama.org/tutorials/gymnasium_basics/handling_time_limits/>`_,
the termination signal is computed as the logical OR of the following signals:
* **Time-out**: This signal is set to true if the environment has ended after an externally defined condition
(that is outside the scope of a MDP). For example, the environment may be terminated if the episode has
timed out (i.e. reached max episode length).
* **Terminated**: This signal is set to true if the environment has reached a terminal state defined by the
environment. This state may correspond to task success, task failure, robot falling, etc.
These signals can be individually accessed using the :attr:`time_outs` and :attr:`terminated` properties.
The termination terms are parsed from a config class containing the manager's settings and each term's
parameters. Each termination term should instantiate the :class:`TerminationTermCfg` class.
parameters. Each termination term should instantiate the :class:`TerminationTermCfg` class. The term's
configuration :attr:`TerminationTermCfg.time_out` decides whether the term is a timeout or a termination term.
"""
_env: RLTaskEnv
......@@ -46,8 +58,8 @@ class TerminationManager(ManagerBase):
for term_name in self._term_names:
self._episode_dones[term_name] = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
# create buffer for managing termination per environment
self._done_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
self._time_out_buf = torch.zeros_like(self._done_buf)
self._truncated_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
self._terminated_buf = torch.zeros_like(self._truncated_buf)
def __str__(self) -> str:
"""Returns: A string representation for termination manager."""
......@@ -79,12 +91,26 @@ class TerminationManager(ManagerBase):
@property
def dones(self) -> torch.Tensor:
"""The net termination signal. Shape is ``(num_envs,)``."""
return self._done_buf
return self._truncated_buf | self._terminated_buf
@property
def time_outs(self) -> torch.Tensor:
"""The timeout signal. Shape is ``(num_envs,)``."""
return self._time_out_buf
"""The timeout signal (reaching max episode length). Shape is ``(num_envs,)``.
This signal is set to true if the environment has ended after an externally defined condition
(that is outside the scope of a MDP). For example, the environment may be terminated if the episode has
timed out (i.e. reached max episode length).
"""
return self._truncated_buf
@property
def terminated(self) -> torch.Tensor:
"""The terminated signal (reaching a terminal state). Shape is ``(num_envs,)``.
This signal is set to true if the environment has reached a terminal state defined by the environment.
This state may correspond to task success, task failure, robot falling, etc.
"""
return self._terminated_buf
"""
Operations.
......@@ -122,20 +148,20 @@ class TerminationManager(ManagerBase):
The combined termination signal of shape ``(num_envs,)``.
"""
# reset computation
self._done_buf[:] = False
self._time_out_buf[:] = False
self._truncated_buf[:] = False
self._terminated_buf[:] = False
# iterate over all the termination terms
for name, term_cfg in zip(self._term_names, self._term_cfgs):
value = term_cfg.func(self._env, **term_cfg.params)
# update total termination
self._done_buf |= value
# store timeout signal separately
if term_cfg.time_out:
self._time_out_buf |= value
self._truncated_buf |= value
else:
self._terminated_buf |= value
# add to episode dones
self._episode_dones[name] |= value
# return termination signal
return self._done_buf
# return combined termination signal
return self._truncated_buf | self._terminated_buf
"""
Operations - Term settings.
......
......@@ -292,13 +292,13 @@ class SimulationContext(_SimulationContext):
# hide the viewport and disable updates
self._viewport_context.updates_enabled = False # pyright: ignore [reportOptionalMemberAccess]
self._viewport_window.visible = False # pyright: ignore [reportOptionalMemberAccess]
# reset the throttle counter
self._render_throttle_counter = 0
elif mode == self.RenderMode.NO_RENDERING:
# hide the viewport and disable updates
if self._viewport_context is not None:
self._viewport_context.updates_enabled = False # pyright: ignore [reportOptionalMemberAccess]
self._viewport_window.visible = False # pyright: ignore [reportOptionalMemberAccess]
# reset the throttle counter
self._render_throttle_counter = 0
else:
raise ValueError(f"Unsupported render mode: {mode}! Please check `RenderMode` for details.")
# update render mode
......@@ -403,14 +403,21 @@ class SimulationContext(_SimulationContext):
self._render_throttle_counter += 1
if self._render_throttle_counter % self._render_throttle_period == 0:
self._render_throttle_counter = 0
# here we don't render viewport so don't need to flush flatcache
super().render()
# here we don't render viewport so don't need to flush fabric data
# note: we don't call super().render() anymore because they do flush the fabric data
self.set_setting("/app/player/playSimulations", False)
self._app.update()
self.set_setting("/app/player/playSimulations", True)
else:
# manually flush the flatcache data to update Hydra textures
# manually flush the fabric data to update Hydra textures
if self._fabric_iface is not None:
self._fabric_iface.update(0.0, 0.0)
# render the simulation
super().render()
# note: we don't call super().render() anymore because they do above operation inside
# and we don't want to do it twice. We may remove it once we drop support for Isaac Sim 2022.2.
self.set_setting("/app/player/playSimulations", False)
self._app.update()
self.set_setting("/app/player/playSimulations", True)
"""
Operations - Override (extension)
......
......@@ -25,18 +25,17 @@ INSTALL_REQUIRES = [
# devices
"hidapi",
# gym
"gym==0.21.0",
"importlib-metadata~=4.13.0",
"setuptools<=66", # setuptools 67.0 breaks gym
"gymnasium==0.29.0",
# procedural-generation
"trimesh",
"pyglet==1.5.27", # pyglet 2.0 requires python 3.8
"pyglet==1.5.27; python_version < '3.8'", # pyglet 2.0 requires python 3.8
"pyglet; python_version >= '3.8'",
]
# Installation operation
setup(
name="omni-isaac-orbit",
author="NVIDIA, ETH Zurich, and University of Toronto",
author="ORBIT Project Developers",
maintainer="Mayank Mittal",
maintainer_email="mittalma@ethz.ch",
url=EXTENSION_TOML_DATA["package"]["repository"],
......@@ -48,6 +47,10 @@ setup(
python_requires=">=3.7",
install_requires=INSTALL_REQUIRES,
packages=["omni.isaac.orbit"],
classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"],
classifiers=[
"Natural Language :: English",
"Programming Language :: Python :: 3.10",
"Isaac Sim :: 2023.1.0-hotfix.1",
],
zip_safe=False,
)
......@@ -6,6 +6,7 @@
from __future__ import annotations
import torch
import torch.utils.benchmark as benchmark
import unittest
......@@ -124,6 +125,30 @@ class TestTorchOperations(unittest.TestCase):
my_slice = my_tensor[torch.tensor([0, 1]), ...]
self.assertNotEqual(my_slice.untyped_storage().data_ptr(), my_tensor.untyped_storage().data_ptr())
def test_logical_or(self):
"""Test bitwise or operation."""
size = (400, 300, 5)
my_tensor_1 = torch.rand(size, device="cuda:0") > 0.5
my_tensor_2 = torch.rand(size, device="cuda:0") < 0.5
# check the speed of logical or
timer_logical_or = benchmark.Timer(
stmt="torch.logical_or(my_tensor_1, my_tensor_2)",
globals={"my_tensor_1": my_tensor_1, "my_tensor_2": my_tensor_2},
)
timer_bitwise_or = benchmark.Timer(
stmt="my_tensor_1 | my_tensor_2", globals={"my_tensor_1": my_tensor_1, "my_tensor_2": my_tensor_2}
)
print("Time for logical or:", timer_logical_or.timeit(number=1000))
print("Time for bitwise or:", timer_bitwise_or.timeit(number=1000))
# check that logical or works as expected
output_logical_or = torch.logical_or(my_tensor_1, my_tensor_2)
output_bitwise_or = my_tensor_1 | my_tensor_2
self.assertTrue(torch.allclose(output_logical_or, output_bitwise_or))
if __name__ == "__main__":
unittest.main()
[package]
# Note: Semantic Versioning is used: https://semver.org/
version = "0.5.0"
version = "0.5.1"
# Description
title = "ORBIT Environments"
......
Changelog
---------
0.5.1 (2023-11-04)
~~~~~~~~~~~~~~~~~~
Fixed
^^^^^
* Fixed the wrappers to different learning frameworks to use the new :class:`omni.isaac.orbit_tasks.RLTaskEnv` class.
The :class:`RLTaskEnv` class inherits from the :class:`gymnasium.Env` class (Gym 0.29.0).
* Fixed the registration of tasks in the Gym registry based on Gym 0.29.0 API.
Changed
^^^^^^^
* Removed the inheritance of all the RL-framework specific wrappers from the :class:`gymnasium.Wrapper` class.
This is because the wrappers don't comply with the new Gym 0.29.0 API. The wrappers now only inherit
from their respective RL-framework specific base classes.
0.5.0 (2023-10-30)
~~~~~~~~~~~~~~~~~~
......
......@@ -17,28 +17,31 @@ This looks like as follows:
omni/isaac/orbit_tasks/locomotion/
├── __init__.py
└── velocity
├── a1
│ └── flat_terrain_cfg.py
├── anymal_c
│ └── flat_terrain_cfg.py
├── config
│ └── anymal_c
│ ├── agent # <- this is where we store the learning agent configurations
│ ├── __init__.py # <- this is where we register the environment and configurations to gym registry
│ ├── flat_env_cfg.py
│ └── rough_env_cfg.py
├── __init__.py
├── velocity_cfg.py
└── velocity_env.py
└── velocity_env_cfg.py # <- this is the base task configuration
```
The environments are then registered in the `omni/isaac/orbit_tasks/__init__.py`:
The environments are then registered in the `omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_c/__init__.py`:
```python
gym.register(
id="Isaac-Velocity-Anymal-C-v0",
entry_point="omni.isaac.orbit_tasks.locomotion.velocity:LocomotionEnv",
kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.locomotion.velocity.anymal_c.flat_terrain_cfg:FlatTerrainCfg"},
id="Isaac-Velocity-Rough-Anymal-C-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={"env_cfg_entry_point": f"{__name__}.rough_env_cfg:AnymalCRoughEnvCfg"},
)
gym.register(
id="Isaac-Velocity-A1-v0",
entry_point="omni.isaac.orbit_tasks.locomotion.velocity:LocomotionEnv",
kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.locomotion.velocity.a1.flat_terrain_cfg:FlatTerrainCfg"},
id="Isaac-Velocity-Flat-Anymal-C-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={"env_cfg_entry_point": f"{__name__}.flat_env_cfg:AnymalCFlatEnvCfg"},
)
```
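
The `env_cfg_entry_point` values above are strings of the form `module.path:AttributeName`. The following is a minimal sketch of how such a string could be resolved into a Python object. The helper `resolve_entry_point` is hypothetical and is not Orbit's `load_cfg_from_registry`; it covers only the class case, not the YAML-file case, and uses a standard-library class as a stand-in so that it runs without Orbit installed.

```python
import importlib
from typing import Any


# Hypothetical helper for illustration; not the actual Orbit implementation.
def resolve_entry_point(entry_point: str) -> Any:
    """Import and return the object named by a ``module.path:attr`` string."""
    module_name, sep, attr_name = entry_point.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"Expected 'module:attr', got: {entry_point!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in for a configuration class such as ``AnymalCRoughEnvCfg``.
cfg_cls = resolve_entry_point("collections:OrderedDict")
cfg = cfg_cls(num_envs=4)
print(type(cfg).__name__, dict(cfg))
```

Storing the entry point as a string defers the import until the configuration is actually requested, so registering a task does not require loading its configuration module.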
......
......@@ -9,7 +9,7 @@
We use OpenAI Gym registry to register the environment and their default configuration file.
The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
The string is parsed into respective configuration container which needs to be passed to the environment
class. This is done using the function :meth:`load_default_env_cfg` in the sub-module
class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
:mod:`omni.isaac.orbit.utils.parse_cfg`.
Note:
......@@ -18,12 +18,12 @@ Note:
the kwarg argument :obj:`cfg` while creating the environment.
Usage:
>>> import gym
>>> import gymnasium as gym
>>> import omni.isaac.orbit_tasks
>>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg
>>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
>>>
>>> task_name = "Isaac-Cartpole-v0"
>>> cfg = load_default_env_cfg(task_name)
>>> cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
>>> env = gym.make(task_name, cfg=cfg)
"""
......
......@@ -7,7 +7,7 @@
Ant locomotion environment (similar to OpenAI Gym Ant-v2).
"""
import gym
import gymnasium as gym
from . import agents
......
......@@ -5,7 +5,7 @@
from __future__ import annotations
import gym.spaces
import gymnasium as gym
import math
import torch
......
......@@ -7,7 +7,7 @@
Cartpole balancing environment.
"""
import gym
import gymnasium as gym
from . import agents
......
......@@ -5,7 +5,7 @@
from __future__ import annotations
import gym.spaces
import gymnasium as gym
import math
import torch
......
......@@ -7,7 +7,7 @@
Humanoid locomotion environment (similar to OpenAI Gym Humanoid-v2).
"""
import gym
import gymnasium as gym
from . import agents
......
......@@ -5,7 +5,7 @@
from __future__ import annotations
import gym.spaces
import gymnasium as gym
import math
import torch
......
......@@ -3,7 +3,7 @@
#
# SPDX-License-Identifier: BSD-3-Clause
import gym
import gymnasium as gym
from . import agents, flat_env_cfg, rough_env_cfg
......@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
gym.register(
id="Isaac-Velocity-Flat-Anymal-B-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalBFlatEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBFlatPPORunnerCfg,
......@@ -23,6 +24,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Flat-Anymal-B-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalBFlatEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBFlatPPORunnerCfg,
......@@ -32,6 +34,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Rough-Anymal-B-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalBRoughEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBRoughPPORunnerCfg,
......@@ -41,6 +44,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Rough-Anymal-B-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalBRoughEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBRoughPPORunnerCfg,
......
......@@ -3,7 +3,7 @@
#
# SPDX-License-Identifier: BSD-3-Clause
import gym
import gymnasium as gym
from . import agents, flat_env_cfg, rough_env_cfg
......@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
gym.register(
id="Isaac-Velocity-Flat-Anymal-C-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalCFlatEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCFlatPPORunnerCfg,
......@@ -24,6 +25,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Flat-Anymal-C-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalCFlatEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCFlatPPORunnerCfg,
......@@ -33,6 +35,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Rough-Anymal-C-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalCRoughEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCRoughPPORunnerCfg,
......@@ -42,6 +45,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Rough-Anymal-C-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalCRoughEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCRoughPPORunnerCfg,
......
......@@ -3,7 +3,7 @@
#
# SPDX-License-Identifier: BSD-3-Clause
import gym
import gymnasium as gym
from . import agents, flat_env_cfg, rough_env_cfg
......@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
gym.register(
id="Isaac-Velocity-Flat-Anymal-D-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalDFlatEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDFlatPPORunnerCfg,
......@@ -23,6 +24,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Flat-Anymal-D-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": flat_env_cfg.AnymalDFlatEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDFlatPPORunnerCfg,
......@@ -32,6 +34,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Rough-Anymal-D-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalDRoughEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDRoughPPORunnerCfg,
......@@ -41,6 +44,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Rough-Anymal-D-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": rough_env_cfg.AnymalDRoughEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDRoughPPORunnerCfg,
......
......@@ -3,7 +3,7 @@
#
# SPDX-License-Identifier: BSD-3-Clause
import gym
import gymnasium as gym
from . import agents, flat_env_cfg, rough_env_cfg
......@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
gym.register(
id="Isaac-Velocity-Flat-Unitree-A1-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": flat_env_cfg.UnitreeA1FlatEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1FlatPPORunnerCfg,
......@@ -23,6 +24,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Flat-Unitree-A1-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": flat_env_cfg.UnitreeA1FlatEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1FlatPPORunnerCfg,
......@@ -32,6 +34,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Rough-Unitree-A1-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": rough_env_cfg.UnitreeA1RoughEnvCfg,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1RoughPPORunnerCfg,
......@@ -41,6 +44,7 @@ gym.register(
gym.register(
id="Isaac-Velocity-Rough-Unitree-A1-Play-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": rough_env_cfg.UnitreeA1RoughEnvCfg_PLAY,
"rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1RoughPPORunnerCfg,
......
......@@ -65,7 +65,7 @@ class MySceneCfg(InteractiveSceneCfg):
offset=RayCasterCfg.OffsetCfg(pos=(0.0, 0.0, 20.0)),
attach_yaw_only=True,
pattern_cfg=patterns.GridPatternCfg(resolution=0.1, size=[1.6, 1.0]),
debug_vis=True,
debug_vis=False,
mesh_prim_paths=["/World/ground"],
)
contact_forces = ContactSensorCfg(prim_path="{ENV_REGEX_NS}/Robot/.*", history_length=3, track_air_time=True)
......
......@@ -7,7 +7,7 @@
Environment for lifting an object with fixed-base robot.
"""
import gym
import gymnasium as gym
from . import agents
......@@ -18,6 +18,7 @@ from . import agents
gym.register(
id="Isaac-Lift-Franka-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.lift_env_cfg:LiftEnvCfg",
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
......
......@@ -5,7 +5,7 @@
from __future__ import annotations
import gym.spaces
import gymnasium as gym
import math
import torch
......
......@@ -5,7 +5,7 @@
"""Environment for end-effector pose tracking task for fixed-arm robots."""
import gym
import gymnasium as gym
from . import agents
......@@ -16,6 +16,7 @@ from . import agents
gym.register(
id="Isaac-Reach-Franka-v0",
entry_point="omni.isaac.orbit.envs:RLTaskEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.reach_env_cfg:ReachEnvCfg",
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
......
......@@ -5,7 +5,7 @@
from __future__ import annotations
import gym.spaces
import gymnasium as gym
import math
import torch
......
......@@ -7,7 +7,7 @@
from __future__ import annotations
import gym
import gymnasium as gym
import importlib
import inspect
import os
......@@ -52,7 +52,7 @@ def load_cfg_from_registry(task_name: str, entry_point_key: str) -> dict | Any:
ValueError: If the entry point key is not available in the gym registry for the task.
"""
# obtain the configuration entry point
cfg_entry_point = gym.spec(task_name)._kwargs.pop(entry_point_key)
cfg_entry_point = gym.spec(task_name).kwargs.pop(entry_point_key)
# check if entry point exists
if cfg_entry_point is None:
raise ValueError(
......
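The lookup performed by `load_cfg_from_registry` can be sketched without Gymnasium. This is a minimal illustration under stated assumptions, not the project's implementation: a plain dictionary stands in for the registry, only the `module:attribute` string form is handled (the YAML file entry points registered above are not), and `FAKE_REGISTRY` and `resolve_entry_point` are hypothetical names. It uses `dict.get` instead of `pop`. On a plain dict, `pop` without a default raises `KeyError` for a missing key, so a following `is None` check would not be reached, and `pop` also removes the entry from the registered kwargs.

```python
import importlib

# Hypothetical stand-in for the Gymnasium registry: task id -> registered kwargs.
FAKE_REGISTRY = {
    "Demo-Task-v0": {"env_cfg_entry_point": "collections:OrderedDict"},
}


def resolve_entry_point(task_name: str, entry_point_key: str):
    """Return the object named by a registered entry point (illustrative only)."""
    kwargs = FAKE_REGISTRY[task_name]
    # .get() leaves the registered kwargs untouched and returns None if the key is missing
    entry_point = kwargs.get(entry_point_key)
    if entry_point is None:
        raise ValueError(f"No entry point '{entry_point_key}' registered for task '{task_name}'.")
    # classes or callables may be registered directly, as in the Anymal registrations
    if not isinstance(entry_point, str):
        return entry_point
    # strings are resolved as "module:attribute"
    mod_name, attr_name = entry_point.split(":")
    return getattr(importlib.import_module(mod_name), attr_name)
```

Because the kwargs are only read, resolving the same key twice for one task returns the same object both times.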
......@@ -33,7 +33,7 @@ for RL-Games :class:`Runner` class:
from __future__ import annotations
import gym
import gymnasium as gym
import torch
from rl_games.common import env_configurations
......@@ -49,10 +49,10 @@ Vectorized environment wrapper.
"""
class RlGamesVecEnvWrapper(gym.Wrapper):
"""Wraps around Isaac Orbit environment for RL-Games.
class RlGamesVecEnvWrapper(IVecEnv):
"""Wraps around Orbit environment for RL-Games.
This class wraps around the Isaac Orbit environment. Since RL-Games works directly on
This class wraps around the Orbit environment. Since RL-Games works directly on
GPU buffers, the wrapper handles moving of buffers from the simulation environment
to the same device as the learning agent. Additionally, it performs clipping of
observations and actions.
......@@ -69,6 +69,13 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
checks if these attributes exist. If they don't then the wrapper defaults to zero as number
of privileged observations.
.. caution::
This class must be the last wrapper in the wrapper chain. This is because the wrapper does not follow
the :class:`gym.Wrapper` interface. Any subsequent wrappers will need to be modified to work with this
wrapper.
Reference:
https://github.com/Denys88/rl_games/blob/master/rl_games/common/ivecenv.py
https://github.com/NVIDIA-Omniverse/IsaacGymEnvs
......@@ -85,30 +92,77 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
Raises:
ValueError: The environment is not inherited from :class:`RLTaskEnv`.
ValueError: If specified, the privileged observations (critic) are not of type :obj:`gym.spaces.Box`.
"""
# check that input is valid
if not isinstance(env.unwrapped, RLTaskEnv):
raise ValueError(f"The environment must be inherited from RLTaskEnv. Environment type: {type(env)}")
# initialize gym wrapper
gym.Wrapper.__init__(self, env)
# initialize rl-games vec-env
IVecEnv.__init__(self)
# initialize the wrapper
self.env = env
# store provided arguments
self._rl_device = rl_device
self._clip_obs = clip_obs
self._clip_actions = clip_actions
self._sim_device = env.unwrapped.device
# information about spaces for the wrapper
self.observation_space = self.env.observation_space
self.action_space = self.env.action_space
# note: rl-games only wants single observation and action spaces
self.rlg_observation_space = self.unwrapped.single_observation_space["policy"]
self.rlg_action_space = self.unwrapped.single_action_space
# information for privileged observations
self.state_space = getattr(self.env, "state_space", None)
self.num_states = getattr(self.env, "num_states", 0)
# print information about wrapper
print("[INFO]: RL-Games Environment Wrapper:")
print(f"\t\t Observations clipping: {clip_obs}")
print(f"\t\t Actions clipping : {clip_actions}")
print(f"\t\t Agent device : {rl_device}")
print(f"\t\t Asymmetric-learning : {self.num_states != 0}")
self.rlg_state_space = self.unwrapped.single_observation_space.get("critic")
if self.rlg_state_space is not None:
if not isinstance(self.rlg_state_space, gym.spaces.Box):
raise ValueError(f"Privileged observations must be of type Box. Type: {type(self.rlg_state_space)}")
self.rlg_num_states = self.rlg_state_space.shape[0]
else:
self.rlg_num_states = 0
def __str__(self):
"""Returns the wrapper name and the :attr:`env` representation string."""
return (
f"<{type(self).__name__}{self.env}>"
f"\n\tObservations clipping: {self._clip_obs}"
f"\n\tActions clipping : {self._clip_actions}"
f"\n\tAgent device : {self._rl_device}"
f"\n\tAsymmetric-learning : {self.rlg_num_states != 0}"
)
def __repr__(self):
"""Returns the string representation of the wrapper."""
return str(self)
"""
Properties -- Gym.Wrapper
"""
@property
def render_mode(self) -> str | None:
"""Returns the :attr:`Env` :attr:`render_mode`."""
return self.env.render_mode
@property
def observation_space(self) -> gym.Space:
"""Returns the :attr:`Env` :attr:`observation_space`."""
return self.env.observation_space
@property
def action_space(self) -> gym.Space:
"""Returns the :attr:`Env` :attr:`action_space`."""
return self.env.action_space
@classmethod
def class_name(cls) -> str:
"""Returns the class name of the wrapper."""
return cls.__name__
@property
def unwrapped(self) -> RLTaskEnv:
"""Returns the base environment of the wrapper.
This will be the bare :class:`gymnasium.Env` environment, underneath all layers of wrappers.
"""
return self.env.unwrapped
"""
Properties
......@@ -120,40 +174,46 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
def get_env_info(self) -> dict:
"""Returns the Gym spaces for the environment."""
# fill the env info dict
env_info = {"observation_space": self.observation_space, "action_space": self.action_space}
# add information about privileged observations space
if self.num_states > 0:
env_info["state_space"] = self.state_space
return env_info
return {
"observation_space": self.rlg_observation_space,
"action_space": self.rlg_action_space,
"state_space": self.rlg_state_space,
}
"""
Operations - MDP
"""
def seed(self, seed: int = -1) -> int: # noqa: D102
return self.unwrapped.seed(seed)
def reset(self): # noqa: D102
obs_dict = self.env.reset()
obs_dict, _ = self.env.reset()
# process observations and states
return self._process_obs(obs_dict)
def step(self, actions): # noqa: D102
# move actions to sim-device
actions = actions.detach().clone().to(device=self._sim_device)
# clip the actions
actions = torch.clamp(actions.clone(), -self._clip_actions, self._clip_actions)
actions = torch.clamp(actions, -self._clip_actions, self._clip_actions)
# perform environment step
obs_dict, rew, dones, extras = self.env.step(actions)
obs_dict, rew, terminated, truncated, extras = self.env.step(actions)
# process observations and states
obs_and_states = self._process_obs(obs_dict)
# move buffers to rl-device
# note: we perform clone to prevent issues when rl-device and sim-device are the same.
rew = rew.to(self._rl_device)
dones = dones.to(self._rl_device)
rew = rew.to(device=self._rl_device)
dones = (terminated | truncated).to(device=self._rl_device)
extras = {
k: v.to(device=self._rl_device, non_blocking=True) if hasattr(v, "to") else v for k, v in extras.items()
}
return obs_and_states, rew, dones, extras
def close(self): # noqa: D102
return self.env.close()
"""
Helper functions
"""
......@@ -163,34 +223,29 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
Note:
States typically refer to privileged observations for the critic function. They are typically used in
asymmetric actor-critic algorithms [1].
asymmetric actor-critic algorithms.
Args:
obs: The current observations from environment.
obs_dict: The current observations from environment.
Returns:
If environment provides states, then a dictionary
containing the observations and states is returned. Otherwise just the observations tensor
is returned.
Reference:
1. Pinto, Lerrel, et al. "Asymmetric actor critic for image-based robot learning."
arXiv preprint arXiv:1710.06542 (2017).
If environment provides states, then a dictionary containing the observations and states is returned.
Otherwise just the observations tensor is returned.
"""
# process policy obs
obs = obs_dict["policy"]
# clip the observations
obs = torch.clamp(obs, -self._clip_obs, self._clip_obs)
# move the buffer to rl-device
obs = obs.to(self._rl_device).clone()
obs = obs.to(device=self._rl_device).clone()
# check if asymmetric actor-critic or not
if self.num_states > 0:
if self.rlg_num_states > 0:
# acquire states from the environment if it exists
try:
states = obs_dict["critic"]
except KeyError:
raise NotImplementedError("Environment does not define key `critic` for privileged observations.")
raise NotImplementedError("Environment does not define key 'critic' for privileged observations.")
# clip the states
states = torch.clamp(states, -self._clip_obs, self._clip_obs)
# move buffers to rl-device
......
......@@ -17,22 +17,28 @@ The following example shows how to wrap an environment for RSL-RL:
from __future__ import annotations
import gym
import gym.spaces
import gymnasium as gym
import torch
from rsl_rl.env import VecEnv
from omni.isaac.orbit.envs import RLTaskEnv
class RslRlVecEnvWrapper(gym.Wrapper):
"""Wraps around Isaac Orbit environment for RSL-RL library
class RslRlVecEnvWrapper(VecEnv):
"""Wraps around Orbit environment for RSL-RL library
To use asymmetric actor-critic, the environment instance must have the attribute :attr:`num_privileged_obs` (int).
This is used by the learning agent to allocate buffers in the trajectory memory. Additionally, the returned
observations should have the key "critic" which corresponds to the privileged observations. Since this is
optional for some environments, the wrapper checks if these attributes exist. If they don't then the wrapper
defaults to zero as number of privileged observations.
.. caution::
To use asymmetric actor-critic, the environment instance must have the attributes :attr:`num_states` (int)
and :attr:`state_space` (:obj:`gym.spaces.Box`). These are used by the learning agent to allocate buffers in
the trajectory memory. Additionally, the method :meth:`_get_observations()` should have the key "critic"
which corresponds to the privileged observations. Since this is optional for some environments, the wrapper
checks if these attributes exist. If they don't then the wrapper defaults to zero as number of privileged
observations.
This class must be the last wrapper in the wrapper chain. This is because the wrapper does not follow
the :class:`gym.Wrapper` interface. Any subsequent wrappers will need to be modified to work with this
wrapper.
Reference:
https://github.com/leggedrobotics/rsl_rl/blob/master/rsl_rl/env/vec_env.py
......@@ -41,6 +47,9 @@ class RslRlVecEnvWrapper(gym.Wrapper):
def __init__(self, env: RLTaskEnv):
"""Initializes the wrapper.
Note:
The wrapper calls :meth:`reset` at the start since the RSL-RL runner does not call reset.
Args:
env: The environment to wrap around.
......@@ -51,28 +60,74 @@ class RslRlVecEnvWrapper(gym.Wrapper):
if not isinstance(env.unwrapped, RLTaskEnv):
raise ValueError(f"The environment must be inherited from RLTaskEnv. Environment type: {type(env)}")
# initialize the wrapper
gym.Wrapper.__init__(self, env)
self.env = env
# store information required by wrapper
orbit_env: RLTaskEnv = self.env.unwrapped
self.num_envs = orbit_env.num_envs
self.num_actions = orbit_env.action_manager.total_action_dim
self.num_obs = orbit_env.observation_manager.group_obs_dim["policy"][0]
self.num_envs = self.unwrapped.num_envs
self.device = self.unwrapped.device
self.max_episode_length = self.unwrapped.max_episode_length
self.num_actions = self.unwrapped.action_manager.total_action_dim
self.num_obs = self.unwrapped.observation_manager.group_obs_dim["policy"][0]
# -- privileged observations
if "critic" in self.unwrapped.observation_manager.group_obs_dim:
self.num_privileged_obs = self.unwrapped.observation_manager.group_obs_dim["critic"][0]
else:
self.num_privileged_obs = 0
# reset at the start since the RSL-RL runner does not call reset
self.env.reset()
def __str__(self):
"""Returns the wrapper name and the :attr:`env` representation string."""
return f"<{type(self).__name__}{self.env}>"
def __repr__(self):
"""Returns the string representation of the wrapper."""
return str(self)
"""
Properties -- Gym.Wrapper
"""
@property
def render_mode(self) -> str | None:
"""Returns the :attr:`Env` :attr:`render_mode`."""
return self.env.render_mode
@property
def observation_space(self) -> gym.Space:
"""Returns the :attr:`Env` :attr:`observation_space`."""
return self.env.observation_space
@property
def action_space(self) -> gym.Space:
"""Returns the :attr:`Env` :attr:`action_space`."""
return self.env.action_space
@classmethod
def class_name(cls) -> str:
"""Returns the class name of the wrapper."""
return cls.__name__
@property
def unwrapped(self) -> RLTaskEnv:
"""Returns the base environment of the wrapper.
This will be the bare :class:`gymnasium.Env` environment, underneath all layers of wrappers.
"""
return self.env.unwrapped
"""
Properties
"""
def get_observations(self) -> torch.Tensor:
def get_observations(self) -> tuple[torch.Tensor, dict]:
"""Returns the current observations of the environment."""
obs_dict = self.env.unwrapped.observation_manager.compute()
obs_dict = self.unwrapped.observation_manager.compute()
return obs_dict["policy"], {"observations": obs_dict}
@property
def episode_length_buf(self) -> torch.Tensor:
"""The episode length buffer."""
return self.env.unwrapped.episode_length_buf
return self.unwrapped.episode_length_buf
@episode_length_buf.setter
def episode_length_buf(self, value: torch.Tensor):
......@@ -80,22 +135,34 @@ class RslRlVecEnvWrapper(gym.Wrapper):
Note: This is needed to perform random initialization of episode lengths in RSL-RL.
"""
self.env.unwrapped.episode_length_buf = value
self.unwrapped.episode_length_buf = value
"""
Operations - MDP
"""
def reset(self) -> tuple[torch.Tensor, dict]:
def seed(self, seed: int = -1) -> int: # noqa: D102
return self.unwrapped.seed(seed)
def reset(self) -> tuple[torch.Tensor, dict]: # noqa: D102
# reset the environment
obs_dict = self.env.reset()
obs_dict, _ = self.env.reset()
# return observations
return obs_dict["policy"], {"observations": obs_dict}
def step(self, actions: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, dict]:
# record step information
obs_dict, rew, dones, extras = self.env.step(actions)
# return step information
obs_dict, rew, terminated, truncated, extras = self.env.step(actions)
# compute dones for compatibility with RSL-RL
dones = (terminated | truncated).to(dtype=torch.long)
# move extra observations to the extras dict
obs = obs_dict["policy"]
extras["observations"] = obs_dict
# move time out information to the extras dict
extras["time_outs"] = truncated
# return the step information
return obs, rew, dones, extras
def close(self): # noqa: D102
return self.env.close()
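Both wrappers above collapse the Gymnasium `terminated` and `truncated` signals into the single `dones` flag that the learning libraries expect, and the RSL-RL wrapper also keeps the truncation flag under `time_outs`. The following is a dependency-free sketch of that adaptation, assuming a scalar toy environment; `FiveTupleEnv` and `legacy_step` are hypothetical names used only for illustration, and the real wrappers operate on batched torch tensors instead.

```python
from __future__ import annotations

from typing import Any


class FiveTupleEnv:
    """Hypothetical toy environment that follows the Gymnasium step signature."""

    def __init__(self, max_steps: int = 3):
        self.max_steps = max_steps
        self.t = 0

    def reset(self) -> tuple[list[float], dict[str, Any]]:
        # Gymnasium-style reset returns (observation, info)
        self.t = 0
        return [0.0], {}

    def step(self, action: float):
        self.t += 1
        obs = [float(self.t)]
        reward = 1.0
        terminated = action < 0.0  # task-defined termination
        truncated = self.t >= self.max_steps  # time limit reached
        return obs, reward, terminated, truncated, {}


def legacy_step(env: FiveTupleEnv, action: float):
    """Collapse (terminated, truncated) into one done flag and keep the time-out."""
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    # keep truncation separately so a learner can tell time limits from failures
    info["time_outs"] = truncated
    return obs, reward, done, info
```

Keeping `time_outs` alongside `done` lets a learner treat an episode that hit its time limit differently from one that ended in a terminal state.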
......@@ -93,9 +93,9 @@ Vectorized environment wrapper.
def SkrlVecEnvWrapper(env: RLTaskEnv):
"""Wraps around Isaac Orbit environment for skrl.
"""Wraps around Orbit environment for skrl.
This function wraps around the Isaac Orbit environment. Since the :class:`RLTaskEnv` environment
This function wraps around the Orbit environment. Since the :class:`RLTaskEnv` environment
wrapping functionality is defined within the skrl library itself, this implementation
is maintained for compatibility with the structure of the extension that contains it.
Internally it calls the :func:`wrap_env` from the skrl library API.
......
......@@ -22,18 +22,21 @@ INSTALL_REQUIRES = [
"numpy",
"torch",
"torchvision>=0.14.1", # ensure compatibility with torch 1.13.1
"protobuf==3.20.2",
"protobuf>=3.20.2",
# data collection
"h5py",
# basic logger
"tensorboard",
# video recording
"moviepy",
]
# Extra dependencies for RL agents
EXTRAS_REQUIRE = {
"sb3": ["stable-baselines3>=1.5,<=1.8", "tensorboard"],
"sb3": ["stable-baselines3>=2.0"],
"skrl": ["skrl>=0.10.0"],
"rl_games": ["rl-games==1.5.2"],
# TODO: Uncomment when rsl_rl is updated to public.
# "rsl_rl": ["rsl_rl@git+https://github.com/leggedrobotics/rsl_rl.git"],
"rl_games": ["rl-games==1.6.1"],
"rsl_rl": ["rsl_rl@git+https://github.com/leggedrobotics/rsl_rl.git"],
"robomimic": ["robomimic@git+https://github.com/ARISE-Initiative/robomimic.git"],
}
# cumulation of all extra-requires
......@@ -43,7 +46,7 @@ EXTRAS_REQUIRE["all"] = list(itertools.chain.from_iterable(EXTRAS_REQUIRE.values
# Installation operation
setup(
name="omni-isaac-orbit_tasks",
author="NVIDIA, ETH Zurich, and University of Toronto",
author="ORBIT Project Developers",
maintainer="Mayank Mittal",
maintainer_email="mittalma@ethz.ch",
url=EXTENSION_TOML_DATA["package"]["repository"],
......@@ -55,6 +58,10 @@ setup(
install_requires=INSTALL_REQUIRES,
extras_require=EXTRAS_REQUIRE,
packages=["omni.isaac.orbit_tasks"],
classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"],
classifiers=[
"Natural Language :: English",
"Programming Language :: Python :: 3.10",
"Isaac Sim :: 2023.1.0-hotfix.1",
],
zip_safe=False,
)
......@@ -20,8 +20,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gym.envs
import gymnasium as gym
import torch
import traceback
import unittest
......@@ -42,7 +41,7 @@ class TestEnvironments(unittest.TestCase):
def setUpClass(cls):
# acquire all Isaac environments names
cls.registered_tasks = list()
for task_spec in gym.envs.registry.all():
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id)
# sort environments by name
......@@ -70,19 +69,20 @@ class TestEnvironments(unittest.TestCase):
env: RLTaskEnv = gym.make(task_name, cfg=env_cfg)
# reset environment
obs = env.reset()
obs, _ = env.reset()
# check signal
self.assertTrue(self._check_valid_tensor(obs))
# simulate environment for 1000 steps
for _ in range(1000):
# sample actions from -1 to 1
actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1
# apply actions
transition = env.step(actions)
# check signals
for data in transition:
self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
with torch.inference_mode():
for _ in range(1000):
# sample actions from -1 to 1
actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
# apply actions
transition = env.step(actions)
# check signals
for data in transition:
self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
# close the environment
print(f">>> Closing environment: {task_name}")
......@@ -108,9 +108,9 @@ class TestEnvironments(unittest.TestCase):
valid_tensor = True
for value in data.values():
if isinstance(value, dict):
return TestEnvironments._check_valid_tensor(value)
valid_tensor &= TestEnvironments._check_valid_tensor(value)
elif isinstance(value, torch.Tensor):
valid_tensor = valid_tensor and not torch.any(torch.isnan(value))
valid_tensor &= not torch.any(torch.isnan(value))
return valid_tensor
else:
raise ValueError(f"Input data of invalid type: {type(data)}.")
......
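The change to `_check_valid_tensor` replaces an early `return` inside the loop with an accumulated result. The sketch below reproduces the difference with plain floats and `math.isnan` instead of torch tensors, which is an assumption made to keep it dependency-free; `check_valid_buggy` and `check_valid` are illustrative names.

```python
import math


def check_valid_buggy(data) -> bool:
    """Early-return variant: stops at the first nested dict it encounters."""
    if isinstance(data, float):
        return not math.isnan(data)
    valid = True
    for value in data.values():
        if isinstance(value, dict):
            # returns immediately, so the remaining entries are never inspected
            return check_valid_buggy(value)
        elif isinstance(value, float):
            valid = valid and not math.isnan(value)
    return valid


def check_valid(data) -> bool:
    """Accumulating variant: every entry contributes to the result."""
    if isinstance(data, float):
        return not math.isnan(data)
    valid = True
    for value in data.values():
        if isinstance(value, dict):
            valid &= check_valid(value)
        elif isinstance(value, float):
            valid &= not math.isnan(value)
    return valid


# the NaN entry comes after a nested dict, so the early return hides it
sample = {"nested": {"a": 1.0}, "bad": float("nan")}
print(check_valid_buggy(sample))  # True
print(check_valid(sample))  # False
```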
......@@ -19,7 +19,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import os
import torch
import traceback
......@@ -42,7 +42,7 @@ class TestRecordVideoWrapper(unittest.TestCase):
def setUpClass(cls):
# acquire all Isaac environments names
cls.registered_tasks = list()
for task_spec in gym.envs.registry.all():
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id)
# sort environments by name
......@@ -73,25 +73,24 @@ class TestRecordVideoWrapper(unittest.TestCase):
env_cfg.sim.shutdown_app_on_stop = False
# create environment
env: RLTaskEnv = gym.make(task_name, cfg=env_cfg)
env: RLTaskEnv = gym.make(task_name, cfg=env_cfg, render_mode="rgb_array")
# directory to save videos
videos_dir = os.path.join(self.videos_dir, task_name)
# wrap environment to record videos
env = gym.wrappers.RecordVideo(
env, videos_dir, step_trigger=self.step_trigger, video_length=self.video_length
env, videos_dir, step_trigger=self.step_trigger, video_length=self.video_length, disable_logger=True
)
# reset environment
env.reset()
# simulate environment
for _ in range(500):
# compute zero actions
actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1
# apply actions
_ = env.step(actions)
# render environment
env.render(mode="human")
with torch.inference_mode():
for _ in range(500):
# sample actions from -1 to 1
actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
# apply actions
_ = env.step(actions)
# close the simulator
env.close()
......
# Copyright (c) 2022-2023, The ORBIT Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
"""Launch Isaac Sim Simulator first."""
import os
from omni.isaac.orbit.app import AppLauncher
# launch the simulator
app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
app_launcher = AppLauncher(headless=True, experience=app_experience)
simulation_app = app_launcher.app
"""Rest everything follows."""
import gymnasium as gym
import torch
import traceback
import unittest
import carb
import omni.usd
from omni.isaac.orbit.envs import RLTaskEnvCfg
import omni.isaac.orbit_tasks # noqa: F401
from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
from omni.isaac.orbit_tasks.utils.wrappers.rl_games import RlGamesVecEnvWrapper
class TestRlGamesVecEnvWrapper(unittest.TestCase):
"""Test that RL-Games VecEnv wrapper works as expected."""
@classmethod
def setUpClass(cls):
# acquire all Isaac environments names
cls.registered_tasks = list()
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id)
# sort environments by name
cls.registered_tasks.sort()
# only pick the first three environments to test
cls.registered_tasks = cls.registered_tasks[:3]
# print all existing task names
print(">>> All registered environments:", cls.registered_tasks)
def setUp(self) -> None:
# common parameters
self.num_envs = 512
self.use_gpu = True
def test_random_actions(self):
"""Run random actions and check environments return valid signals."""
for task_name in self.registered_tasks:
print(f">>> Running test for environment: {task_name}")
# create a new stage
omni.usd.get_context().new_stage()
# parse configuration
env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
# note: we don't want to shutdown the app on stop during the tests since we reload the stage
env_cfg.sim.shutdown_app_on_stop = False
# create environment
env = gym.make(task_name, cfg=env_cfg)
# wrap environment
env = RlGamesVecEnvWrapper(env, "cuda:0", 100, 100)
# reset environment
obs = env.reset()
# check signal
self.assertTrue(self._check_valid_tensor(obs))
# simulate environment for 100 steps
with torch.inference_mode():
for _ in range(100):
# sample actions from -1 to 1
actions = 2 * torch.rand(env.action_space.shape, device=env.device) - 1
# apply actions
transition = env.step(actions)
# check signals
for data in transition:
self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
# close the environment
print(f">>> Closing environment: {task_name}")
env.close()
"""
Helper functions.
"""
@staticmethod
def _check_valid_tensor(data: torch.Tensor | dict) -> bool:
"""Checks if given data does not have corrupted values.
Args:
data: Data buffer.
Returns:
True if the data is valid.
"""
if isinstance(data, torch.Tensor):
return not torch.any(torch.isnan(data))
elif isinstance(data, dict):
valid_tensor = True
for value in data.values():
if isinstance(value, dict):
valid_tensor &= TestRlGamesVecEnvWrapper._check_valid_tensor(value)
elif isinstance(value, torch.Tensor):
valid_tensor &= not torch.any(torch.isnan(value))
return valid_tensor
else:
raise ValueError(f"Input data of invalid type: {type(data)}.")
if __name__ == "__main__":
try:
unittest.main()
except Exception as err:
carb.log_error(err)
carb.log_error(traceback.format_exc())
raise
finally:
# close sim app
simulation_app.close()
# Copyright (c) 2022-2023, The ORBIT Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
"""Launch Isaac Sim Simulator first."""
import os
from omni.isaac.orbit.app import AppLauncher
# launch the simulator
app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
app_launcher = AppLauncher(headless=True, experience=app_experience)
simulation_app = app_launcher.app
"""Rest everything follows."""
import gymnasium as gym
import torch
import traceback
import unittest
import carb
import omni.usd
from omni.isaac.orbit.envs import RLTaskEnvCfg
import omni.isaac.orbit_tasks # noqa: F401
from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
from omni.isaac.orbit_tasks.utils.wrappers.rsl_rl import RslRlVecEnvWrapper
class TestRslRlVecEnvWrapper(unittest.TestCase):
"""Test that RSL-RL VecEnv wrapper works as expected."""
@classmethod
def setUpClass(cls):
# acquire all Isaac environments names
cls.registered_tasks = list()
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id)
# sort environments by name
cls.registered_tasks.sort()
# only pick the first three environments to test
cls.registered_tasks = cls.registered_tasks[:3]
# print all existing task names
print(">>> All registered environments:", cls.registered_tasks)
def setUp(self) -> None:
# common parameters
self.num_envs = 512
self.use_gpu = True
def test_random_actions(self):
"""Run random actions and check environments return valid signals."""
for task_name in self.registered_tasks:
print(f">>> Running test for environment: {task_name}")
# create a new stage
omni.usd.get_context().new_stage()
# parse configuration
env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
# note: we don't want to shutdown the app on stop during the tests since we reload the stage
env_cfg.sim.shutdown_app_on_stop = False
# create environment
env = gym.make(task_name, cfg=env_cfg)
# wrap environment
env = RslRlVecEnvWrapper(env)
# reset environment
obs, extras = env.reset()
# check signal
self.assertTrue(self._check_valid_tensor(obs))
self.assertTrue(self._check_valid_tensor(extras))
# simulate environment for 1000 steps
with torch.inference_mode():
for _ in range(1000):
# sample actions from -1 to 1
actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
# apply actions
transition = env.step(actions)
# check signals
for data in transition:
self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
# close the environment
print(f">>> Closing environment: {task_name}")
env.close()
"""
Helper functions.
"""
@staticmethod
def _check_valid_tensor(data: torch.Tensor | dict) -> bool:
"""Checks if given data does not have corrupted values.
Args:
data: Data buffer.
Returns:
True if the data is valid.
"""
if isinstance(data, torch.Tensor):
return not torch.any(torch.isnan(data))
elif isinstance(data, dict):
valid_tensor = True
for value in data.values():
if isinstance(value, dict):
valid_tensor &= TestRslRlVecEnvWrapper._check_valid_tensor(value)
elif isinstance(value, torch.Tensor):
valid_tensor &= not torch.any(torch.isnan(value))
return valid_tensor
else:
raise ValueError(f"Input data of invalid type: {type(data)}.")
if __name__ == "__main__":
try:
unittest.main()
except Exception as err:
carb.log_error(err)
carb.log_error(traceback.format_exc())
raise
finally:
# close sim app
simulation_app.close()
# Copyright (c) 2022-2023, The ORBIT Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
"""Launch Isaac Sim Simulator first."""
import os
from omni.isaac.orbit.app import AppLauncher
# launch the simulator
app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
app_launcher = AppLauncher(headless=True, experience=app_experience)
simulation_app = app_launcher.app
"""Rest everything follows."""
import gymnasium as gym
import numpy as np
import torch
import traceback
import unittest
import carb
import omni.usd
from omni.isaac.orbit.envs import RLTaskEnvCfg
import omni.isaac.orbit_tasks # noqa: F401
from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
from omni.isaac.orbit_tasks.utils.wrappers.sb3 import Sb3VecEnvWrapper
class TestStableBaselines3VecEnvWrapper(unittest.TestCase):
"""Test that Stable-Baselines3 VecEnv wrapper works as expected."""
@classmethod
def setUpClass(cls):
# acquire all Isaac environments names
cls.registered_tasks = list()
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
cls.registered_tasks.append(task_spec.id)
# sort environments by name
cls.registered_tasks.sort()
# only pick the first three environments to test
cls.registered_tasks = cls.registered_tasks[:3]
# print all existing task names
print(">>> All registered environments:", cls.registered_tasks)
def setUp(self) -> None:
# common parameters
self.num_envs = 512
self.use_gpu = True
def test_random_actions(self):
"""Run random actions and check environments return valid signals."""
for task_name in self.registered_tasks:
print(f">>> Running test for environment: {task_name}")
# create a new stage
omni.usd.get_context().new_stage()
# parse configuration
env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
# note: we don't want to shutdown the app on stop during the tests since we reload the stage
env_cfg.sim.shutdown_app_on_stop = False
# create environment
env = gym.make(task_name, cfg=env_cfg)
# wrap environment
env = Sb3VecEnvWrapper(env)
# reset environment
obs = env.reset()
# check signal
self.assertTrue(self._check_valid_array(obs))
# simulate environment for 1000 steps
with torch.inference_mode():
for _ in range(1000):
# sample actions from -1 to 1
actions = 2 * np.random.rand(env.num_envs, *env.action_space.shape) - 1
# apply actions
transition = env.step(actions)
# check signals
for data in transition:
self.assertTrue(self._check_valid_array(data), msg=f"Invalid data: {data}")
# close the environment
print(f">>> Closing environment: {task_name}")
env.close()
"""
Helper functions.
"""
@staticmethod
def _check_valid_array(data: np.ndarray | dict | list) -> bool:
"""Checks if given data does not have corrupted values.
Args:
data: Data buffer.
Returns:
True if the data is valid.
"""
if isinstance(data, np.ndarray):
return not np.any(np.isnan(data))
elif isinstance(data, dict):
valid_array = True
for value in data.values():
if isinstance(value, dict):
valid_array &= TestStableBaselines3VecEnvWrapper._check_valid_array(value)
elif isinstance(value, np.ndarray):
valid_array &= not np.any(np.isnan(value))
return valid_array
elif isinstance(data, list):
valid_array = True
for value in data:
valid_array &= TestStableBaselines3VecEnvWrapper._check_valid_array(value)
return valid_array
else:
raise ValueError(f"Input data of invalid type: {type(data)}.")
if __name__ == "__main__":
try:
unittest.main()
except Exception as err:
carb.log_error(err)
carb.log_error(traceback.format_exc())
raise
finally:
# close sim app
simulation_app.close()
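
The test above samples uniform actions and then walks every returned signal looking for NaN values. A minimal standalone sketch of those two steps, using only NumPy, might look like the following. The function name `check_valid_array` and the sizes `num_envs` and `action_shape` are illustrative assumptions, not part of the repository, and this variant is slightly stricter than the helper in the test because it raises on any unsupported value type, including values nested inside a dict.

```python
import numpy as np


def check_valid_array(data) -> bool:
    """Return True if no NaN appears in an array or in nested dicts/lists/tuples of arrays."""
    if isinstance(data, np.ndarray):
        return not bool(np.any(np.isnan(data)))
    if isinstance(data, dict):
        return all(check_valid_array(value) for value in data.values())
    if isinstance(data, (list, tuple)):
        return all(check_valid_array(value) for value in data)
    raise ValueError(f"Input data of invalid type: {type(data)}.")


# Hypothetical sizes, chosen only for illustration.
num_envs, action_shape = 4, (3,)
# np.random.rand expects integer dimensions, so the shape tuple is unpacked with `*`.
actions = 2 * np.random.rand(num_envs, *action_shape) - 1

print(actions.shape)
print(check_valid_array(actions))
print(check_valid_array({"policy": np.array([0.0, np.nan])}))
```

Because `np.random.rand` draws from the half-open interval [0, 1), the scaled actions fall in [-1, 1).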
......@@ -27,7 +27,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
from prettytable import PrettyTable
import omni.isaac.contrib_tasks # noqa: F401
......@@ -47,10 +47,10 @@ def main():
# count of environments
index = 0
# acquire all Isaac environments names
for task_spec in gym.envs.registry.all():
for task_spec in gym.registry.values():
if "Isaac" in task_spec.id:
# add details to table
table.add_row([index + 1, task_spec.id, task_spec.entry_point, task_spec._kwargs["env_cfg_entry_point"]])
table.add_row([index + 1, task_spec.id, task_spec.entry_point, task_spec.kwargs["env_cfg_entry_point"]])
# increment count
index += 1
......@@ -61,6 +61,8 @@ if __name__ == "__main__":
try:
# run the main function
main()
except Exception as e:
raise e
finally:
# close the app
simulation_app.close()
......@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher
# add argparse arguments
parser = argparse.ArgumentParser(description="Random agent for Isaac Orbit environments.")
parser = argparse.ArgumentParser(description="Random agent for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
parser.add_argument("--task", type=str, default=None, help="Name of the task.")
......@@ -31,7 +31,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import torch
import traceback
......@@ -43,12 +43,15 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
def main():
"""Random actions agent with Isaac Orbit environment."""
"""Random actions agent with Orbit environment."""
# parse configuration
env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=args_cli.num_envs)
# create environment
env = gym.make(args_cli.task, cfg=env_cfg)
# print info (this is vectorized environment)
print(f"[INFO]: Gym observation space: {env.observation_space}")
print(f"[INFO]: Gym action space: {env.action_space}")
# reset environment
env.reset()
# simulate environment
......@@ -56,9 +59,9 @@ def main():
# run everything in inference mode
with torch.inference_mode():
# sample actions from -1 to 1
actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1
actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
# apply actions
_, _, _, _ = env.step(actions)
env.step(actions)
# close the simulator
env.close()
......
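
The hunk above drops the four-way unpacking of `env.step(actions)`. Under the Gymnasium API, `step` returns five values (observation, reward, terminated, truncated, info) and `reset` returns an (observation, info) pair, so the old four-element unpacking would fail. The sketch below illustrates only that arity change. `ToyEnv` is a hypothetical stand-in written for this example and does not use Gymnasium or Orbit.

```python
import random


class ToyEnv:
    """Hypothetical stand-in that only mimics the return arity of a Gymnasium environment."""

    def reset(self, seed=None):
        self._rng = random.Random(seed)
        self._steps = 0
        # Gymnasium-style reset: (observation, info)
        return 0.0, {}

    def step(self, action):
        self._steps += 1
        obs = self._rng.uniform(-1.0, 1.0)
        reward = -abs(action)
        terminated = False            # task-defined end of episode
        truncated = self._steps >= 3  # time-limit style cut-off
        # Gymnasium-style step: five values instead of Gym 0.21's four
        return obs, reward, terminated, truncated, {}


env = ToyEnv()
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(0.5)
# A single "done" flag, as in the older API, can be recovered like this.
done = terminated or truncated

# The pre-upgrade four-way unpacking no longer matches the return value.
try:
    _, _, _, _ = env.step(0.5)
    four_tuple_ok = True
except ValueError:
    four_tuple_ok = False

print(done, four_tuple_ok)
```

Calling `env.step(actions)` without unpacking, as the updated scripts do, sidesteps the mismatch when the return values are not needed.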
......@@ -36,7 +36,7 @@ simulation_app = app_launcher.app
"""Rest everything else."""
import gym
import gymnasium as gym
import torch
import traceback
from enum import Enum
......
......@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher
# add argparse arguments
parser = argparse.ArgumentParser(description="Keyboard teleoperation for Isaac Orbit environments.")
parser = argparse.ArgumentParser(description="Keyboard teleoperation for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
parser.add_argument("--device", type=str, default="keyboard", help="Device for interacting with environment")
......@@ -33,7 +33,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import torch
import traceback
......
......@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher
# add argparse arguments
parser = argparse.ArgumentParser(description="Zero agent for Isaac Orbit environments.")
parser = argparse.ArgumentParser(description="Zero agent for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
parser.add_argument("--task", type=str, default=None, help="Name of the task.")
......@@ -30,7 +30,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import torch
import traceback
......@@ -42,12 +42,15 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
def main():
"""Zero actions agent with Isaac Orbit environment."""
"""Zero actions agent with Orbit environment."""
# parse configuration
env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=args_cli.num_envs)
# create environment
env = gym.make(args_cli.task, cfg=env_cfg)
# print info (this is vectorized environment)
print(f"[INFO]: Gym observation space: {env.observation_space}")
print(f"[INFO]: Gym action space: {env.action_space}")
# reset environment
env.reset()
# simulate environment
......@@ -55,9 +58,9 @@ def main():
# run everything in inference mode
with torch.inference_mode():
# compute zero actions
actions = torch.zeros((env.num_envs, env.action_space.shape[0]), device=env.device)
actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
# apply actions
_, _, _, _ = env.step(actions)
env.step(actions)
# close the simulator
env.close()
......
......@@ -37,7 +37,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import math
import os
import torch
......
......@@ -41,7 +41,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import math
import os
import traceback
......@@ -96,13 +96,14 @@ def main():
clip_actions = agent_cfg["params"]["env"].get("clip_actions", math.inf)
# create isaac environment
env = gym.make(args_cli.task, cfg=env_cfg)
env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
# wrap for video recording
if args_cli.video:
video_kwargs = {
"video_folder": os.path.join(log_dir, "videos"),
"step_trigger": lambda step: step % args_cli.video_interval == 0,
"video_length": args_cli.video_length,
"disable_logger": True,
}
print("[INFO] Recording videos during training.")
print_dict(video_kwargs, nesting=4)
......
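
The `step_trigger` entry in `video_kwargs` is a predicate over the global step count. A small sketch of which steps it selects is shown below. The values of `video_interval` and `video_length` are hypothetical stand-ins for the command-line arguments, and the recording windows assume each clip runs for `video_length` steps from the step that fired the trigger.

```python
# Hypothetical values standing in for args_cli.video_interval and args_cli.video_length.
video_interval = 2000
video_length = 200


def step_trigger(step: int) -> bool:
    """Same predicate as the lambda in video_kwargs."""
    return step % video_interval == 0


# Steps in the first 5000 at which a recording would be started.
trigger_steps = [step for step in range(5000) if step_trigger(step)]
# Assumed (start, end) step window covered by each clip.
windows = [(start, start + video_length) for start in trigger_steps]

print(trigger_steps)
print(windows)
```

Note that step 0 satisfies the predicate, so a clip is started at the very beginning of the run.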
......@@ -3,7 +3,7 @@
#
# SPDX-License-Identifier: BSD-3-Clause
"""Script to collect demonstrations with Isaac Orbit environments."""
"""Script to collect demonstrations with Orbit environments."""
from __future__ import annotations
......@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher
# add argparse arguments
parser = argparse.ArgumentParser(description="Collect demonstrations for Isaac Orbit environments.")
parser = argparse.ArgumentParser(description="Collect demonstrations for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
parser.add_argument("--task", type=str, default=None, help="Name of the task.")
......@@ -35,7 +35,7 @@ simulation_app = app_launcher.app
import contextlib
import gym
import gymnasium as gym
import os
import torch
import traceback
......
......@@ -15,7 +15,7 @@ import argparse
from omni.isaac.orbit.app import AppLauncher
# add argparse arguments
parser = argparse.ArgumentParser(description="Play policy trained using robomimic for Isaac Orbit environments.")
parser = argparse.ArgumentParser(description="Play policy trained using robomimic for Orbit environments.")
parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
parser.add_argument("--task", type=str, default=None, help="Name of the task.")
parser.add_argument("--checkpoint", type=str, default=None, help="Pytorch model checkpoint to load.")
......@@ -31,7 +31,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import torch
import traceback
......@@ -46,7 +46,7 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
def main():
"""Run a trained policy from robomimic with Isaac Orbit environment."""
"""Run a trained policy from robomimic with Orbit environment."""
# parse configuration
env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=1)
# modify configuration
......
......@@ -54,7 +54,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import argparse
import gym
import gymnasium as gym
import json
import numpy as np
import os
......
......@@ -36,7 +36,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import os
import torch
import traceback
......
......@@ -47,7 +47,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import os
import torch
import traceback
......@@ -88,13 +88,14 @@ def main():
log_dir = os.path.join(log_root_path, log_dir)
# create isaac environment
env = gym.make(args_cli.task, cfg=env_cfg)
env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
# wrap for video recording
if args_cli.video:
video_kwargs = {
"video_folder": os.path.join(log_dir, "videos"),
"step_trigger": lambda step: step % args_cli.video_interval == 0,
"video_length": args_cli.video_length,
"disable_logger": True,
}
print("[INFO] Recording videos during training.")
print_dict(video_kwargs, nesting=4)
......
......@@ -33,7 +33,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import torch
import traceback
......
......@@ -43,7 +43,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import os
import traceback
from datetime import datetime
......@@ -95,6 +95,7 @@ def main():
"video_folder": os.path.join(log_dir, "videos"),
"step_trigger": lambda step: step % args_cli.video_interval == 0,
"video_length": args_cli.video_length,
"disable_logger": True,
}
print("[INFO] Recording videos during training.")
print_dict(video_kwargs, nesting=4)
......
......@@ -38,7 +38,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import torch
import traceback
......
......@@ -48,7 +48,7 @@ simulation_app = app_launcher.app
"""Rest everything follows."""
import gym
import gymnasium as gym
import traceback
from datetime import datetime
......@@ -97,13 +97,14 @@ def main():
dump_pickle(os.path.join(log_dir, "params", "agent.pkl"), experiment_cfg)
# create isaac environment
env = gym.make(args_cli.task, cfg=env_cfg)
env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
# wrap for video recording
if args_cli.video:
video_kwargs = {
"video_folder": os.path.join(log_dir, "videos"),
"step_trigger": lambda step: step % args_cli.video_interval == 0,
"video_length": args_cli.video_length,
"disable_logger": True,
}
print("[INFO] Recording videos during training.")
print_dict(video_kwargs, nesting=4)
......