Upgrades environments from Gym 0.21 to Gymnasium 0.29 (#234)

# Description

Currently, we are downgrading many libraries to be able to use the Gym 0.21.0 version. However, this is not great and is causing issues installing new Python packages, as highlighted in #204. It is becoming a more significant issue with Python 3.10 in Isaac Sim 2023.1.

This MR upgrades the repository to use the Gymnasium Environment class.

## Type of Change

- Bug fix (non-breaking change which fixes an issue)
- Breaking change (fix or feature that would cause existing functionality to not work as expected)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./orbit.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

---------

Signed-off-by: Mayank Mittal <12863862+Mayankm96@users.noreply.github.com>
Co-authored-by: David Hoeller <dhoeller@ethz.ch>

Unverified Commit cd2c4f1d authored Nov 07, 2023 by Mayank Mittal Committed by GitHub Nov 07, 2023
65 changed files
--- a/docs/source/api/orbit_tasks.isaac_env.rst
+++ b/docs/source/api/orbit_tasks.isaac_env.rst
@@ -4,7 +4,7 @@ omni.isaac.orbit_tasks.isaac_env
 We use OpenAI Gym registry to register the environment and their default configuration file.
 The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
 The string is parsed into respective configuration container which needs to be passed to the environment
-class. This is done using the function :meth:`load_default_env_cfg` in the sub-module
+class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
 :mod:`omni.isaac.orbit.utils.parse_cfg`.
@@ -17,12 +17,12 @@ class. This is done using the function :meth:`load_default_env_cfg` in the sub-m
 .. code-block:: python
-   import gym
+   import gymnasium as gym
   import omni.isaac.orbit_tasks
-   from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg
+   from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
   task_name = "Isaac-Cartpole-v0"
-   cfg = load_default_env_cfg(task_name)
+   cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
   env = gym.make(task_name, cfg=cfg)

--- a/docs/source/refs/issues.rst
+++ b/docs/source/refs/issues.rst
 Known issues
 ============
-Installation errors due to gym==0.21.0
--------------------------------------
-When installing the gym package, you may encounter the following error:
-.. code-block::
-    error in gym setup command: 'extras_require' must be a dictionary whose values are strings or lists of
-    strings containing valid project/version requirement specifiers.
-    ----------------------------------------
-    ERROR: Could not find a version that satisfies the requirement gym==0.21.0 (from omni-isaac-orbit-envs[all])
-    (from versions: 0.0.2, 0.0.3, 0.0.4, 0.0.5, 0.0.6, 0.0.7, 0.1.0, 0.1.1, 0.1.2, 0.1.3, 0.1.4, 0.1.5, 0.1.6,
-    ...
-    0.15.7, 0.16.0, 0.17.0, 0.17.1, 0.17.2, 0.17.3, 0.18.0, 0.18.3, 0.19.0, 0.20.0, 0.21.0, 0.22.0, 0.23.0,
-    0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.25.1, 0.25.2, 0.26.0, 0.26.1, 0.26.2)
-    ERROR: No matching distribution found for gym==0.21.0
-This issue arises since the ``setuptools`` package from version 67.0 onwards does not support malformed version strings.
-Since the OpenAI Gym package that is no longer being maintained (`issue link <https://github.com/openai/gym/issues/3200>`_),
-the current workaround is to install the ``setuptools`` package version 66.0.0. You can do this by running the following
-command:
-.. code-block:: bash
-    ./orbit.sh -p -m pip install -U setuptools==66
 Regression in Isaac Sim 2022.2.1
 --------------------------------

--- a/docs/source/setup/installation.rst
+++ b/docs/source/setup/installation.rst
@@ -157,7 +157,7 @@ utilities to manage extensions:
   optional arguments:
      -h, --help           Display the help content.
-      -i, --install        Install the extensions inside Isaac Orbit.
+      -i, --install        Install the extensions inside Orbit.
      -e, --extra          Install extra dependencies such as the learning frameworks.
      -f, --format         Run pre-commit to format the code and check lints.
      -p, --python         Run the python executable (python.sh) provided by Isaac Sim.

--- a/docs/source/setup/sample.rst
+++ b/docs/source/setup/sample.rst
@@ -141,7 +141,7 @@ format.
   .. code:: bash
      # install python module (for robomimic)
-      ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[robomimic]'
+      ./orbit.sh -e robomimic
      # split data
      ./orbit.sh -p source/standalone//workflows/robomimic/tools/split_train_val.py logs/robomimic/Isaac-Lift-Franka-v0/hdf_dataset.hdf5 --ratio 0.2
@@ -171,7 +171,7 @@ from the environments into the respective libraries function argument and return
   .. code:: bash
      # install python module (for stable-baselines3)
-      ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[sb3]'
+      ./orbit.sh -e sb3
      # run script for training
      # note: we enable cpu flag since SB3 doesn't optimize for GPU anyway
      ./orbit.sh -p source/standalone/workflows/sb3/train.py --task Isaac-Cartpole-v0 --headless --cpu
@@ -184,7 +184,7 @@ from the environments into the respective libraries function argument and return
   .. code:: bash
      # install python module (for skrl)
-      ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[skrl]'
+      ./orbit.sh -e skrl
      # run script for training
      ./orbit.sh -p source/standalone/workflows/skrl/train.py --task Isaac-Reach-Franka-v0 --headless
      # run script for playing with 32 environments
@@ -196,7 +196,7 @@ from the environments into the respective libraries function argument and return
   .. code:: bash
      # install python module (for rl-games)
-      ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[rl_games]'
+      ./orbit.sh -e rl_games
      # run script for training
      ./orbit.sh -p source/standalone/workflows/rl_games/train.py --task Isaac-Ant-v0 --headless
      # run script for playing with 32 environments
@@ -208,7 +208,7 @@ from the environments into the respective libraries function argument and return
   .. code:: bash
      # install python module (for rsl-rl)
-      ./orbit.sh -p -m pip install -e 'source/extensions/omni.isaac.orbit_tasks[rsl_rl]'
+      ./orbit.sh -e rsl_rl
      # run script for training
      ./orbit.sh -p source/standalone/workflows/rsl_rl/train.py --task Isaac-Reach-Franka-v0 --headless
      # run script for playing with 32 environments

--- a/docs/source/tutorials_envs/00_gym_env.rst
+++ b/docs/source/tutorials_envs/00_gym_env.rst
@@ -39,11 +39,12 @@ an environment by calling ``gym.make``. The environments are registered in the `
    gym.register(
        id="Isaac-Cartpole-v0",
        entry_point="omni.isaac.orbit_tasks.classic.cartpole:CartpoleEnv",
-        kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.classic.cartpole:cartpole_cfg.yaml"},
+        disable_env_checker=True,
+        kwargs={"env_cfg_entry_point": "omni.isaac.orbit_tasks.classic.cartpole:cartpole_cfg.yaml"},
    )
-The ``cfg_entry_point`` argument is used to load the default configuration for the environment. The default
+The ``env_cfg_entry_point`` argument is used to load the default configuration for the environment. The default
-configuration is loaded using the :meth:`omni.isaac.orbit_tasks.utils.parse_cfg.load_default_env_cfg` function.
+configuration is loaded using the :meth:`omni.isaac.orbit_tasks.utils.parse_cfg.load_cfg_from_registry` function.
 The configuration entry point can correspond to both a YAML file or a python configuration
 class. The default configuration can be overridden by passing a custom configuration instance to the ``gym.make``
 function as shown later in the tutorial.
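
As a rough illustration of the lookup described above, the sketch below resolves a registry `kwargs` entry that is either a `module:Class` string or a `module:file.yaml` string. It is not the actual `load_cfg_from_registry` implementation: `resolve_cfg_entry_point`, the `FAKE_REGISTRY` dict, and the use of `collections:OrderedDict` as a stand-in configuration class are all hypothetical, and the YAML branch only reports the file name instead of parsing it.

```python
import importlib

# Hypothetical stand-in for the Gymnasium registry: task id -> kwargs given to gym.register().
FAKE_REGISTRY = {
    "Demo-Class-v0": {"env_cfg_entry_point": "collections:OrderedDict"},
    "Demo-Yaml-v0": {"env_cfg_entry_point": "some.package:demo_cfg.yaml"},
}


def resolve_cfg_entry_point(task_name: str, entry_point_key: str):
    """Return ("yaml", file_name) or ("class", instance) for the registered entry point."""
    entry_point = FAKE_REGISTRY[task_name].get(entry_point_key)
    if entry_point is None:
        raise ValueError(f"No entry point '{entry_point_key}' registered for task '{task_name}'.")
    module_name, attr_name = entry_point.split(":")
    if attr_name.endswith(".yaml"):
        # A real loader would locate the file inside the package and parse it.
        return "yaml", attr_name
    # Import the module and instantiate the configuration class with its defaults.
    cfg_cls = getattr(importlib.import_module(module_name), attr_name)
    return "class", cfg_cls()


kind, cfg = resolve_cfg_entry_point("Demo-Class-v0", "env_cfg_entry_point")
print(kind, type(cfg).__name__)
```

Keeping the entry point as a string means the configuration module is imported only when a task is actually requested, which is presumably why the registry stores strings rather than classes.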

--- a/docs/source/tutorials_envs/02_wrappers.rst
+++ b/docs/source/tutorials_envs/02_wrappers.rst
@@ -26,13 +26,13 @@ For example, here is how you would wrap an environment to enforce that reset is
    """Rest everything follows."""
-    import gym
+    import gymnasium as gym
    import omni.isaac.orbit_tasks  # noqa: F401
-    from omni.isaac.orbit_tasks.utils import load_default_env_cfg
+    from omni.isaac.orbit_tasks.utils import load_cfg_from_registry
    # create base environment
-    cfg = load_default_env_cfg("Isaac-Reach-Franka-v0")
+    cfg = load_cfg_from_registry("Isaac-Reach-Franka-v0", "env_cfg_entry_point")
    env = gym.make("Isaac-Reach-Franka-v0", cfg=cfg)
    # wrap environment to enforce that reset is called before step
    env = gym.wrappers.OrderEnforcing(env)
@@ -105,7 +105,7 @@ for 200 steps, and saves it in the ``videos`` folder at a step interval of 1500
    """Rest everything follows."""
-    import gym
+    import gymnasium as gym
    # adjust camera resolution and pose
    env_cfg.viewer.resolution = (640, 480)

--- a/orbit.sh
+++ b/orbit.sh
@@ -185,7 +185,7 @@ print_help () {
    echo -e "\nusage: $(basename "$0") [-h] [-i] [-e] [-f] [-p] [-s] [-o] [-v] [-d] [-c] -- Utility to manage extensions in Orbit."
    echo -e "\noptional arguments:"
    echo -e "\t-h, --help           Display the help content."
-    echo -e "\t-i, --install        Install the extensions inside Isaac Orbit."
+    echo -e "\t-i, --install        Install the extensions inside Orbit."
    echo -e "\t-e, --extra          Install extra dependencies such as the learning frameworks."
    echo -e "\t-f, --format         Run pre-commit to format the code and check lints."
    echo -e "\t-p, --python         Run the python executable (python.sh) provided by Isaac Sim."
@@ -220,9 +220,6 @@ while [[ $# -gt 0 ]]; do
            # this does not check dependencies between extensions
            export -f extract_python_exe
            export -f install_orbit_extension
-            # downgrade setuptools to avoid issues with OpenAI Gym
-            # Check the `Known Issues` section in the documentation
-            $(extract_python_exe) -m pip install --upgrade setuptools==66
            # source directory
            find -L "${ORBIT_PATH}/source/extensions" -mindepth 1 -maxdepth 1 -type d -exec bash -c 'install_orbit_extension "{}"' \;
            # unset local variables
@@ -235,8 +232,17 @@ while [[ $# -gt 0 ]]; do
            # install the python packages for supported reinforcement learning frameworks
            echo "[INFO] Installing extra requirements such as learning frameworks..."
            python_exe=$(extract_python_exe)
+            # check if specified which rl-framework to install
+            if [ -z "$2" ]; then
+                echo "[INFO] Installing all rl-frameworks..."
+                framework_name="all"
+            else
+                echo "[INFO] Installing rl-framework: $2"
+                framework_name=$2
+                shift # past argument
+            fi
            # install the rl-frameworks specified
-            ${python_exe} -m pip install -e ${ORBIT_PATH}/source/extensions/omni.isaac.orbit_tasks[all]
+            ${python_exe} -m pip install -e ${ORBIT_PATH}/source/extensions/omni.isaac.orbit_tasks["${framework_name}"]
            shift # past argument
            ;;
        -c|--conda)
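
The new `-e` branch takes an optional framework name and falls back to `all` when none is given. The same selection rule is sketched below in Python for illustration only; `extras_target` and its arguments are hypothetical names, and the sketch only builds the target string that pip would receive without running pip.

```python
from __future__ import annotations


def extras_target(orbit_path: str, framework_name: str | None = None) -> str:
    """Build the editable-install target for the requested extra."""
    # An empty or missing name selects every supported framework, as in the shell branch.
    name = framework_name if framework_name else "all"
    return f"{orbit_path}/source/extensions/omni.isaac.orbit_tasks[{name}]"


print(extras_target("/opt/orbit"))
print(extras_target("/opt/orbit", "sb3"))
```

As written, the shell branch only tests whether `$2` is empty, so an option placed directly after `-e` (for example `./orbit.sh -e -c`) would be read as the framework name.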

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -27,7 +27,7 @@ extra_standard_library = [
    "tensordict",
    "bpy",
    "matplotlib",
-    "gym",
+    "gymnasium",
    "scipy",
    "hid",
    "yaml",

--- a/source/extensions/omni.isaac.contrib_tasks/docs/README.md
+++ b/source/extensions/omni.isaac.contrib_tasks/docs/README.md
@@ -18,9 +18,12 @@ itself. However, its various instances should be included in directories within
 The environments should then be registered in the `omni/isaac/contrib_tasks/__init__.py`:
 ```python
+import gymnasium as gym
 gym.register(
    id="Isaac-Contrib-<my-awesome-env>-v0",
    entry_point="omni.isaac.contrib_tasks.<your-env-package>:<your-env-class>",
+    disable_env_checker=True,
    kwargs={"cfg_entry_point": "omni.isaac.contrib_tasks.<your-env-package-cfg>:<your-env-class-cfg>"},
 )
 ```
--- a/source/extensions/omni.isaac.contrib_tasks/omni/isaac/contrib_tasks/__init__.py
+++ b/source/extensions/omni.isaac.contrib_tasks/omni/isaac/contrib_tasks/__init__.py
@@ -9,7 +9,7 @@
 We use OpenAI Gym registry to register the environment and their default configuration file.
 The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
 The string is parsed into respective configuration container which needs to be passed to the environment
-class. This is done using the function :meth:`load_default_env_cfg` in the sub-module
+class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
 :mod:`omni.isaac.orbit.utils.parse_cfg`.
 Note:
@@ -18,18 +18,18 @@ Note:
    the kwarg argument :obj:`cfg` while creating the environment.
 Usage:
-    >>> import gym
+    >>> import gymnasium as gym
    >>> import omni.isaac.contrib_tasks
-    >>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg
+    >>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
    >>>
    >>> task_name = "Isaac-Contrib-<my-registered-env-name>-v0"
-    >>> cfg = load_default_env_cfg(task_name)
+    >>> cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
    >>> env = gym.make(task_name, cfg=cfg)
 """
 from __future__ import annotations
-import gym  # noqa: F401
+import gymnasium as gym  # noqa: F401
 import os
 import toml

--- a/source/extensions/omni.isaac.contrib_tasks/setup.py
+++ b/source/extensions/omni.isaac.contrib_tasks/setup.py
@@ -28,6 +28,10 @@ setup(
    include_package_data=True,
    python_requires=">=3.7",
    packages=["omni.isaac.contrib_tasks"],
-    classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"],
+    classifiers=[
+        "Natural Language :: English",
+        "Programming Language :: Python :: 3.10",
+        "Isaac Sim :: 2023.1.0-hotfix.1",
+    ],
    zip_safe=False,
 )
--- a/source/extensions/omni.isaac.orbit/config/extension.toml
+++ b/source/extensions/omni.isaac.orbit/config/extension.toml
 [package]
 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.9.37"
+version = "0.9.38"
 # Description
 title = "ORBIT framework for Robot Learning"

--- a/source/extensions/omni.isaac.orbit/docs/CHANGELOG.rst
+++ b/source/extensions/omni.isaac.orbit/docs/CHANGELOG.rst
 Changelog
 ---------
+0.9.38 (2023-11-07)
+~~~~~~~~~~~~~~~~~~~
+Changed
+^^^^^^^
+* Upgraded the :class:`omni.isaac.orbit.envs.RLTaskEnv` class to support Gym 0.29.0 environment definition.
+Added
+^^^^^
+* Added computation of ``time_outs`` and ``terminated`` signals inside the termination manager. These follow the
+  definition mentioned in `Gym 0.29.0 <https://gymnasium.farama.org/tutorials/gymnasium_basics/handling_time_limits/>`_.
+* Added proper handling of observation and action spaces in the :class:`omni.isaac.orbit.envs.RLTaskEnv` class.
+  These now follow closely to how Gym VecEnv handles the spaces.
 0.9.37 (2023-11-06)
 ~~~~~~~~~~~~~~~~~~~
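
The distinction between ``terminated`` and ``time_outs`` introduced in 0.9.38 matters mostly for value bootstrapping: following the Gymnasium time-limit tutorial linked in the changelog entry, the next-state value is dropped only when an episode truly terminates, not when it is cut off by a time limit. A minimal sketch in plain Python, with a hypothetical helper `td_target` and made-up numbers:

```python
def td_target(reward: float, next_value: float, terminated: bool, gamma: float = 0.99) -> float:
    """One-step TD target that bootstraps unless the episode reached a terminal state."""
    # A time-out (truncation) is deliberately not an input: the state it ends in is not terminal.
    return reward + gamma * next_value * (0.0 if terminated else 1.0)


# Episode ended in a terminal state: no bootstrapping.
print(td_target(1.0, 10.0, terminated=True))  # 1.0
# Episode only hit the time limit: still bootstrap from the next state's value.
print(td_target(1.0, 10.0, terminated=False))
```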

--- a/source/extensions/omni.isaac.orbit/omni/isaac/orbit/envs/rl_task_env.py
+++ b/source/extensions/omni.isaac.orbit/omni/isaac/orbit/envs/rl_task_env.py
--- a/source/extensions/omni.isaac.orbit/omni/isaac/orbit/managers/observation_manager.py
+++ b/source/extensions/omni.isaac.orbit/omni/isaac/orbit/managers/observation_manager.py
@@ -91,6 +91,11 @@ class ObservationManager(ManagerBase):
        """Shape of observation tensor for each term in each group."""
        return self._group_obs_term_dim
+    @property
+    def group_obs_concatenate(self) -> dict[str, bool]:
+        """Whether the observation terms are concatenated in each group."""
+        return self._group_obs_concatenate
    """
    Operations.
    """

--- a/source/extensions/omni.isaac.orbit/omni/isaac/orbit/managers/termination_manager.py
+++ b/source/extensions/omni.isaac.orbit/omni/isaac/orbit/managers/termination_manager.py
@@ -26,8 +26,20 @@ class TerminationManager(ManagerBase):
    argument and returns a boolean tensor of shape ``(num_envs,)``. The termination manager
    computes the termination signal as the union (logical or) of all the termination terms.
+    Following the `Gymnasium API <https://gymnasium.farama.org/tutorials/gymnasium_basics/handling_time_limits/>`_,
+    the termination signal is computed as the logical OR of the following signals:
+    * **Time-out**: This signal is set to true if the environment has ended after an externally defined condition
+      (that is outside the scope of a MDP). For example, the environment may be terminated if the episode has
+      timed out (i.e. reached max episode length).
+    * **Terminated**: This signal is set to true if the environment has reached a terminal state defined by the
+      environment. This state may correspond to task success, task failure, robot falling, etc.
+    These signals can be individually accessed using the :attr:`time_outs` and :attr:`terminated` properties.
    The termination terms are parsed from a config class containing the manager's settings and each term's
-    parameters. Each termination term should instantiate the :class:`TerminationTermCfg` class.
+    parameters. Each termination term should instantiate the :class:`TerminationTermCfg` class. The term's
+    configuration :attr:`TerminationTermCfg.time_out` decides whether the term is a timeout or a termination term.
    """
    _env: RLTaskEnv
@@ -46,8 +58,8 @@ class TerminationManager(ManagerBase):
        for term_name in self._term_names:
            self._episode_dones[term_name] = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
        # create buffer for managing termination per environment
-        self._done_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
+        self._truncated_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
-        self._time_out_buf = torch.zeros_like(self._done_buf)
+        self._terminated_buf = torch.zeros_like(self._truncated_buf)
    def __str__(self) -> str:
        """Returns: A string representation for termination manager."""
@@ -79,12 +91,26 @@ class TerminationManager(ManagerBase):
    @property
    def dones(self) -> torch.Tensor:
        """The net termination signal. Shape is ``(num_envs,)``."""
-        return self._done_buf
+        return self._truncated_buf | self._terminated_buf
    @property
    def time_outs(self) -> torch.Tensor:
-        """The timeout signal. Shape is ``(num_envs,)``."""
+        """The timeout signal (reaching max episode length). Shape is ``(num_envs,)``.
-        return self._time_out_buf
+        This signal is set to true if the environment has ended after an externally defined condition
+        (that is outside the scope of a MDP). For example, the environment may be terminated if the episode has
+        timed out (i.e. reached max episode length).
+        """
+        return self._truncated_buf
+    @property
+    def terminated(self) -> torch.Tensor:
+        """The terminated signal (reaching a terminal state). Shape is ``(num_envs,)``.
+        This signal is set to true if the environment has reached a terminal state defined by the environment.
+        This state may correspond to task success, task failure, robot falling, etc.
+        """
+        return self._terminated_buf
    """
    Operations.
@@ -122,20 +148,20 @@ class TerminationManager(ManagerBase):
            The combined termination signal of shape ``(num_envs,)``.
        """
        # reset computation
-        self._done_buf[:] = False
+        self._truncated_buf[:] = False
-        self._time_out_buf[:] = False
+        self._terminated_buf[:] = False
        # iterate over all the termination terms
        for name, term_cfg in zip(self._term_names, self._term_cfgs):
            value = term_cfg.func(self._env, **term_cfg.params)
-            # update total termination
-            self._done_buf |= value
            # store timeout signal separately
            if term_cfg.time_out:
-                self._time_out_buf |= value
+                self._truncated_buf |= value
+            else:
+                self._terminated_buf |= value
            # add to episode dones
            self._episode_dones[name] |= value
-        # return termination signal
+        # return combined termination signal
-        return self._done_buf
+        return self._truncated_buf | self._terminated_buf
    """
    Operations - Term settings.
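
The term-wise split in `compute` can be pictured with a small stand-in that uses plain Python lists in place of `torch` tensors. `split_terminations` and the example terms are hypothetical; only the rule (time-out terms feed the truncated buffer, all other terms feed the terminated buffer, and `dones` is their element-wise OR) is taken from the diff above.

```python
from __future__ import annotations


def split_terminations(
    terms: list[tuple[bool, list[bool]]],
) -> tuple[list[bool], list[bool], list[bool]]:
    """Combine per-term signals into (truncated, terminated, dones), one entry per environment.

    Each term is a pair ``(is_time_out, values)``.
    """
    num_envs = len(terms[0][1])
    truncated = [False] * num_envs
    terminated = [False] * num_envs
    for is_time_out, values in terms:
        # Route the term into exactly one of the two buffers.
        target = truncated if is_time_out else terminated
        for i, value in enumerate(values):
            target[i] = target[i] or value
    dones = [a or b for a, b in zip(truncated, terminated)]
    return truncated, terminated, dones


terms = [
    (True, [False, True, False]),   # episode-length time-out
    (False, [True, False, False]),  # e.g. robot fell over
]
truncated, terminated, dones = split_terminations(terms)
print(truncated, terminated, dones)
```

Because every term lands in exactly one buffer, the two signals can be reported separately to a learning framework while `dones` still reproduces the previous combined behavior.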

--- a/source/extensions/omni.isaac.orbit/omni/isaac/orbit/sim/simulation_context.py
+++ b/source/extensions/omni.isaac.orbit/omni/isaac/orbit/sim/simulation_context.py
@@ -292,13 +292,13 @@ class SimulationContext(_SimulationContext):
                # hide the viewport and disable updates
                self._viewport_context.updates_enabled = False  # pyright: ignore [reportOptionalMemberAccess]
                self._viewport_window.visible = False  # pyright: ignore [reportOptionalMemberAccess]
-                # reset the throttle counter
-                self._render_throttle_counter = 0
            elif mode == self.RenderMode.NO_RENDERING:
                # hide the viewport and disable updates
                if self._viewport_context is not None:
                    self._viewport_context.updates_enabled = False  # pyright: ignore [reportOptionalMemberAccess]
                    self._viewport_window.visible = False  # pyright: ignore [reportOptionalMemberAccess]
+                # reset the throttle counter
+                self._render_throttle_counter = 0
            else:
                raise ValueError(f"Unsupported render mode: {mode}! Please check `RenderMode` for details.")
            # update render mode
@@ -403,14 +403,21 @@ class SimulationContext(_SimulationContext):
            self._render_throttle_counter += 1
            if self._render_throttle_counter % self._render_throttle_period == 0:
                self._render_throttle_counter = 0
-                # here we don't render viewport so don't need to flush flatcache
+                # here we don't render viewport so don't need to flush fabric data
-                super().render()
+                # note: we don't call super().render() anymore because they do flush the fabric data
+                self.set_setting("/app/player/playSimulations", False)
+                self._app.update()
+                self.set_setting("/app/player/playSimulations", True)
        else:
-            # manually flush the flatcache data to update Hydra textures
+            # manually flush the fabric data to update Hydra textures
            if self._fabric_iface is not None:
                self._fabric_iface.update(0.0, 0.0)
            # render the simulation
-            super().render()
+            # note: we don't call super().render() anymore because they do above operation inside
+            #  and we don't want to do it twice. We may remove it once we drop support for Isaac Sim 2022.2.
+            self.set_setting("/app/player/playSimulations", False)
+            self._app.update()
+            self.set_setting("/app/player/playSimulations", True)
    """
    Operations - Override (extension)
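
The throttled branch above triggers an app update only once every `_render_throttle_period` calls. A stand-alone sketch of that counter logic follows; the `RenderThrottle` class is hypothetical and is not part of Orbit or Isaac Sim.

```python
class RenderThrottle:
    """Signal a render on every `period`-th call and stay idle otherwise."""

    def __init__(self, period: int):
        if period < 1:
            raise ValueError("period must be a positive integer")
        self.period = period
        self.counter = 0

    def tick(self) -> bool:
        """Advance the counter; return True when a render should happen."""
        self.counter += 1
        if self.counter % self.period == 0:
            # Reset so the counter never grows without bound.
            self.counter = 0
            return True
        return False


throttle = RenderThrottle(period=3)
print([throttle.tick() for _ in range(7)])
```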

--- a/source/extensions/omni.isaac.orbit/setup.py
+++ b/source/extensions/omni.isaac.orbit/setup.py
@@ -25,18 +25,17 @@ INSTALL_REQUIRES = [
    # devices
    "hidapi",
    # gym
-    "gym==0.21.0",
+    "gymnasium==0.29.0",
-    "importlib-metadata~=4.13.0",
-    "setuptools<=66",  # setuptools 67.0 breaks gym
    # procedural-generation
    "trimesh",
-    "pyglet==1.5.27",  # pyglet 2.0 requires python 3.8
+    "pyglet==1.5.27; python_version < '3.8'",  # pyglet 2.0 requires python 3.8
+    "pyglet; python_version >= '3.8'",
 ]
 # Installation operation
 setup(
    name="omni-isaac-orbit",
-    author="NVIDIA, ETH Zurich, and University of Toronto",
+    author="ORBIT Project Developers",
    maintainer="Mayank Mittal",
    maintainer_email="mittalma@ethz.ch",
    url=EXTENSION_TOML_DATA["package"]["repository"],
@@ -48,6 +47,10 @@ setup(
    python_requires=">=3.7",
    install_requires=INSTALL_REQUIRES,
    packages=["omni.isaac.orbit"],
-    classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"],
+    classifiers=[
+        "Natural Language :: English",
+        "Programming Language :: Python :: 3.10",
+        "Isaac Sim :: 2023.1.0-hotfix.1",
+    ],
    zip_safe=False,
 )
--- a/source/extensions/omni.isaac.orbit/test/deps/test_torch.py
+++ b/source/extensions/omni.isaac.orbit/test/deps/test_torch.py
@@ -6,6 +6,7 @@
 from __future__ import annotations
 import torch
+import torch.utils.benchmark as benchmark
 import unittest
@@ -124,6 +125,30 @@ class TestTorchOperations(unittest.TestCase):
        my_slice = my_tensor[torch.tensor([0, 1]), ...]
        self.assertNotEqual(my_slice.untyped_storage().data_ptr(), my_tensor.untyped_storage().data_ptr())
+    def test_logical_or(self):
+        """Test bitwise or operation."""
+        size = (400, 300, 5)
+        my_tensor_1 = torch.rand(size, device="cuda:0") > 0.5
+        my_tensor_2 = torch.rand(size, device="cuda:0") < 0.5
+        # check the speed of logical or
+        timer_logical_or = benchmark.Timer(
+            stmt="torch.logical_or(my_tensor_1, my_tensor_2)",
+            globals={"my_tensor_1": my_tensor_1, "my_tensor_2": my_tensor_2},
+        )
+        timer_bitwise_or = benchmark.Timer(
+            stmt="my_tensor_1 | my_tensor_2", globals={"my_tensor_1": my_tensor_1, "my_tensor_2": my_tensor_2}
+        )
+        print("Time for logical or:", timer_logical_or.timeit(number=1000))
+        print("Time for bitwise or:", timer_bitwise_or.timeit(number=1000))
+        # check that logical or works as expected
+        output_logical_or = torch.logical_or(my_tensor_1, my_tensor_2)
+        output_bitwise_or = my_tensor_1 | my_tensor_2
+        self.assertTrue(torch.allclose(output_logical_or, output_bitwise_or))
 if __name__ == "__main__":
    unittest.main()
--- a/source/extensions/omni.isaac.orbit_tasks/config/extension.toml
+++ b/source/extensions/omni.isaac.orbit_tasks/config/extension.toml
 [package]
 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.5.0"
+version = "0.5.1"
 # Description
 title = "ORBIT Environments"

--- a/source/extensions/omni.isaac.orbit_tasks/docs/CHANGELOG.rst
+++ b/source/extensions/omni.isaac.orbit_tasks/docs/CHANGELOG.rst
 Changelog
 ---------
+0.5.1 (2023-11-04)
+~~~~~~~~~~~~~~~~~~
+Fixed
+^^^^^
+* Fixed the wrappers to different learning frameworks to use the new :class:`omni.isaac.orbit_tasks.RLTaskEnv` class.
+  The :class:`RLTaskEnv` class inherits from the :class:`gymnasium.Env` class (Gym 0.29.0).
+* Fixed the registration of tasks in the Gym registry based on Gym 0.29.0 API.
+Changed
+^^^^^^^
+* Removed the inheritance of all the RL-framework specific wrappers from the :class:`gymnasium.Wrapper` class.
+  This is because the wrappers don't comply with the new Gym 0.29.0 API. The wrappers are now only inherit
+  from their respective RL-framework specific base classes.
 0.5.0 (2023-10-30)
 ~~~~~~~~~~~~~~~~~~

--- a/source/extensions/omni.isaac.orbit_tasks/docs/README.md
+++ b/source/extensions/omni.isaac.orbit_tasks/docs/README.md
@@ -17,28 +17,31 @@ This looks like as follows:
 omni/isaac/orbit_tasks/locomotion/
 ├── __init__.py
 └── velocity
-    ├── a1
+    ├── config
-    │   └── flat_terrain_cfg.py
+    │   └── anymal_c
-    ├── anymal_c
+    │       ├── agent  # <- this is where we store the learning agent configurations
-    │   └── flat_terrain_cfg.py
+    │       ├── __init__.py  # <- this is where we register the environment and configurations to gym registry
+    │       ├── flat_env_cfg.py
+    │       └── rough_env_cfg.py
    ├── __init__.py
-    ├── velocity_cfg.py
+    └── velocity_env_cfg.py  # <- this is the base task configuration
-    └── velocity_env.py
 ```
-The environments are then registered in the `omni/isaac/orbit_tasks/__init__.py`:
+The environments are then registered in the `omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_c/__init__.py`:
 ```python
 gym.register(
-    id="Isaac-Velocity-Anymal-C-v0",
+    id="Isaac-Velocity-Rough-Anymal-C-v0",
-    entry_point="omni.isaac.orbit_tasks.locomotion.velocity:LocomotionEnv",
+    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
-    kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.locomotion.velocity.anymal_c.flat_terrain_cfg:FlatTerrainCfg"},
+    disable_env_checker=True,
+    kwargs={"env_cfg_entry_point": f"{__name__}.rough_env_cfg:AnymalCRoughEnvCfg"},
 )
 gym.register(
-    id="Isaac-Velocity-A1-v0",
+    id="Isaac-Velocity-Flat-Anymal-C-v0",
-    entry_point="omni.isaac.orbit_tasks.locomotion.velocity:LocomotionEnv",
+    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
-    kwargs={"cfg_entry_point": "omni.isaac.orbit_tasks.locomotion.velocity.a1.flat_terrain_cfg:FlatTerrainCfg"},
+    disable_env_checker=True,
+    kwargs={"env_cfg_entry_point": f"{__name__}.flat_env_cfg:AnymalCFlatEnvCfg"},
 )
 ```

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/__init__.py
@@ -9,7 +9,7 @@
 We use OpenAI Gym registry to register the environment and their default configuration file.
 The default configuration file is passed to the argument "kwargs" in the Gym specification registry.
 The string is parsed into respective configuration container which needs to be passed to the environment
-class. This is done using the function :meth:`load_default_env_cfg` in the sub-module
+class. This is done using the function :meth:`load_cfg_from_registry` in the sub-module
 :mod:`omni.isaac.orbit.utils.parse_cfg`.
 Note:
@@ -18,12 +18,12 @@ Note:
    the kwarg argument :obj:`cfg` while creating the environment.
 Usage:
-    >>> import gym
+    >>> import gymnasium as gym
    >>> import omni.isaac.orbit_tasks
-    >>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_default_env_cfg
+    >>> from omni.isaac.orbit_tasks.utils.parse_cfg import load_cfg_from_registry
    >>>
    >>> task_name = "Isaac-Cartpole-v0"
-    >>> cfg = load_default_env_cfg(task_name)
+    >>> cfg = load_cfg_from_registry(task_name, "env_cfg_entry_point")
    >>> env = gym.make(task_name, cfg=cfg)
 """

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/ant/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/ant/__init__.py
@@ -7,7 +7,7 @@
 Ant locomotion environment (similar to OpenAI Gym Ant-v2).
 """
-import gym
+import gymnasium as gym
 from . import agents

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/ant/ant_env.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/ant/ant_env.py
@@ -5,7 +5,7 @@
 from __future__ import annotations
-import gym.spaces
+import gymnasium as gym
 import math
 import torch

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/cartpole/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/cartpole/__init__.py
@@ -7,7 +7,7 @@
 Cartpole balancing environment.
 """
-import gym
+import gymnasium as gym
 from . import agents

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/cartpole/cartpole_env.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/cartpole/cartpole_env.py
@@ -5,7 +5,7 @@
 from __future__ import annotations
-import gym.spaces
+import gymnasium as gym
 import math
 import torch

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/humanoid/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/humanoid/__init__.py
@@ -7,7 +7,7 @@
 Humanoid locomotion environment (similar to OpenAI Gym Humanoid-v2).
 """
-import gym
+import gymnasium as gym
 from . import agents

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/humanoid/humanoid_env.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/classic/humanoid/humanoid_env.py
@@ -5,7 +5,7 @@
 from __future__ import annotations
-import gym.spaces
+import gymnasium as gym
 import math
 import torch

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_b/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_b/__init__.py
@@ -3,7 +3,7 @@
 #
 # SPDX-License-Identifier: BSD-3-Clause
-import gym
+import gymnasium as gym
 from . import agents, flat_env_cfg, rough_env_cfg
@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
 gym.register(
    id="Isaac-Velocity-Flat-Anymal-B-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": flat_env_cfg.AnymalBFlatEnvCfg,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBFlatPPORunnerCfg,
@@ -23,6 +24,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Flat-Anymal-B-Play-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": flat_env_cfg.AnymalBFlatEnvCfg_PLAY,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBFlatPPORunnerCfg,
@@ -32,6 +34,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Rough-Anymal-B-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": rough_env_cfg.AnymalBRoughEnvCfg,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBRoughPPORunnerCfg,
@@ -41,6 +44,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Rough-Anymal-B-Play-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": rough_env_cfg.AnymalBRoughEnvCfg_PLAY,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalBRoughPPORunnerCfg,

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_c/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_c/__init__.py
@@ -3,7 +3,7 @@
 #
 # SPDX-License-Identifier: BSD-3-Clause
-import gym
+import gymnasium as gym
 from . import agents, flat_env_cfg, rough_env_cfg
@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
 gym.register(
    id="Isaac-Velocity-Flat-Anymal-C-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": flat_env_cfg.AnymalCFlatEnvCfg,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCFlatPPORunnerCfg,
@@ -24,6 +25,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Flat-Anymal-C-Play-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": flat_env_cfg.AnymalCFlatEnvCfg_PLAY,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCFlatPPORunnerCfg,
@@ -33,6 +35,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Rough-Anymal-C-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": rough_env_cfg.AnymalCRoughEnvCfg,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCRoughPPORunnerCfg,
@@ -42,6 +45,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Rough-Anymal-C-Play-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": rough_env_cfg.AnymalCRoughEnvCfg_PLAY,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalCRoughPPORunnerCfg,

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_d/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/config/anymal_d/__init__.py
@@ -3,7 +3,7 @@
 #
 # SPDX-License-Identifier: BSD-3-Clause
-import gym
+import gymnasium as gym
 from . import agents, flat_env_cfg, rough_env_cfg
@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
 gym.register(
    id="Isaac-Velocity-Flat-Anymal-D-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": flat_env_cfg.AnymalDFlatEnvCfg,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDFlatPPORunnerCfg,
@@ -23,6 +24,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Flat-Anymal-D-Play-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": flat_env_cfg.AnymalDFlatEnvCfg_PLAY,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDFlatPPORunnerCfg,
@@ -32,6 +34,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Rough-Anymal-D-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": rough_env_cfg.AnymalDRoughEnvCfg,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDRoughPPORunnerCfg,
@@ -41,6 +44,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Rough-Anymal-D-Play-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": rough_env_cfg.AnymalDRoughEnvCfg_PLAY,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.AnymalDRoughPPORunnerCfg,

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/config/unitree_a1/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/config/unitree_a1/__init__.py
@@ -3,7 +3,7 @@
 #
 # SPDX-License-Identifier: BSD-3-Clause
-import gym
+import gymnasium as gym
 from . import agents, flat_env_cfg, rough_env_cfg
@@ -14,6 +14,7 @@ from . import agents, flat_env_cfg, rough_env_cfg
 gym.register(
    id="Isaac-Velocity-Flat-Unitree-A1-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": flat_env_cfg.UnitreeA1FlatEnvCfg,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1FlatPPORunnerCfg,
@@ -23,6 +24,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Flat-Unitree-A1-Play-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": flat_env_cfg.UnitreeA1FlatEnvCfg_PLAY,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1FlatPPORunnerCfg,
@@ -32,6 +34,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Rough-Unitree-A1-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": rough_env_cfg.UnitreeA1RoughEnvCfg,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1RoughPPORunnerCfg,
@@ -41,6 +44,7 @@ gym.register(
 gym.register(
    id="Isaac-Velocity-Rough-Unitree-A1-Play-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": rough_env_cfg.UnitreeA1RoughEnvCfg_PLAY,
        "rsl_rl_cfg_entry_point": agents.rsl_rl_cfg.UnitreeA1RoughPPORunnerCfg,

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/velocity_env_cfg.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/locomotion/velocity/velocity_env_cfg.py
@@ -65,7 +65,7 @@ class MySceneCfg(InteractiveSceneCfg):
        offset=RayCasterCfg.OffsetCfg(pos=(0.0, 0.0, 20.0)),
        attach_yaw_only=True,
        pattern_cfg=patterns.GridPatternCfg(resolution=0.1, size=[1.6, 1.0]),
-        debug_vis=True,
+        debug_vis=False,
        mesh_prim_paths=["/World/ground"],
    )
    contact_forces = ContactSensorCfg(prim_path="{ENV_REGEX_NS}/Robot/.*", history_length=3, track_air_time=True)

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/lift/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/lift/__init__.py
@@ -7,7 +7,7 @@
 Environment for lifting an object with fixed-base robot.
 """
-import gym
+import gymnasium as gym
 from . import agents
@@ -18,6 +18,7 @@ from . import agents
 gym.register(
    id="Isaac-Lift-Franka-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": f"{__name__}.lift_env_cfg:LiftEnvCfg",
        "rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/lift/lift_env.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/lift/lift_env.py
@@ -5,7 +5,7 @@
 from __future__ import annotations
-import gym.spaces
+import gymnasium as gym
 import math
 import torch

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/reach/__init__.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/reach/__init__.py
@@ -5,7 +5,7 @@
 """Environment for end-effector pose tracking task for fixed-arm robots."""
-import gym
+import gymnasium as gym
 from . import agents
@@ -16,6 +16,7 @@ from . import agents
 gym.register(
    id="Isaac-Reach-Franka-v0",
    entry_point="omni.isaac.orbit.envs:RLTaskEnv",
+    disable_env_checker=True,
    kwargs={
        "env_cfg_entry_point": f"{__name__}.reach_env_cfg:ReachEnvCfg",
        "rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/reach/reach_env.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/manipulation/reach/reach_env.py
@@ -5,7 +5,7 @@
 from __future__ import annotations
-import gym.spaces
+import gymnasium as gym
 import math
 import torch

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/parse_cfg.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/parse_cfg.py
@@ -7,7 +7,7 @@
 from __future__ import annotations
-import gym
+import gymnasium as gym
 import importlib
 import inspect
 import os
@@ -52,7 +52,7 @@ def load_cfg_from_registry(task_name: str, entry_point_key: str) -> dict | Any:
        ValueError: If the entry point key is not available in the gym registry for the task.
    """
    # obtain the configuration entry point
-    cfg_entry_point = gym.spec(task_name)._kwargs.pop(entry_point_key)
+    cfg_entry_point = gym.spec(task_name).kwargs.pop(entry_point_key)
    # check if entry point exists
    if cfg_entry_point is None:
        raise ValueError(
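
To make the lookup in the hunk above concrete, the following is a minimal sketch that uses a plain dictionary in place of the Gymnasium registry. `MOCK_REGISTRY`, `lookup_cfg_entry_point`, and the `my_pkg...` entry-point string are hypothetical names for illustration. The sketch deliberately swaps `dict.pop` for `dict.get`: in plain Python, `pop` without a default raises `KeyError` for a missing key and removes the entry on success, whereas `get` returns `None` and leaves the stored kwargs untouched.

```python
from __future__ import annotations

from typing import Any

# Hypothetical stand-in for the registry: task id -> registration kwargs
MOCK_REGISTRY: dict[str, dict[str, Any]] = {
    "Isaac-Cartpole-v0": {"env_cfg_entry_point": "my_pkg.cartpole_cfg:CartpoleEnvCfg"},
}


def lookup_cfg_entry_point(task_name: str, entry_point_key: str) -> Any:
    """Return the entry point stored under ``entry_point_key`` for a task."""
    # .get() yields None for a missing key and does not modify the kwargs
    cfg_entry_point = MOCK_REGISTRY[task_name].get(entry_point_key)
    if cfg_entry_point is None:
        raise ValueError(f"No entry point '{entry_point_key}' registered for task '{task_name}'.")
    return cfg_entry_point


print(lookup_cfg_entry_point("Isaac-Cartpole-v0", "env_cfg_entry_point"))
```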

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/wrappers/rl_games.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/wrappers/rl_games.py
@@ -33,7 +33,7 @@ for RL-Games :class:`Runner` class:
 from __future__ import annotations
-import gym
+import gymnasium as gym
 import torch
 from rl_games.common import env_configurations
@@ -49,10 +49,10 @@ Vectorized environment wrapper.
 """
-class RlGamesVecEnvWrapper(gym.Wrapper):
+class RlGamesVecEnvWrapper(IVecEnv):
-    """Wraps around Isaac Orbit environment for RL-Games.
+    """Wraps around Orbit environment for RL-Games.
-    This class wraps around the Isaac Orbit environment. Since RL-Games works directly on
+    This class wraps around the Orbit environment. Since RL-Games works directly on
    GPU buffers, the wrapper handles moving of buffers from the simulation environment
    to the same device as the learning agent. Additionally, it performs clipping of
    observations and actions.
@@ -69,6 +69,13 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
    checks if these attributes exist. If they don't then the wrapper defaults to zero as number
    of privileged observations.
+    .. caution::
+        This class must be the last wrapper in the wrapper chain. This is because the wrapper does not follow
+        the :class:`gym.Wrapper` interface. Any subsequent wrappers will need to be modified to work with this
+        wrapper.
    Reference:
        https://github.com/Denys88/rl_games/blob/master/rl_games/common/ivecenv.py
        https://github.com/NVIDIA-Omniverse/IsaacGymEnvs
@@ -85,30 +92,77 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
        Raises:
            ValueError: The environment is not inherited from :class:`RLTaskEnv`.
+            ValueError: If specified, the privileged observations (critic) are not of type :obj:`gym.spaces.Box`.
        """
        # check that input is valid
        if not isinstance(env.unwrapped, RLTaskEnv):
            raise ValueError(f"The environment must be inherited from RLTaskEnv. Environment type: {type(env)}")
-        # initialize gym wrapper
+        # initialize the wrapper
-        gym.Wrapper.__init__(self, env)
+        self.env = env
-        # initialize rl-games vec-env
-        IVecEnv.__init__(self)
        # store provided arguments
        self._rl_device = rl_device
        self._clip_obs = clip_obs
        self._clip_actions = clip_actions
+        self._sim_device = env.unwrapped.device
        # information about spaces for the wrapper
-        self.observation_space = self.env.observation_space
+        # note: rl-games only wants single observation and action spaces
-        self.action_space = self.env.action_space
+        self.rlg_observation_space = self.unwrapped.single_observation_space["policy"]
+        self.rlg_action_space = self.unwrapped.single_action_space
        # information for privileged observations
-        self.state_space = getattr(self.env, "state_space", None)
+        self.rlg_state_space = self.unwrapped.single_observation_space.get("critic")
-        self.num_states = getattr(self.env, "num_states", 0)
+        if self.rlg_state_space is not None:
-        # print information about wrapper
+            if not isinstance(self.rlg_state_space, gym.spaces.Box):
-        print("[INFO]: RL-Games Environment Wrapper:")
+                raise ValueError(f"Privileged observations must be of type Box. Type: {type(self.rlg_state_space)}")
-        print(f"\t\t Observations clipping: {clip_obs}")
+            self.rlg_num_states = self.rlg_state_space.shape[0]
-        print(f"\t\t Actions clipping     : {clip_actions}")
+        else:
-        print(f"\t\t Agent device         : {rl_device}")
+            self.rlg_num_states = 0
-        print(f"\t\t Asymmetric-learning  : {self.num_states != 0}")
+    def __str__(self):
+        """Returns the wrapper name and the :attr:`env` representation string."""
+        return (
+            f"<{type(self).__name__}{self.env}>"
+            f"\n\tObservations clipping: {self._clip_obs}"
+            f"\n\tActions clipping     : {self._clip_actions}"
+            f"\n\tAgent device         : {self._rl_device}"
+            f"\n\tAsymmetric-learning  : {self.rlg_num_states != 0}"
+        )
+    def __repr__(self):
+        """Returns the string representation of the wrapper."""
+        return str(self)
+    """
+    Properties -- Gym.Wrapper
+    """
+    @property
+    def render_mode(self) -> str | None:
+        """Returns the :attr:`Env` :attr:`render_mode`."""
+        return self.env.render_mode
+    @property
+    def observation_space(self) -> gym.Space:
+        """Returns the :attr:`Env` :attr:`observation_space`."""
+        return self.env.observation_space
+    @property
+    def action_space(self) -> gym.Space:
+        """Returns the :attr:`Env` :attr:`action_space`."""
+        return self.env.action_space
+    @classmethod
+    def class_name(cls) -> str:
+        """Returns the class name of the wrapper."""
+        return cls.__name__
+    @property
+    def unwrapped(self) -> RLTaskEnv:
+        """Returns the base environment of the wrapper.
+        This will be the bare :class:`gymnasium.Env` environment, underneath all layers of wrappers.
+        """
+        return self.env.unwrapped
    """
    Properties
@@ -120,40 +174,46 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
    def get_env_info(self) -> dict:
        """Returns the Gym spaces for the environment."""
-        # fill the env info dict
+        return {
-        env_info = {"observation_space": self.observation_space, "action_space": self.action_space}
+            "observation_space": self.rlg_observation_space,
-        # add information about privileged observations space
+            "action_space": self.rlg_action_space,
-        if self.num_states > 0:
+            "state_space": self.rlg_state_space,
-            env_info["state_space"] = self.state_space
+        }
-        return env_info
    """
    Operations - MDP
    """
+    def seed(self, seed: int = -1) -> int:  # noqa: D102
+        return self.unwrapped.seed(seed)
    def reset(self):  # noqa: D102
-        obs_dict = self.env.reset()
+        obs_dict, _ = self.env.reset()
        # process observations and states
        return self._process_obs(obs_dict)
    def step(self, actions):  # noqa: D102
+        # move actions to sim-device
+        actions = actions.detach().clone().to(device=self._sim_device)
        # clip the actions
-        actions = torch.clamp(actions.clone(), -self._clip_actions, self._clip_actions)
+        actions = torch.clamp(actions, -self._clip_actions, self._clip_actions)
        # perform environment step
-        obs_dict, rew, dones, extras = self.env.step(actions)
+        obs_dict, rew, terminated, truncated, extras = self.env.step(actions)
        # process observations and states
        obs_and_states = self._process_obs(obs_dict)
        # move buffers to rl-device
        # note: we perform clone to prevent issues when rl-device and sim-device are the same.
-        rew = rew.to(self._rl_device)
+        rew = rew.to(device=self._rl_device)
-        dones = dones.to(self._rl_device)
+        dones = (terminated | truncated).to(device=self._rl_device)
        extras = {
            k: v.to(device=self._rl_device, non_blocking=True) if hasattr(v, "to") else v for k, v in extras.items()
        }
        return obs_and_states, rew, dones, extras
+    def close(self):  # noqa: D102
+        return self.env.close()
    """
    Helper functions
    """
@@ -163,34 +223,29 @@ class RlGamesVecEnvWrapper(gym.Wrapper):
        Note:
            States typically refers to privileged observations for the critic function. It is typically used in
-            asymmetric actor-critic algorithms [1].
+            asymmetric actor-critic algorithms.
        Args:
-            obs: The current observations from environment.
+            obs_dict: The current observations from environment.
        Returns:
-            If environment provides states, then a dictionary
+            If environment provides states, then a dictionary containing the observations and states is returned.
-                containing the observations and states is returned. Otherwise just the observations tensor
+            Otherwise just the observations tensor is returned.
-                is returned.
-        Reference:
-            1. Pinto, Lerrel, et al. "Asymmetric actor critic for image-based robot learning."
-               arXiv preprint arXiv:1710.06542 (2017).
        """
        # process policy obs
        obs = obs_dict["policy"]
        # clip the observations
        obs = torch.clamp(obs, -self._clip_obs, self._clip_obs)
        # move the buffer to rl-device
-        obs = obs.to(self._rl_device).clone()
+        obs = obs.to(device=self._rl_device).clone()
        # check if asymmetric actor-critic or not
-        if self.num_states > 0:
+        if self.rlg_num_states > 0:
            # acquire states from the environment if it exists
            try:
                states = obs_dict["critic"]
            except KeyError:
-                raise NotImplementedError("Environment does not define key `critic` for privileged observations.")
+                raise NotImplementedError("Environment does not define key 'critic' for privileged observations.")
            # clip the states
            states = torch.clamp(states, -self._clip_obs, self._clip_obs)
            # move buffers to rl-device
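
The clip-then-split logic of `_process_obs` can be sketched without tensors. The version below uses plain lists of floats instead of `torch` buffers and omits device transfers; the function names and the `"obs"`/`"states"` keys of the returned dictionary are assumptions made for illustration, since the end of the method is not shown in this hunk.

```python
from __future__ import annotations


def clamp(values: list[float], limit: float) -> list[float]:
    """Clip every value to the closed interval [-limit, limit]."""
    return [max(-limit, min(limit, v)) for v in values]


def process_obs(obs_dict: dict[str, list[float]], clip_obs: float, num_states: int):
    """Return clipped policy observations, plus clipped states when num_states > 0."""
    obs = clamp(obs_dict["policy"], clip_obs)
    # symmetric case: only the policy observations are returned
    if num_states == 0:
        return obs
    try:
        states = obs_dict["critic"]
    except KeyError:  # a missing dictionary key raises KeyError
        raise NotImplementedError("Environment does not define key 'critic' for privileged observations.")
    # asymmetric case: observations and privileged states are returned together
    return {"obs": obs, "states": clamp(states, clip_obs)}


print(process_obs({"policy": [-7.0, 0.5, 9.0]}, 5.0, 0))
```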

--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/wrappers/rsl_rl/vecenv_wrapper.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/wrappers/rsl_rl/vecenv_wrapper.py
@@ -17,22 +17,28 @@ The following example shows how to wrap an environment for RSL-RL:
 from __future__ import annotations
-import gym
+import gymnasium as gym
-import gym.spaces
 import torch
+from rsl_rl.env import VecEnv
 from omni.isaac.orbit.envs import RLTaskEnv
-class RslRlVecEnvWrapper(gym.Wrapper):
+class RslRlVecEnvWrapper(VecEnv):
-    """Wraps around Isaac Orbit environment for RSL-RL library
+    """Wraps around Orbit environment for RSL-RL library
+    To use asymmetric actor-critic, the environment instance must have the attribute :attr:`num_privileged_obs` (int).
+    This is used by the learning agent to allocate buffers in the trajectory memory. Additionally, the returned
+    observations should have the key "critic" which corresponds to the privileged observations. Since this is
+    optional for some environments, the wrapper checks whether the "critic" group exists. If it does not, the
+    wrapper defaults to zero as the number of privileged observations.
+    .. caution::
-    To use asymmetric actor-critic, the environment instance must have the attributes :attr:`num_states` (int)
+        This class must be the last wrapper in the wrapper chain. This is because the wrapper does not follow
-    and :attr:`state_space` (:obj:`gym.spaces.Box`). These are used by the learning agent to allocate buffers in
+        the :class:`gym.Wrapper` interface. Any subsequent wrappers will need to be modified to work with this
-    the trajectory memory. Additionally, the method :meth:`_get_observations()` should have the key "critic"
+        wrapper.
-    which corresponds to the privileged observations. Since this is optional for some environments, the wrapper
-    checks if these attributes exist. If they don't then the wrapper defaults to zero as number of privileged
-    observations.
    Reference:
        https://github.com/leggedrobotics/rsl_rl/blob/master/rsl_rl/env/vec_env.py
@@ -41,6 +47,9 @@ class RslRlVecEnvWrapper(gym.Wrapper):
    def __init__(self, env: RLTaskEnv):
        """Initializes the wrapper.
+        Note:
+            The wrapper calls :meth:`reset` at the start since the RSL-RL runner does not call reset.
        Args:
            env: The environment to wrap around.
@@ -51,28 +60,74 @@ class RslRlVecEnvWrapper(gym.Wrapper):
        if not isinstance(env.unwrapped, RLTaskEnv):
            raise ValueError(f"The environment must be inherited from RLTaskEnv. Environment type: {type(env)}")
        # initialize the wrapper
-        gym.Wrapper.__init__(self, env)
+        self.env = env
        # store information required by wrapper
-        orbit_env: RLTaskEnv = self.env.unwrapped
+        self.num_envs = self.unwrapped.num_envs
-        self.num_envs = orbit_env.num_envs
+        self.device = self.unwrapped.device
-        self.num_actions = orbit_env.action_manager.total_action_dim
+        self.max_episode_length = self.unwrapped.max_episode_length
-        self.num_obs = orbit_env.observation_manager.group_obs_dim["policy"][0]
+        self.num_actions = self.unwrapped.action_manager.total_action_dim
+        self.num_obs = self.unwrapped.observation_manager.group_obs_dim["policy"][0]
+        # -- privileged observations
+        if "critic" in self.unwrapped.observation_manager.group_obs_dim:
+            self.num_privileged_obs = self.unwrapped.observation_manager.group_obs_dim["critic"][0]
+        else:
+            self.num_privileged_obs = 0
        # reset at the start since the RSL-RL runner does not call reset
        self.env.reset()
+    def __str__(self):
+        """Returns the wrapper name and the :attr:`env` representation string."""
+        return f"<{type(self).__name__}{self.env}>"
+    def __repr__(self):
+        """Returns the string representation of the wrapper."""
+        return str(self)
+    """
+    Properties -- Gym.Wrapper
+    """
+    @property
+    def render_mode(self) -> str | None:
+        """Returns the :attr:`Env` :attr:`render_mode`."""
+        return self.env.render_mode
+    @property
+    def observation_space(self) -> gym.Space:
+        """Returns the :attr:`Env` :attr:`observation_space`."""
+        return self.env.observation_space
+    @property
+    def action_space(self) -> gym.Space:
+        """Returns the :attr:`Env` :attr:`action_space`."""
+        return self.env.action_space
+    @classmethod
+    def class_name(cls) -> str:
+        """Returns the class name of the wrapper."""
+        return cls.__name__
+    @property
+    def unwrapped(self) -> RLTaskEnv:
+        """Returns the base environment of the wrapper.
+        This will be the bare :class:`gymnasium.Env` environment, underneath all layers of wrappers.
+        """
+        return self.env.unwrapped
    """
    Properties
    """
-    def get_observations(self) -> torch.Tensor:
+    def get_observations(self) -> tuple[torch.Tensor, dict]:
        """Returns the current observations of the environment."""
-        obs_dict = self.env.unwrapped.observation_manager.compute()
+        obs_dict = self.unwrapped.observation_manager.compute()
        return obs_dict["policy"], {"observations": obs_dict}
    @property
    def episode_length_buf(self) -> torch.Tensor:
        """The episode length buffer."""
-        return self.env.unwrapped.episode_length_buf
+        return self.unwrapped.episode_length_buf
    @episode_length_buf.setter
    def episode_length_buf(self, value: torch.Tensor):
@@ -80,22 +135,34 @@ class RslRlVecEnvWrapper(gym.Wrapper):
        Note: This is needed to perform random initialization of episode lengths in RSL-RL.
        """
-        self.env.unwrapped.episode_length_buf = value
+        self.unwrapped.episode_length_buf = value
    """
    Operations - MDP
    """
-    def reset(self) -> tuple[torch.Tensor, dict]:
+    def seed(self, seed: int = -1) -> int:  # noqa: D102
+        return self.unwrapped.seed(seed)
+    def reset(self) -> tuple[torch.Tensor, dict]:  # noqa: D102
        # reset the environment
-        obs_dict = self.env.reset()
+        obs_dict, _ = self.env.reset()
        # return observations
        return obs_dict["policy"], {"observations": obs_dict}
    def step(self, actions: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, dict]:
        # record step information
-        obs_dict, rew, dones, extras = self.env.step(actions)
+        obs_dict, rew, terminated, truncated, extras = self.env.step(actions)
-        # return step information
+        # compute dones for compatibility with RSL-RL
+        dones = (terminated | truncated).to(dtype=torch.long)
+        # move extra observations to the extras dict
        obs = obs_dict["policy"]
        extras["observations"] = obs_dict
+        # move time out information to the extras dict
+        extras["time_outs"] = truncated
+        # return the step information
        return obs, rew, dones, extras
+    def close(self):  # noqa: D102
+        return self.env.close()
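
The `step` change above collapses the Gymnasium five-element result into the four-element tuple the learning library expects. Below is a minimal sketch of that adaptation with plain Python lists standing in for tensors. `StubEnv` and `step_legacy` are hypothetical names; only the `terminated | truncated` merge and the `"observations"` and `"time_outs"` keys are taken from the diff.

```python
from __future__ import annotations


class StubEnv:
    """Hypothetical three-environment stand-in that follows the Gymnasium step signature."""

    def step(self, actions: list[float]):
        obs = {"policy": [[0.0], [1.0], [2.0]]}
        rew = [1.0, 0.5, 0.0]
        terminated = [False, True, False]  # ended by a termination condition
        truncated = [True, False, False]   # ended by the episode time limit
        return obs, rew, terminated, truncated, {}


def step_legacy(env: StubEnv, actions: list[float]):
    """Collapse the five-element step result into (obs, rew, dones, extras)."""
    obs_dict, rew, terminated, truncated, extras = env.step(actions)
    # an environment is done if it either terminated or was truncated
    dones = [int(te or tr) for te, tr in zip(terminated, truncated)]
    extras["observations"] = obs_dict
    # keep the truncation flags so time-outs stay distinguishable from terminations
    extras["time_outs"] = truncated
    return obs_dict["policy"], rew, dones, extras


obs, rew, dones, extras = step_legacy(StubEnv(), [0.0, 0.0, 0.0])
print(dones, extras["time_outs"])
```

Keeping `time_outs` in `extras` preserves the distinction that the merged `dones` flag discards.
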
--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/wrappers/sb3.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/wrappers/sb3.py
--- a/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/wrappers/skrl.py
+++ b/source/extensions/omni.isaac.orbit_tasks/omni/isaac/orbit_tasks/utils/wrappers/skrl.py
@@ -93,9 +93,9 @@ Vectorized environment wrapper.
 def SkrlVecEnvWrapper(env: RLTaskEnv):
-    """Wraps around Isaac Orbit environment for skrl.
+    """Wraps around Orbit environment for skrl.
-    This function wraps around the Isaac Orbit environment. Since the :class:`RLTaskEnv` environment
+    This function wraps around the Orbit environment. Since the :class:`RLTaskEnv` environment
    wrapping functionality is defined within the skrl library itself, this implementation
    is maintained for compatibility with the structure of the extension that contains it.
    Internally it calls the :func:`wrap_env` from the skrl library API.

--- a/source/extensions/omni.isaac.orbit_tasks/setup.py
+++ b/source/extensions/omni.isaac.orbit_tasks/setup.py
@@ -22,18 +22,21 @@ INSTALL_REQUIRES = [
    "numpy",
    "torch",
    "torchvision>=0.14.1",  # ensure compatibility with torch 1.13.1
-    "protobuf==3.20.2",
+    "protobuf>=3.20.2",
    # data collection
    "h5py",
+    # basic logger
+    "tensorboard",
+    # video recording
+    "moviepy",
 ]
 # Extra dependencies for RL agents
 EXTRAS_REQUIRE = {
-    "sb3": ["stable-baselines3>=1.5,<=1.8", "tensorboard"],
+    "sb3": ["stable-baselines3>=2.0"],
    "skrl": ["skrl>=0.10.0"],
-    "rl_games": ["rl-games==1.5.2"],
+    "rl_games": ["rl-games==1.6.1"],
-    # TODO: Uncomment when rsl_rl is updated to public.
+    "rsl_rl": ["rsl_rl@git+https://github.com/leggedrobotics/rsl_rl.git"],
-    # "rsl_rl": ["rsl_rl@git+https://github.com/leggedrobotics/rsl_rl.git"],
    "robomimic": ["robomimic@git+https://github.com/ARISE-Initiative/robomimic.git"],
 }
 # cumulation of all extra-requires
@@ -43,7 +46,7 @@ EXTRAS_REQUIRE["all"] = list(itertools.chain.from_iterable(EXTRAS_REQUIRE.values
 # Installation operation
 setup(
    name="omni-isaac-orbit_tasks",
-    author="NVIDIA, ETH Zurich, and University of Toronto",
+    author="ORBIT Project Developers",
    maintainer="Mayank Mittal",
    maintainer_email="mittalma@ethz.ch",
    url=EXTENSION_TOML_DATA["package"]["repository"],
@@ -55,6 +58,10 @@ setup(
    install_requires=INSTALL_REQUIRES,
    extras_require=EXTRAS_REQUIRE,
    packages=["omni.isaac.orbit_tasks"],
-    classifiers=["Natural Language :: English", "Programming Language :: Python :: 3.7"],
+    classifiers=[
+        "Natural Language :: English",
+        "Programming Language :: Python :: 3.10",
+        "Isaac Sim :: 2023.1.0-hotfix.1",
+    ],
    zip_safe=False,
 )
--- a/source/extensions/omni.isaac.orbit_tasks/test/test_environments.py
+++ b/source/extensions/omni.isaac.orbit_tasks/test/test_environments.py
@@ -20,8 +20,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
-import gym.envs
 import torch
 import traceback
 import unittest
@@ -42,7 +41,7 @@ class TestEnvironments(unittest.TestCase):
    def setUpClass(cls):
        # acquire all Isaac environments names
        cls.registered_tasks = list()
-        for task_spec in gym.envs.registry.all():
+        for task_spec in gym.registry.values():
            if "Isaac" in task_spec.id:
                cls.registered_tasks.append(task_spec.id)
        # sort environments by name
@@ -70,19 +69,20 @@ class TestEnvironments(unittest.TestCase):
            env: RLTaskEnv = gym.make(task_name, cfg=env_cfg)
            # reset environment
-            obs = env.reset()
+            obs, _ = env.reset()
            # check signal
            self.assertTrue(self._check_valid_tensor(obs))
            # simulate environment for 1000 steps
-            for _ in range(1000):
+            with torch.inference_mode():
-                # sample actions from -1 to 1
+                for _ in range(1000):
-                actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1
+                    # sample actions from -1 to 1
-                # apply actions
+                    actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
-                transition = env.step(actions)
+                    # apply actions
-                # check signals
+                    transition = env.step(actions)
-                for data in transition:
+                    # check signals
-                    self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
+                    for data in transition:
+                        self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
            # close the environment
            print(f">>> Closing environment: {task_name}")
@@ -108,9 +108,9 @@ class TestEnvironments(unittest.TestCase):
            valid_tensor = True
            for value in data.values():
                if isinstance(value, dict):
-                    return TestEnvironments._check_valid_tensor(value)
+                    valid_tensor &= TestEnvironments._check_valid_tensor(value)
                elif isinstance(value, torch.Tensor):
-                    valid_tensor = valid_tensor and not torch.any(torch.isnan(value))
+                    valid_tensor &= not torch.any(torch.isnan(value))
            return valid_tensor
        else:
            raise ValueError(f"Input data of invalid type: {type(data)}.")
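
Review note on the `_check_valid_tensor` change above: the old `return` inside the loop stopped at the first nested dictionary, so any entries after it were never checked. The sketch below reproduces that difference with plain floats and `math.isnan` instead of tensors. The function names and the sample data are hypothetical and only illustrate the control flow.

```python
import math


def check_early_return(data: dict) -> bool:
    """Old behaviour: returns as soon as the first nested dict is seen."""
    valid = True
    for value in data.values():
        if isinstance(value, dict):
            return check_early_return(value)  # skips all remaining entries
        elif isinstance(value, float):
            valid = valid and not math.isnan(value)
    return valid


def check_accumulate(data: dict) -> bool:
    """New behaviour: folds every nested result into the running flag."""
    valid = True
    for value in data.values():
        if isinstance(value, dict):
            valid &= check_accumulate(value)
        elif isinstance(value, float):
            valid &= not math.isnan(value)
    return valid


# A NaN that appears after a clean nested dict (made-up keys).
sample = {"policy": {"joint_pos": 0.1}, "critic": float("nan")}
old_result = check_early_return(sample)
new_result = check_accumulate(sample)
```

With this input the early-return version reports the data as valid, while the accumulating version flags the NaN.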

--- a/source/extensions/omni.isaac.orbit_tasks/test/test_record_video.py
+++ b/source/extensions/omni.isaac.orbit_tasks/test/test_record_video.py
@@ -19,7 +19,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import os
 import torch
 import traceback
@@ -42,7 +42,7 @@ class TestRecordVideoWrapper(unittest.TestCase):
    def setUpClass(cls):
        # acquire all Isaac environments names
        cls.registered_tasks = list()
-        for task_spec in gym.envs.registry.all():
+        for task_spec in gym.registry.values():
            if "Isaac" in task_spec.id:
                cls.registered_tasks.append(task_spec.id)
        # sort environments by name
@@ -73,25 +73,24 @@ class TestRecordVideoWrapper(unittest.TestCase):
            env_cfg.sim.shutdown_app_on_stop = False
            # create environment
-            env: RLTaskEnv = gym.make(task_name, cfg=env_cfg)
+            env: RLTaskEnv = gym.make(task_name, cfg=env_cfg, render_mode="rgb_array")
            # directory to save videos
            videos_dir = os.path.join(self.videos_dir, task_name)
            # wrap environment to record videos
            env = gym.wrappers.RecordVideo(
-                env, videos_dir, step_trigger=self.step_trigger, video_length=self.video_length
+                env, videos_dir, step_trigger=self.step_trigger, video_length=self.video_length, disable_logger=True
            )
            # reset environment
            env.reset()
            # simulate environment
-            for _ in range(500):
+            with torch.inference_mode():
-                # compute zero actions
+                for _ in range(500):
-                actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1
+                    # sample actions from -1 to 1
-                # apply actions
+                    actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
-                _ = env.step(actions)
+                    # apply actions
-                # render environment
+                    _ = env.step(actions)
-                env.render(mode="human")
            # close the simulator
            env.close()

--- a/source/extensions/omni.isaac.orbit_tasks/test/wrappers/test_rl_games_wrapper.py
+++ b/source/extensions/omni.isaac.orbit_tasks/test/wrappers/test_rl_games_wrapper.py
+# Copyright (c) 2022-2023, The ORBIT Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+from __future__ import annotations
+"""Launch Isaac Sim Simulator first."""
+import os
+from omni.isaac.orbit.app import AppLauncher
+# launch the simulator
+app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
+app_launcher = AppLauncher(headless=True, experience=app_experience)
+simulation_app = app_launcher.app
+"""Rest everything follows."""
+import gymnasium as gym
+import torch
+import traceback
+import unittest
+import carb
+import omni.usd
+from omni.isaac.orbit.envs import RLTaskEnvCfg
+import omni.isaac.orbit_tasks  # noqa: F401
+from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
+from omni.isaac.orbit_tasks.utils.wrappers.rl_games import RlGamesVecEnvWrapper
+class TestRlGamesVecEnvWrapper(unittest.TestCase):
+    """Test that RL-Games VecEnv wrapper works as expected."""
+    @classmethod
+    def setUpClass(cls):
+        # acquire all Isaac environments names
+        cls.registered_tasks = list()
+        for task_spec in gym.registry.values():
+            if "Isaac" in task_spec.id:
+                cls.registered_tasks.append(task_spec.id)
+        # sort environments by name
+        cls.registered_tasks.sort()
+        # only pick the first three environments to test
+        cls.registered_tasks = cls.registered_tasks[:3]
+        # print all existing task names
+        print(">>> All registered environments:", cls.registered_tasks)
+    def setUp(self) -> None:
+        # common parameters
+        self.num_envs = 512
+        self.use_gpu = True
+    def test_random_actions(self):
+        """Run random actions and check environments return valid signals."""
+        for task_name in self.registered_tasks:
+            print(f">>> Running test for environment: {task_name}")
+            # create a new stage
+            omni.usd.get_context().new_stage()
+            # parse configuration
+            env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
+            # note: we don't want to shutdown the app on stop during the tests since we reload the stage
+            env_cfg.sim.shutdown_app_on_stop = False
+            # create environment
+            env = gym.make(task_name, cfg=env_cfg)
+            # wrap environment
+            env = RlGamesVecEnvWrapper(env, "cuda:0", 100, 100)
+            # reset environment
+            obs = env.reset()
+            # check signal
+            self.assertTrue(self._check_valid_tensor(obs))
+            # simulate environment for 100 steps
+            with torch.inference_mode():
+                for _ in range(100):
+                    # sample actions from -1 to 1
+                    actions = 2 * torch.rand(env.action_space.shape, device=env.device) - 1
+                    # apply actions
+                    transition = env.step(actions)
+                    # check signals
+                    for data in transition:
+                        self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
+            # close the environment
+            print(f">>> Closing environment: {task_name}")
+            env.close()
+    """
+    Helper functions.
+    """
+    @staticmethod
+    def _check_valid_tensor(data: torch.Tensor | dict) -> bool:
+        """Checks if given data does not have corrupted values.
+        Args:
+            data: Data buffer.
+        Returns:
+            True if the data is valid.
+        """
+        if isinstance(data, torch.Tensor):
+            return not torch.any(torch.isnan(data))
+        elif isinstance(data, dict):
+            valid_tensor = True
+            for value in data.values():
+                if isinstance(value, dict):
+                    valid_tensor &= TestRlGamesVecEnvWrapper._check_valid_tensor(value)
+                elif isinstance(value, torch.Tensor):
+                    valid_tensor &= not torch.any(torch.isnan(value))
+            return valid_tensor
+        else:
+            raise ValueError(f"Input data of invalid type: {type(data)}.")
+if __name__ == "__main__":
+    try:
+        unittest.main()
+    except Exception as err:
+        carb.log_error(err)
+        carb.log_error(traceback.format_exc())
+        raise
+    finally:
+        # close sim app
+        simulation_app.close()
--- a/source/extensions/omni.isaac.orbit_tasks/test/wrappers/test_rsl_rl_wrapper.py
+++ b/source/extensions/omni.isaac.orbit_tasks/test/wrappers/test_rsl_rl_wrapper.py
+# Copyright (c) 2022-2023, The ORBIT Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+from __future__ import annotations
+"""Launch Isaac Sim Simulator first."""
+import os
+from omni.isaac.orbit.app import AppLauncher
+# launch the simulator
+app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
+app_launcher = AppLauncher(headless=True, experience=app_experience)
+simulation_app = app_launcher.app
+"""Rest everything follows."""
+import gymnasium as gym
+import torch
+import traceback
+import unittest
+import carb
+import omni.usd
+from omni.isaac.orbit.envs import RLTaskEnvCfg
+import omni.isaac.orbit_tasks  # noqa: F401
+from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
+from omni.isaac.orbit_tasks.utils.wrappers.rsl_rl import RslRlVecEnvWrapper
+class TestRslRlVecEnvWrapper(unittest.TestCase):
+    """Test that RSL-RL VecEnv wrapper works as expected."""
+    @classmethod
+    def setUpClass(cls):
+        # acquire all Isaac environments names
+        cls.registered_tasks = list()
+        for task_spec in gym.registry.values():
+            if "Isaac" in task_spec.id:
+                cls.registered_tasks.append(task_spec.id)
+        # sort environments by name
+        cls.registered_tasks.sort()
+        # only pick the first three environments to test
+        cls.registered_tasks = cls.registered_tasks[:3]
+        # print all existing task names
+        print(">>> All registered environments:", cls.registered_tasks)
+    def setUp(self) -> None:
+        # common parameters
+        self.num_envs = 512
+        self.use_gpu = True
+    def test_random_actions(self):
+        """Run random actions and check environments return valid signals."""
+        for task_name in self.registered_tasks:
+            print(f">>> Running test for environment: {task_name}")
+            # create a new stage
+            omni.usd.get_context().new_stage()
+            # parse configuration
+            env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
+            # note: we don't want to shutdown the app on stop during the tests since we reload the stage
+            env_cfg.sim.shutdown_app_on_stop = False
+            # create environment
+            env = gym.make(task_name, cfg=env_cfg)
+            # wrap environment
+            env = RslRlVecEnvWrapper(env)
+            # reset environment
+            obs, extras = env.reset()
+            # check signal
+            self.assertTrue(self._check_valid_tensor(obs))
+            self.assertTrue(self._check_valid_tensor(extras))
+            # simulate environment for 1000 steps
+            with torch.inference_mode():
+                for _ in range(1000):
+                    # sample actions from -1 to 1
+                    actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
+                    # apply actions
+                    transition = env.step(actions)
+                    # check signals
+                    for data in transition:
+                        self.assertTrue(self._check_valid_tensor(data), msg=f"Invalid data: {data}")
+            # close the environment
+            print(f">>> Closing environment: {task_name}")
+            env.close()
+    """
+    Helper functions.
+    """
+    @staticmethod
+    def _check_valid_tensor(data: torch.Tensor | dict) -> bool:
+        """Checks if given data does not have corrupted values.
+        Args:
+            data: Data buffer.
+        Returns:
+            True if the data is valid.
+        """
+        if isinstance(data, torch.Tensor):
+            return not torch.any(torch.isnan(data))
+        elif isinstance(data, dict):
+            valid_tensor = True
+            for value in data.values():
+                if isinstance(value, dict):
+                    valid_tensor &= TestRslRlVecEnvWrapper._check_valid_tensor(value)
+                elif isinstance(value, torch.Tensor):
+                    valid_tensor &= not torch.any(torch.isnan(value))
+            return valid_tensor
+        else:
+            raise ValueError(f"Input data of invalid type: {type(data)}.")
+if __name__ == "__main__":
+    try:
+        unittest.main()
+    except Exception as err:
+        carb.log_error(err)
+        carb.log_error(traceback.format_exc())
+        raise
+    finally:
+        # close sim app
+        simulation_app.close()
--- a/source/extensions/omni.isaac.orbit_tasks/test/wrappers/test_sb3_wrapper.py
+++ b/source/extensions/omni.isaac.orbit_tasks/test/wrappers/test_sb3_wrapper.py
+# Copyright (c) 2022-2023, The ORBIT Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+from __future__ import annotations
+"""Launch Isaac Sim Simulator first."""
+import os
+from omni.isaac.orbit.app import AppLauncher
+# launch the simulator
+app_experience = f"{os.environ['EXP_PATH']}/omni.isaac.sim.python.gym.headless.kit"
+app_launcher = AppLauncher(headless=True, experience=app_experience)
+simulation_app = app_launcher.app
+"""Rest everything follows."""
+import gymnasium as gym
+import numpy as np
+import torch
+import traceback
+import unittest
+import carb
+import omni.usd
+from omni.isaac.orbit.envs import RLTaskEnvCfg
+import omni.isaac.orbit_tasks  # noqa: F401
+from omni.isaac.orbit_tasks.utils.parse_cfg import parse_env_cfg
+from omni.isaac.orbit_tasks.utils.wrappers.sb3 import Sb3VecEnvWrapper
+class TestStableBaselines3VecEnvWrapper(unittest.TestCase):
+    """Test that Stable-Baselines3 VecEnv wrapper works as expected."""
+    @classmethod
+    def setUpClass(cls):
+        # acquire all Isaac environments names
+        cls.registered_tasks = list()
+        for task_spec in gym.registry.values():
+            if "Isaac" in task_spec.id:
+                cls.registered_tasks.append(task_spec.id)
+        # sort environments by name
+        cls.registered_tasks.sort()
+        # only pick the first three environments to test
+        cls.registered_tasks = cls.registered_tasks[:3]
+        # print all existing task names
+        print(">>> All registered environments:", cls.registered_tasks)
+    def setUp(self) -> None:
+        # common parameters
+        self.num_envs = 512
+        self.use_gpu = True
+    def test_random_actions(self):
+        """Run random actions and check environments return valid signals."""
+        for task_name in self.registered_tasks:
+            print(f">>> Running test for environment: {task_name}")
+            # create a new stage
+            omni.usd.get_context().new_stage()
+            # parse configuration
+            env_cfg: RLTaskEnvCfg = parse_env_cfg(task_name, use_gpu=self.use_gpu, num_envs=self.num_envs)
+            # note: we don't want to shutdown the app on stop during the tests since we reload the stage
+            env_cfg.sim.shutdown_app_on_stop = False
+            # create environment
+            env = gym.make(task_name, cfg=env_cfg)
+            # wrap environment
+            env = Sb3VecEnvWrapper(env)
+            # reset environment
+            obs = env.reset()
+            # check signal
+            self.assertTrue(self._check_valid_array(obs))
+            # simulate environment for 1000 steps
+            with torch.inference_mode():
+                for _ in range(1000):
+                    # sample actions from -1 to 1
+                    actions = 2 * np.random.rand(env.num_envs, *env.action_space.shape) - 1
+                    # apply actions
+                    transition = env.step(actions)
+                    # check signals
+                    for data in transition:
+                        self.assertTrue(self._check_valid_array(data), msg=f"Invalid data: {data}")
+            # close the environment
+            print(f">>> Closing environment: {task_name}")
+            env.close()
+    """
+    Helper functions.
+    """
+    @staticmethod
+    def _check_valid_array(data: np.ndarray | dict | list) -> bool:
+        """Checks if given data does not have corrupted values.
+        Args:
+            data: Data buffer.
+        Returns:
+            True if the data is valid.
+        """
+        if isinstance(data, np.ndarray):
+            return not np.any(np.isnan(data))
+        elif isinstance(data, dict):
+            valid_array = True
+            for value in data.values():
+                if isinstance(value, dict):
+                    valid_array &= TestStableBaselines3VecEnvWrapper._check_valid_array(value)
+                elif isinstance(value, np.ndarray):
+                    valid_array &= not np.any(np.isnan(value))
+            return valid_array
+        elif isinstance(data, list):
+            valid_array = True
+            for value in data:
+                valid_array &= TestStableBaselines3VecEnvWrapper._check_valid_array(value)
+            return valid_array
+        else:
+            raise ValueError(f"Input data of invalid type: {type(data)}.")
+if __name__ == "__main__":
+    try:
+        unittest.main()
+    except Exception as err:
+        carb.log_error(err)
+        carb.log_error(traceback.format_exc())
+        raise
+    finally:
+        # close sim app
+        simulation_app.close()
--- a/source/standalone/environments/list_envs.py
+++ b/source/standalone/environments/list_envs.py
@@ -27,7 +27,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 from prettytable import PrettyTable
 import omni.isaac.contrib_tasks  # noqa: F401
@@ -47,10 +47,10 @@ def main():
    # count of environments
    index = 0
    # acquire all Isaac environments names
-    for task_spec in gym.envs.registry.all():
+    for task_spec in gym.registry.values():
        if "Isaac" in task_spec.id:
            # add details to table
-            table.add_row([index + 1, task_spec.id, task_spec.entry_point, task_spec._kwargs["env_cfg_entry_point"]])
+            table.add_row([index + 1, task_spec.id, task_spec.entry_point, task_spec.kwargs["env_cfg_entry_point"]])
            # increment count
            index += 1
@@ -61,6 +61,8 @@ if __name__ == "__main__":
    try:
        # run the main function
        main()
+    except Exception as e:
+        raise e
    finally:
        # close the app
        simulation_app.close()
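
Review note on the registry change above: the loop now iterates `gym.registry.values()` and reads the public `kwargs` attribute of each spec instead of `_kwargs`. A minimal sketch of that access pattern follows, using a plain dictionary and a dataclass as stand-ins. `SpecStandIn`, the task ids, and the entry points are all hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class SpecStandIn:
    """Hypothetical stand-in exposing the three fields the script reads."""

    id: str
    entry_point: str
    kwargs: dict = field(default_factory=dict)


# Stand-in for a registry that maps task id -> spec (all entries are made up).
registry = {
    "CartPole-v1": SpecStandIn("CartPole-v1", "pkg.classic:CartPoleEnv"),
    "Isaac-Demo-v0": SpecStandIn(
        "Isaac-Demo-v0", "pkg.envs:DemoEnv", {"env_cfg_entry_point": "pkg.envs:DemoEnvCfg"}
    ),
}

rows = []
for task_spec in registry.values():
    if "Isaac" in task_spec.id:
        # same columns as the table row built in list_envs.py
        rows.append([len(rows) + 1, task_spec.id, task_spec.entry_point, task_spec.kwargs["env_cfg_entry_point"]])
```

Only the entry whose id contains "Isaac" ends up in `rows`, mirroring the filter in the script.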
--- a/source/standalone/environments/random_agent.py
+++ b/source/standalone/environments/random_agent.py
@@ -15,7 +15,7 @@ import argparse
 from omni.isaac.orbit.app import AppLauncher
 # add argparse arguments
-parser = argparse.ArgumentParser(description="Random agent for Isaac Orbit environments.")
+parser = argparse.ArgumentParser(description="Random agent for Orbit environments.")
 parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
 parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
 parser.add_argument("--task", type=str, default=None, help="Name of the task.")
@@ -31,7 +31,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import torch
 import traceback
@@ -43,12 +43,15 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
 def main():
-    """Random actions agent with Isaac Orbit environment."""
+    """Random actions agent with Orbit environment."""
    # parse configuration
    env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=args_cli.num_envs)
    # create environment
    env = gym.make(args_cli.task, cfg=env_cfg)
+    # print info (this is a vectorized environment)
+    print(f"[INFO]: Gym observation space: {env.observation_space}")
+    print(f"[INFO]: Gym action space: {env.action_space}")
    # reset environment
    env.reset()
    # simulate environment
@@ -56,9 +59,9 @@ def main():
        # run everything in inference mode
        with torch.inference_mode():
            # sample actions from -1 to 1
-            actions = 2 * torch.rand((env.num_envs, env.action_space.shape[0]), device=env.device) - 1
+            actions = 2 * torch.rand(env.action_space.shape, device=env.unwrapped.device) - 1
            # apply actions
-            _, _, _, _ = env.step(actions)
+            env.step(actions)
    # close the simulator
    env.close()
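
Review note on the action sampling above: the new call passes `env.action_space.shape` directly, which suggests the action space is expected to include the `num_envs` batch dimension already. The sketch below shows the shape bookkeeping without torch. `BoxStandIn` and `sample_uniform` are hypothetical helpers, and the sizes are made up.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass
class BoxStandIn:
    """Hypothetical stand-in for a batched action space; only `shape` is modelled."""

    shape: tuple[int, ...]


def sample_uniform(shape: tuple[int, ...]) -> list:
    """Builds a nested list of the given shape with values in [-1, 1)."""
    if len(shape) == 1:
        return [2 * random.random() - 1 for _ in range(shape[0])]
    return [sample_uniform(shape[1:]) for _ in range(shape[0])]


num_envs, action_dim = 4, 3
# Assumption: the batched space reports (num_envs, action_dim) as its shape.
action_space = BoxStandIn(shape=(num_envs, action_dim))
actions = sample_uniform(action_space.shape)
```

Under that assumption there is no need to prepend `num_envs` by hand, which is what the removed line did.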

--- a/source/standalone/environments/state_machine/play_lift.py
+++ b/source/standalone/environments/state_machine/play_lift.py
@@ -36,7 +36,7 @@ simulation_app = app_launcher.app
 """Rest everything else."""
-import gym
+import gymnasium as gym
 import torch
 import traceback
 from enum import Enum

--- a/source/standalone/environments/teleoperation/teleop_se3_agent.py
+++ b/source/standalone/environments/teleoperation/teleop_se3_agent.py
@@ -15,7 +15,7 @@ import argparse
 from omni.isaac.orbit.app import AppLauncher
 # add argparse arguments
-parser = argparse.ArgumentParser(description="Keyboard teleoperation for Isaac Orbit environments.")
+parser = argparse.ArgumentParser(description="Keyboard teleoperation for Orbit environments.")
 parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
 parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
 parser.add_argument("--device", type=str, default="keyboard", help="Device for interacting with environment")
@@ -33,7 +33,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import torch
 import traceback

--- a/source/standalone/environments/zero_agent.py
+++ b/source/standalone/environments/zero_agent.py
@@ -15,7 +15,7 @@ import argparse
 from omni.isaac.orbit.app import AppLauncher
 # add argparse arguments
-parser = argparse.ArgumentParser(description="Zero agent for Isaac Orbit environments.")
+parser = argparse.ArgumentParser(description="Zero agent for Orbit environments.")
 parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
 parser.add_argument("--num_envs", type=int, default=None, help="Number of environments to simulate.")
 parser.add_argument("--task", type=str, default=None, help="Name of the task.")
@@ -30,7 +30,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import torch
 import traceback
@@ -42,12 +42,15 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
 def main():
-    """Zero actions agent with Isaac Orbit environment."""
+    """Zero actions agent with Orbit environment."""
    # parse configuration
    env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=args_cli.num_envs)
    # create environment
    env = gym.make(args_cli.task, cfg=env_cfg)
+    # print info (this is a vectorized environment)
+    print(f"[INFO]: Gym observation space: {env.observation_space}")
+    print(f"[INFO]: Gym action space: {env.action_space}")
    # reset environment
    env.reset()
    # simulate environment
@@ -55,9 +58,9 @@ def main():
        # run everything in inference mode
        with torch.inference_mode():
            # compute zero actions
-            actions = torch.zeros((env.num_envs, env.action_space.shape[0]), device=env.device)
+            actions = torch.zeros(env.action_space.shape, device=env.unwrapped.device)
            # apply actions
-            _, _, _, _ = env.step(actions)
+            env.step(actions)
    # close the simulator
    env.close()
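
Review note on the `env.step` change above: the four-way unpacking was dropped because Gymnasium's `step` returns five values (observation, reward, terminated, truncated, info) and `reset` returns an (observation, info) pair. A minimal sketch of a loop written against those signatures follows. `ToyEnv` is a hypothetical pure-Python class, not an Orbit environment.

```python
from __future__ import annotations


class ToyEnv:
    """Hypothetical environment following the Gymnasium reset/step signatures."""

    def __init__(self, max_steps: int = 3):
        self.max_steps = max_steps
        self.count = 0

    def reset(self, seed: int | None = None) -> tuple[float, dict]:
        self.count = 0
        return 0.0, {}

    def step(self, action: float) -> tuple[float, float, bool, bool, dict]:
        self.count += 1
        obs = float(self.count)
        reward = -abs(action)
        terminated = False  # task-defined end of episode (never happens here)
        truncated = self.count >= self.max_steps  # time limit reached
        return obs, reward, terminated, truncated, {}


env = ToyEnv()
obs, info = env.reset()
steps = 0
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(0.0)
    done = terminated or truncated
    steps += 1
```

Callers that still need a single `done` flag can combine `terminated` and `truncated` as shown.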

--- a/source/standalone/workflows/rl_games/play.py
+++ b/source/standalone/workflows/rl_games/play.py
@@ -37,7 +37,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import math
 import os
 import torch

--- a/source/standalone/workflows/rl_games/train.py
+++ b/source/standalone/workflows/rl_games/train.py
@@ -41,7 +41,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import math
 import os
 import traceback
@@ -96,13 +96,14 @@ def main():
    clip_actions = agent_cfg["params"]["env"].get("clip_actions", math.inf)
    # create isaac environment
-    env = gym.make(args_cli.task, cfg=env_cfg)
+    env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
    # wrap for video recording
    if args_cli.video:
        video_kwargs = {
            "video_folder": os.path.join(log_dir, "videos"),
            "step_trigger": lambda step: step % args_cli.video_interval == 0,
            "video_length": args_cli.video_length,
+            "disable_logger": True,
        }
        print("[INFO] Recording videos during training.")
        print_dict(video_kwargs, nesting=4)
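
Review note on `video_kwargs` above: `step_trigger` is a predicate on the step count, so recordings start at multiples of the video interval. The sketch below evaluates the same predicate for made-up values of `--video_interval` and `--video_length`. The window computation assumes each recording spans `video_length` steps from its trigger step.

```python
video_interval = 2000  # hypothetical value for --video_interval
video_length = 200  # hypothetical value for --video_length


def step_trigger(step: int) -> bool:
    """Same predicate as the lambda stored in `video_kwargs`."""
    return step % video_interval == 0


# Steps within the first 5000 at which a new recording would be requested.
trigger_steps = [step for step in range(5000) if step_trigger(step)]
# Assumed recording windows as (first step, one past the last step).
windows = [(start, start + video_length) for start in trigger_steps]
```

Choosing `video_length` smaller than `video_interval` keeps the assumed windows from overlapping.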

--- a/source/standalone/workflows/robomimic/collect_demonstrations.py
+++ b/source/standalone/workflows/robomimic/collect_demonstrations.py
@@ -3,7 +3,7 @@
 #
 # SPDX-License-Identifier: BSD-3-Clause
-"""Script to collect demonstrations with Isaac Orbit environments."""
+"""Script to collect demonstrations with Orbit environments."""
 from __future__ import annotations
@@ -15,7 +15,7 @@ import argparse
 from omni.isaac.orbit.app import AppLauncher
 # add argparse arguments
-parser = argparse.ArgumentParser(description="Collect demonstrations for Isaac Orbit environments.")
+parser = argparse.ArgumentParser(description="Collect demonstrations for Orbit environments.")
 parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
 parser.add_argument("--num_envs", type=int, default=1, help="Number of environments to simulate.")
 parser.add_argument("--task", type=str, default=None, help="Name of the task.")
@@ -35,7 +35,7 @@ simulation_app = app_launcher.app
 import contextlib
-import gym
+import gymnasium as gym
 import os
 import torch
 import traceback

--- a/source/standalone/workflows/robomimic/play.py
+++ b/source/standalone/workflows/robomimic/play.py
@@ -15,7 +15,7 @@ import argparse
 from omni.isaac.orbit.app import AppLauncher
 # add argparse arguments
-parser = argparse.ArgumentParser(description="Play policy trained using robomimic for Isaac Orbit environments.")
+parser = argparse.ArgumentParser(description="Play policy trained using robomimic for Orbit environments.")
 parser.add_argument("--cpu", action="store_true", default=False, help="Use CPU pipeline.")
 parser.add_argument("--task", type=str, default=None, help="Name of the task.")
 parser.add_argument("--checkpoint", type=str, default=None, help="Pytorch model checkpoint to load.")
@@ -31,7 +31,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import torch
 import traceback
@@ -46,7 +46,7 @@ from omni.isaac.orbit_tasks.utils import parse_env_cfg
 def main():
-    """Run a trained policy from robomimic with Isaac Orbit environment."""
+    """Run a trained policy from robomimic with Orbit environment."""
    # parse configuration
    env_cfg = parse_env_cfg(args_cli.task, use_gpu=not args_cli.cpu, num_envs=1)
    # modify configuration

--- a/source/standalone/workflows/robomimic/train.py
+++ b/source/standalone/workflows/robomimic/train.py
@@ -54,7 +54,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
 import argparse
-import gym
+import gymnasium as gym
 import json
 import numpy as np
 import os

--- a/source/standalone/workflows/rsl_rl/play.py
+++ b/source/standalone/workflows/rsl_rl/play.py
@@ -36,7 +36,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import os
 import torch
 import traceback

--- a/source/standalone/workflows/rsl_rl/train.py
+++ b/source/standalone/workflows/rsl_rl/train.py
@@ -47,7 +47,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import os
 import torch
 import traceback
@@ -88,13 +88,14 @@ def main():
    log_dir = os.path.join(log_root_path, log_dir)
    # create isaac environment
-    env = gym.make(args_cli.task, cfg=env_cfg)
+    env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
    # wrap for video recording
    if args_cli.video:
        video_kwargs = {
            "video_folder": os.path.join(log_dir, "videos"),
            "step_trigger": lambda step: step % args_cli.video_interval == 0,
            "video_length": args_cli.video_length,
+            "disable_logger": True,
        }
        print("[INFO] Recording videos during training.")
        print_dict(video_kwargs, nesting=4)

--- a/source/standalone/workflows/sb3/play.py
+++ b/source/standalone/workflows/sb3/play.py
@@ -33,7 +33,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import torch
 import traceback

--- a/source/standalone/workflows/sb3/train.py
+++ b/source/standalone/workflows/sb3/train.py
@@ -43,7 +43,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import os
 import traceback
 from datetime import datetime
@@ -95,6 +95,7 @@ def main():
            "video_folder": os.path.join(log_dir, "videos"),
            "step_trigger": lambda step: step % args_cli.video_interval == 0,
            "video_length": args_cli.video_length,
+            "disable_logger": True,
        }
        print("[INFO] Recording videos during training.")
        print_dict(video_kwargs, nesting=4)

--- a/source/standalone/workflows/skrl/play.py
+++ b/source/standalone/workflows/skrl/play.py
@@ -38,7 +38,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import torch
 import traceback

--- a/source/standalone/workflows/skrl/train.py
+++ b/source/standalone/workflows/skrl/train.py
@@ -48,7 +48,7 @@ simulation_app = app_launcher.app
 """Rest everything follows."""
-import gym
+import gymnasium as gym
 import traceback
 from datetime import datetime
@@ -97,13 +97,14 @@ def main():
    dump_pickle(os.path.join(log_dir, "params", "agent.pkl"), experiment_cfg)
    # create isaac environment
-    env = gym.make(args_cli.task, cfg=env_cfg)
+    env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None)
    # wrap for video recording
    if args_cli.video:
        video_kwargs = {
            "video_folder": os.path.join(log_dir, "videos"),
            "step_trigger": lambda step: step % args_cli.video_interval == 0,
            "video_length": args_cli.video_length,
+            "disable_logger": True,
        }
        print("[INFO] Recording videos during training.")
        print_dict(video_kwargs, nesting=4)