Commit 3ef7e678, authored by Mayank Mittal, committed by GitHub

Cleans up the `omni.isaac.lab.envs` submodule (#548)

# Description

Earlier, it was unclear where the environment classes and their
corresponding configuration classes belonged inside the `omni.isaac.lab.envs` module.
This MR reorganizes the module to ensure parity between each class and
its respective configuration class. The MR also fixes docstrings in an
effort to make things cleaner.

## Type of change

- Bug fix (non-breaking change which fixes an issue)
- Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- This change requires a documentation update

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have run all the tests with `./isaaclab.sh --test` and they pass
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
parent 59493b89
...@@ -38,7 +38,7 @@ repos: ...@@ -38,7 +38,7 @@ repos:
- id: pyupgrade - id: pyupgrade
args: ["--py310-plus"] args: ["--py310-plus"]
# FIXME: This is a hack because Pytorch does not like: torch.Tensor | dict aliasing # FIXME: This is a hack because Pytorch does not like: torch.Tensor | dict aliasing
exclude: "source/extensions/omni.isaac.lab/omni/isaac/lab/envs/types.py" exclude: "source/extensions/omni.isaac.lab/omni/isaac/lab/envs/common.py"
- repo: https://github.com/codespell-project/codespell - repo: https://github.com/codespell-project/codespell
rev: v2.2.6 rev: v2.2.6
hooks: hooks:
......
...@@ -69,7 +69,7 @@ Table of Contents ...@@ -69,7 +69,7 @@ Table of Contents
:maxdepth: 2 :maxdepth: 2
:caption: Features :caption: Features
source/features/workflows source/features/task_workflows
source/features/multi_gpu source/features/multi_gpu
source/features/tiled_rendering source/features/tiled_rendering
source/features/environments source/features/environments
......
...@@ -16,11 +16,11 @@ ...@@ -16,11 +16,11 @@
ManagerBasedEnv ManagerBasedEnv
ManagerBasedEnvCfg ManagerBasedEnvCfg
ViewerCfg
ManagerBasedRLEnv ManagerBasedRLEnv
ManagerBasedRLEnvCfg ManagerBasedRLEnvCfg
DirectRLEnv DirectRLEnv
DirectRLEnvCfg DirectRLEnvCfg
ViewerCfg
Manager Based Environment Manager Based Environment
------------------------- -------------------------
...@@ -32,10 +32,6 @@ Manager Based Environment ...@@ -32,10 +32,6 @@ Manager Based Environment
:members: :members:
:exclude-members: __init__, class_type :exclude-members: __init__, class_type
.. autoclass:: ViewerCfg
:members:
:exclude-members: __init__
Manager Based RL Environment Manager Based RL Environment
---------------------------- ----------------------------
...@@ -63,3 +59,10 @@ Direct RL Environment ...@@ -63,3 +59,10 @@ Direct RL Environment
:inherited-members: :inherited-members:
:show-inheritance: :show-inheritance:
:exclude-members: __init__, class_type :exclude-members: __init__, class_type
Common
------
.. autoclass:: ViewerCfg
:members:
:exclude-members: __init__
.. _feature-workflows:
Task Design Workflows
=====================
.. currentmodule:: omni.isaac.lab
Environments define the interface between the agent and the simulation. In the simplest case, the environment provides
the agent with the current observations and executes the actions provided by the agent. In a Markov Decision Process
(MDP) formulation, the environment can also provide additional information such as the current reward, done flag, and
information about the current episode.
While the environment interface is simple to understand, its implementation can vary significantly depending on the
complexity of the task. In the context of reinforcement learning (RL), the environment implementation can be broken down
into several components, such as the reward function, observation function, termination function, and reset function.
Each of these components can be implemented in different ways depending on the complexity of the task and the desired
level of modularity.
We provide two different workflows for designing environments with the framework:
* **Manager-based**: The environment is decomposed into individual components (or managers) that handle different
aspects of the environment (such as computing observations, applying actions, and applying randomization). The
user defines configuration classes for each component and the environment is responsible for coordinating the
managers and calling their functions.
* **Direct**: The user defines a single class that implements the entire environment directly without the need for
separate managers. This class is responsible for computing observations, applying actions, and computing rewards.
Both workflows have their own advantages and disadvantages. The manager-based workflow is more modular and allows
different components of the environment to be swapped out easily. This is useful when prototyping the environment
and experimenting with different configurations. On the other hand, the direct workflow is more efficient and allows
for more fine-grained control over the environment logic. This is useful when optimizing the environment for performance
or when implementing complex logic that is difficult to decompose into separate components.
Manager-Based Environments
--------------------------
A majority of environment implementations follow a similar structure. The environment processes the input actions,
steps through the simulation, computes observations and reward signals, applies randomization, and resets the terminated
environments. Motivated by this, the environment can be decomposed into individual components that handle each of these tasks.
For example, the observation manager is responsible for computing the observations, the reward manager is responsible for
computing the rewards, and the termination manager is responsible for computing the termination signal. This approach
is known as the manager-based environment design in the framework.
Manager-based environments promote modular implementations of tasks by decomposing the task into individual
components that are managed by separate classes. Each component of the task, such as rewards, observations, and
terminations, can be specified as an individual configuration class that is then passed to the corresponding
manager class. The manager is then responsible for parsing the configuration and processing the contents specified
in it.
The coordination between the different managers is orchestrated by the class :class:`envs.ManagerBasedRLEnv`.
It takes in a task configuration class instance (:class:`envs.ManagerBasedRLEnvCfg`) that contains the configurations
for each of the components of the task. Based on the configurations, the scene is set up and the task is initialized.
Afterwards, while stepping through the environment, all the managers are called sequentially to perform the necessary
operations.
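The coordination described above can be sketched in plain Python. This is only an illustration of the idea, not the Isaac Lab API: every name in it (``ToyRewardTermCfg``, ``ToyRewardManager``, ``ToyObservationManager``, ``ToyManagerBasedEnv``) is hypothetical, and the "simulation" is a single dictionary update.

```python
from dataclasses import dataclass, field
from typing import Callable

# NOTE: all names here are hypothetical stand-ins that mimic the idea of the
# manager-based workflow. None of them belong to the Isaac Lab API.


@dataclass
class ToyRewardTermCfg:
    """One reward term: a function of the state and its weight."""

    func: Callable[[dict], float]
    weight: float


@dataclass
class ToyRewardsCfg:
    """Configuration holding all reward terms by name."""

    terms: dict[str, ToyRewardTermCfg] = field(default_factory=dict)


class ToyRewardManager:
    """Parses the reward configuration and sums the weighted terms."""

    def __init__(self, cfg: ToyRewardsCfg):
        self.cfg = cfg

    def compute(self, state: dict) -> float:
        return sum(term.weight * term.func(state) for term in self.cfg.terms.values())


class ToyObservationManager:
    """Collects the configured state entries into an observation list."""

    def __init__(self, keys: list[str]):
        self.keys = keys

    def compute(self, state: dict) -> list[float]:
        return [state[key] for key in self.keys]


class ToyManagerBasedEnv:
    """Owns the managers and calls them one after another in ``step``."""

    def __init__(self, rewards_cfg: ToyRewardsCfg, obs_keys: list[str]):
        self.state = {"pole_angle": 0.1, "cart_vel": 0.0}
        self.reward_manager = ToyRewardManager(rewards_cfg)
        self.observation_manager = ToyObservationManager(obs_keys)

    def step(self, action: float) -> tuple[list[float], float]:
        # Stand-in for applying the action and stepping the physics.
        self.state["cart_vel"] += action
        # The managers are called sequentially by the environment.
        reward = self.reward_manager.compute(self.state)
        obs = self.observation_manager.compute(self.state)
        return obs, reward


rewards_cfg = ToyRewardsCfg(
    terms={
        "alive": ToyRewardTermCfg(func=lambda s: 1.0, weight=1.0),
        "pole_pos": ToyRewardTermCfg(func=lambda s: s["pole_angle"] ** 2, weight=-1.0),
    }
)
env = ToyManagerBasedEnv(rewards_cfg, obs_keys=["pole_angle", "cart_vel"])
obs, reward = env.step(0.5)
```

In this sketch, changing the task means editing ``rewards_cfg`` or ``obs_keys`` only; the environment class itself stays untouched, which is the property the manager-based workflow is built around.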
For their own tasks, users are expected to mainly define the task configuration class and use the existing
:class:`envs.ManagerBasedRLEnv` class for the task implementation. The task configuration class should inherit from
the base class :class:`envs.ManagerBasedRLEnvCfg` and contain variables assigned to the configuration classes
for each component (such as the ``ObservationCfg`` and ``RewardCfg``).
.. dropdown:: Example for defining the reward function for the Cartpole task using the manager-style
:icon: plus
The following class is a part of the Cartpole environment configuration class. The :class:`RewardsCfg` class
defines individual terms that compose the reward function. Each reward term is defined by its function
implementation, weight and additional parameters to be passed to the function. Users can define multiple
reward terms and their weights to be used in the reward function.
.. literalinclude:: ../../../source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/classic/cartpole/cartpole_env_cfg.py
:language: python
:pyobject: RewardsCfg
Through this approach, it is possible to easily vary the implementations of the task by switching some components
while leaving the rest of the code intact. This flexibility is desirable when prototyping the environment and
experimenting with different configurations. It also allows for easy collaboration with others on implementing an
environment, since contributors may choose to use different combinations of configurations for their own task
specifications.
.. seealso::
We provide a more detailed tutorial for setting up an environment using the manager-based workflow at
:ref:`tutorial-create-manager-rl-env`.
Direct Environments
-------------------
The direct-style environment aligns more closely with traditional implementations of environments,
where a single script directly implements the reward function, observation function, resets, and all the other components
of the environment. This approach does not require the manager classes. Instead, users have complete freedom
to implement their task through the APIs from the base class :class:`envs.DirectRLEnv`. For users migrating from the `IsaacGymEnvs`_
and `OmniIsaacGymEnvs`_ frameworks, this workflow may be more familiar.
When defining an environment with the direct-style implementation, we expect the user to define a single class that
implements the entire environment. The task class should inherit from the base :class:`envs.DirectRLEnv` class and should
have its corresponding configuration class that inherits from :class:`envs.DirectRLEnvCfg`. The task class is responsible
for setting up the scene, processing the actions, and computing the rewards, observations, resets, and termination signals.
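As a rough illustration of this structure, the sketch below puts all task logic into one class with a matching configuration class. It does not use Isaac Lab: ``ToyDirectEnvCfg`` and ``ToyDirectEnv`` are hypothetical names, the method names are chosen only to echo the ``_get_rewards`` style mentioned in this document, and the "dynamics" are an arbitrary placeholder rather than physics.

```python
from dataclasses import dataclass

# NOTE: hypothetical toy classes for illustration. They do not inherit from
# or reproduce the Isaac Lab ``DirectRLEnv`` / ``DirectRLEnvCfg`` classes.


@dataclass
class ToyDirectEnvCfg:
    """Plain configuration values read by the task class."""

    rew_scale_alive: float = 1.0
    rew_scale_pole_pos: float = -1.0
    max_pole_angle: float = 0.5


class ToyDirectEnv:
    """A single class that implements actions, observations, rewards and dones."""

    def __init__(self, cfg: ToyDirectEnvCfg):
        self.cfg = cfg
        self.pole_angle = 0.1
        self.cart_vel = 0.0

    def _apply_action(self, action: float) -> None:
        # Placeholder update, not real cartpole dynamics.
        self.cart_vel += action
        self.pole_angle += 0.1 * action

    def _get_observations(self) -> list[float]:
        return [self.pole_angle, self.cart_vel]

    def _get_rewards(self) -> float:
        alive = self.cfg.rew_scale_alive
        pole_pos = self.cfg.rew_scale_pole_pos * self.pole_angle**2
        return alive + pole_pos

    def _get_dones(self) -> bool:
        return abs(self.pole_angle) > self.cfg.max_pole_angle

    def step(self, action: float) -> tuple[list[float], float, bool]:
        # All logic lives in this class; no managers are involved.
        self._apply_action(action)
        return self._get_observations(), self._get_rewards(), self._get_dones()


env = ToyDirectEnv(ToyDirectEnvCfg())
obs, reward, done = env.step(1.0)
```

Because the state is held in attributes of the task class, each method can read it directly, which is what makes the direct style convenient for logic that is hard to split into separate components.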
.. dropdown:: Example for defining the reward function for the Cartpole task using the direct-style
:icon: plus
The following function is a part of the Cartpole environment class and is responsible for computing the rewards.
.. literalinclude:: ../../../source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cartpole/cartpole_env.py
:language: python
:pyobject: CartpoleEnv._get_rewards
:dedent: 4
It calls the :meth:`compute_rewards` function, which is compiled with PyTorch JIT for performance benefits.
.. literalinclude:: ../../../source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cartpole/cartpole_env.py
:language: python
:pyobject: compute_rewards
This approach provides more transparency in the implementations of the environments, as logic is defined within the task
class instead of abstracted with the use of managers. This may be beneficial when implementing complex logic that is
difficult to decompose into separate components. Additionally, the direct-style implementation may bring more performance
benefits for the environment, as it allows implementing large chunks of logic with optimized frameworks such as
`PyTorch JIT`_ or `Warp`_. This may be valuable when scaling up training substantially, which requires optimizing individual
operations in the environment.
.. seealso::
We provide a more detailed tutorial for setting up an RL environment using the direct workflow at
:ref:`tutorial-create-direct-rl-env`.
.. _IsaacGymEnvs: https://github.com/isaac-sim/IsaacGymEnvs
.. _OmniIsaacGymEnvs: https://github.com/isaac-sim/OmniIsaacGymEnvs
.. _PyTorch JIT: https://pytorch.org/docs/stable/jit.html
.. _Warp: https://github.com/NVIDIA/warp
.. _feature-workflows:
Task Design Workflows
=====================
.. currentmodule:: omni.isaac.lab
Reinforcement learning environments can be implemented using two different workflows: Manager-based and Direct.
This page outlines the two workflows, explaining their benefits and use cases.
In addition, multi-GPU and multi-node reinforcement learning support is explained, along with the tiled rendering API,
which can be used for efficient vectorized rendering across environments.
Manager-Based Environments
--------------------------
Manager-based environments promote modular implementations of reinforcement learning tasks
through the use of Managers. Each component of the task, such as rewards, observations, and terminations,
can be specified as an individual configuration class that is then passed to the corresponding
manager class. Each manager is responsible for parsing its configuration and processing
the contents specified in that config class. The manager implementations are taken care of by
the base class :class:`envs.ManagerBasedRLEnv`.
With this approach, it is simple to switch implementations of some components in the task
while leaving the rest of the code intact. This is desirable when collaborating with others
on implementing a reinforcement learning environment, where contributors may choose to use
different combinations of configurations for the reinforcement learning components of the task.
A class definition of a manager-based environment consists of defining a task configuration class that
inherits from :class:`envs.ManagerBasedRLEnvCfg`. This class should contain variables assigned to various
configuration classes for each of the components of the RL task, such as the ``ObservationCfg``
or ``RewardCfg``. The entry point of the environment becomes the base class :class:`envs.ManagerBasedRLEnv`,
which will process the main task config and iterate through the individual configuration classes that are defined
in the task config class.
An example of implementing the reward function for the Cartpole task using the manager-based implementation is as follows:
.. code-block:: python

   @configclass
   class RewardsCfg:
       """Reward terms for the MDP."""

       # (1) Constant running reward
       alive = RewTerm(func=mdp.is_alive, weight=1.0)
       # (2) Failure penalty
       terminating = RewTerm(func=mdp.is_terminated, weight=-2.0)
       # (3) Primary task: keep pole upright
       pole_pos = RewTerm(
           func=mdp.joint_pos_target_l2,
           weight=-1.0,
           params={"asset_cfg": SceneEntityCfg("robot", joint_names=["cart_to_pole"]), "target": 0.0},
       )
       # (4) Shaping tasks: lower cart velocity
       cart_vel = RewTerm(
           func=mdp.joint_vel_l1,
           weight=-0.01,
           params={"asset_cfg": SceneEntityCfg("robot", joint_names=["slider_to_cart"])},
       )
       # (5) Shaping tasks: lower pole angular velocity
       pole_vel = RewTerm(
           func=mdp.joint_vel_l1,
           weight=-0.005,
           params={"asset_cfg": SceneEntityCfg("robot", joint_names=["cart_to_pole"])},
       )

.. seealso::
We provide a more detailed tutorial for setting up an RL environment using the manager-based workflow at
`Creating a manager-based RL Environment <../tutorials/03_envs/create_rl_env.html>`_.
Direct Environments
-------------------
The direct-style environment more closely aligns with traditional implementations of reinforcement learning environments,
where a single script implements the reward function, observation function, resets, and all other components
of the environment. This approach does not use the Manager classes. Instead, users are left with the freedom
to implement the APIs from the base class :class:`envs.DirectRLEnv`. For users migrating from the IsaacGymEnvs
or OmniIsaacGymEnvs framework, this workflow will have a closer implementation to the previous frameworks.
When defining an environment following the direct-style implementation, a task configuration class inheriting from
:class:`envs.DirectRLEnvCfg` is used for defining task environment configuration variables, such as the number
of observations and actions. Adding configuration classes for the managers is not required, and they will not be processed
by the base class. In addition to the configuration class, the logic of the task should be defined in a new
task class that inherits from the base class :class:`envs.DirectRLEnv`. This class will then implement the main
task logic, including setting up the scene, processing the actions, and computing resets, rewards, and observations.
This approach may bring more performance benefits for the environment, as it allows implementing large chunks
of logic with optimized frameworks such as `PyTorch JIT <https://pytorch.org/docs/stable/jit.html>`_ or
`Warp <https://github.com/NVIDIA/warp>`_. This may be important when scaling up training for large and complex
environments. Additionally, data may be cached in class variables and reused in multiple APIs for the class.
This method provides more transparency in the implementations of the environments, as logic is defined
within the task class instead of abstracted with the use of Managers.
An example of implementing the reward function for the Cartpole task using the direct-style implementation is as follows:
.. code-block:: python

   def _get_rewards(self) -> torch.Tensor:
       total_reward = compute_rewards(
           self.cfg.rew_scale_alive,
           self.cfg.rew_scale_terminated,
           self.cfg.rew_scale_pole_pos,
           self.cfg.rew_scale_cart_vel,
           self.cfg.rew_scale_pole_vel,
           self.joint_pos[:, self._pole_dof_idx[0]],
           self.joint_vel[:, self._pole_dof_idx[0]],
           self.joint_pos[:, self._cart_dof_idx[0]],
           self.joint_vel[:, self._cart_dof_idx[0]],
           self.reset_terminated,
       )
       return total_reward


   @torch.jit.script
   def compute_rewards(
       rew_scale_alive: float,
       rew_scale_terminated: float,
       rew_scale_pole_pos: float,
       rew_scale_cart_vel: float,
       rew_scale_pole_vel: float,
       pole_pos: torch.Tensor,
       pole_vel: torch.Tensor,
       cart_pos: torch.Tensor,
       cart_vel: torch.Tensor,
       reset_terminated: torch.Tensor,
   ):
       rew_alive = rew_scale_alive * (1.0 - reset_terminated.float())
       rew_termination = rew_scale_terminated * reset_terminated.float()
       rew_pole_pos = rew_scale_pole_pos * torch.sum(torch.square(pole_pos), dim=-1)
       rew_cart_vel = rew_scale_cart_vel * torch.sum(torch.abs(cart_vel), dim=-1)
       rew_pole_vel = rew_scale_pole_vel * torch.sum(torch.abs(pole_vel), dim=-1)
       total_reward = rew_alive + rew_termination + rew_pole_pos + rew_cart_vel + rew_pole_vel
       return total_reward

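To make the reward arithmetic above concrete, here is a small worked example in plain Python for a single environment, where each sum reduces to one scalar term. It is a sketch under assumptions: ``compute_reward_scalar`` is a hypothetical helper (not part of the framework), the unused cart position argument is omitted, and the scale values are simply borrowed from the weights shown in the manager-based ``RewardsCfg`` example; the joint values are made up.

```python
def compute_reward_scalar(
    rew_scale_alive: float,
    rew_scale_terminated: float,
    rew_scale_pole_pos: float,
    rew_scale_cart_vel: float,
    rew_scale_pole_vel: float,
    pole_pos: float,
    pole_vel: float,
    cart_vel: float,
    reset_terminated: bool,
) -> float:
    """Scalar mirror of the reward terms for one environment (hypothetical helper)."""
    terminated = float(reset_terminated)
    rew_alive = rew_scale_alive * (1.0 - terminated)
    rew_termination = rew_scale_terminated * terminated
    rew_pole_pos = rew_scale_pole_pos * pole_pos**2
    rew_cart_vel = rew_scale_cart_vel * abs(cart_vel)
    rew_pole_vel = rew_scale_pole_vel * abs(pole_vel)
    return rew_alive + rew_termination + rew_pole_pos + rew_cart_vel + rew_pole_vel


# Assumed scales (taken from the manager-based example weights) and made-up state.
scales = dict(
    rew_scale_alive=1.0,
    rew_scale_terminated=-2.0,
    rew_scale_pole_pos=-1.0,
    rew_scale_cart_vel=-0.01,
    rew_scale_pole_vel=-0.005,
)
state = dict(pole_pos=0.1, pole_vel=0.2, cart_vel=0.5)

# Running episode: 1.0 - 0.01 - 0.005 - 0.001, approximately 0.984
running = compute_reward_scalar(**scales, **state, reset_terminated=False)
# Terminated episode: -2.0 - 0.01 - 0.005 - 0.001, approximately -2.016
terminated = compute_reward_scalar(**scales, **state, reset_terminated=True)
```

The alive and termination terms are mutually exclusive, so the only difference between the two results is the swap of the ``+1.0`` alive bonus for the ``-2.0`` failure penalty.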
.. seealso::
We provide a more detailed tutorial for setting up an RL environment using the direct workflow at
`Creating a Direct Workflow RL Environment <../tutorials/03_envs/create_direct_rl_env.html>`_.
...@@ -77,7 +77,7 @@ The camera's pose and image resolution can be configured through the ...@@ -77,7 +77,7 @@ The camera's pose and image resolution can be configured through the
.. dropdown:: Default parameters of the ViewerCfg class: .. dropdown:: Default parameters of the ViewerCfg class:
:icon: code :icon: code
.. literalinclude:: ../../../source/extensions/omni.isaac.lab/omni/isaac/lab/envs/base_env_cfg.py .. literalinclude:: ../../../source/extensions/omni.isaac.lab/omni/isaac/lab/envs/common.py
:language: python :language: python
:pyobject: ViewerCfg :pyobject: ViewerCfg
......
.. _tutorial-create-oige-rl-env: .. _tutorial-create-direct-rl-env:
Creating a Direct Workflow RL Environment Creating a Direct Workflow RL Environment
...@@ -103,7 +103,7 @@ Defining Rewards ...@@ -103,7 +103,7 @@ Defining Rewards
Reward function should be defined in the ``_get_rewards(self)`` API, which returns the reward Reward function should be defined in the ``_get_rewards(self)`` API, which returns the reward
buffer as a return value. Within this function, the task is free to implement the logic of buffer as a return value. Within this function, the task is free to implement the logic of
the reward function. In this example, we implement a Pytorch jitted function that computes the reward function. In this example, we implement a Pytorch JIT function that computes
the various components of the reward function. the various components of the reward function.
.. code-block:: python .. code-block:: python
......
.. _tutorial-create-base-env: .. _tutorial-create-manager-base-env:
Creating a Manager-Based Base Environment Creating a Manager-Based Base Environment
......
.. _tutorial-create-rl-env: .. _tutorial-create-manager-rl-env:
Creating a Manager-Based RL Environment Creating a Manager-Based RL Environment
...@@ -6,7 +6,7 @@ Creating a Manager-Based RL Environment ...@@ -6,7 +6,7 @@ Creating a Manager-Based RL Environment
.. currentmodule:: omni.isaac.lab .. currentmodule:: omni.isaac.lab
Having learnt how to create a base environment in :ref:`tutorial-create-base-env`, we will now look at how to create a manager-based Having learnt how to create a base environment in :ref:`tutorial-create-manager-base-env`, we will now look at how to create a manager-based
task environment for reinforcement learning. task environment for reinforcement learning.
The base environment is designed as an sense-act environment where the agent can send commands to the environment The base environment is designed as an sense-act environment where the agent can send commands to the environment
...@@ -56,7 +56,7 @@ The script for running the environment ``run_cartpole_rl_env.py`` is present in ...@@ -56,7 +56,7 @@ The script for running the environment ``run_cartpole_rl_env.py`` is present in
The Code Explained The Code Explained
~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
We already went through parts of the above in the :ref:`tutorial-create-base-env` tutorial to learn We already went through parts of the above in the :ref:`tutorial-create-manager-base-env` tutorial to learn
about how to specify the scene, observations, actions and events. Thus, in this tutorial, we about how to specify the scene, observations, actions and events. Thus, in this tutorial, we
will focus only on the RL components of the environment. will focus only on the RL components of the environment.
...@@ -144,7 +144,7 @@ Tying it all together ...@@ -144,7 +144,7 @@ Tying it all together
--------------------- ---------------------
With all the above components defined, we can now create the :class:`ManagerBasedRLEnvCfg` configuration for the With all the above components defined, we can now create the :class:`ManagerBasedRLEnvCfg` configuration for the
cartpole environment. This is similar to the :class:`ManagerBasedEnvCfg` defined in :ref:`tutorial-create-base-env`, cartpole environment. This is similar to the :class:`ManagerBasedEnvCfg` defined in :ref:`tutorial-create-manager-base-env`,
only with the added RL components explained in the above sections. only with the added RL components explained in the above sections.
.. literalinclude:: ../../../../source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/classic/cartpole/cartpole_env_cfg.py .. literalinclude:: ../../../../source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/classic/cartpole/cartpole_env_cfg.py
......
...@@ -10,8 +10,8 @@ different aspects of the framework to create a simulation environment for agent ...@@ -10,8 +10,8 @@ different aspects of the framework to create a simulation environment for agent
:maxdepth: 1 :maxdepth: 1
:titlesonly: :titlesonly:
create_base_env create_manager_base_env
create_rl_env create_manager_rl_env
create_direct_rl_env create_direct_rl_env
register_rl_env_gym register_rl_env_gym
run_rl_training run_rl_training
...@@ -63,7 +63,7 @@ in the environment name, the entry point to the environment class, and the entry ...@@ -63,7 +63,7 @@ in the environment name, the entry point to the environment class, and the entry
environment configuration class. environment configuration class.
.. note:: .. note::
The ``gymnasium`` registry is a global registry. Hence, it is important to ensure that the The :mod:`gymnasium` registry is a global registry. Hence, it is important to ensure that the
environment names are unique. Otherwise, the registry will throw an error when registering environment names are unique. Otherwise, the registry will throw an error when registering
the environment. the environment.
...@@ -76,7 +76,7 @@ call for the cartpole environment in the ``omni.isaac.lab_tasks.manager_based.cl ...@@ -76,7 +76,7 @@ call for the cartpole environment in the ``omni.isaac.lab_tasks.manager_based.cl
.. literalinclude:: ../../../../source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/classic/cartpole/__init__.py .. literalinclude:: ../../../../source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/classic/cartpole/__init__.py
:language: python :language: python
:lines: 10- :lines: 10-
:emphasize-lines: 11, 12, 15 :emphasize-lines: 4, 11, 12, 15
The ``id`` argument is the name of the environment. As a convention, we name all the environments The ``id`` argument is the name of the environment. As a convention, we name all the environments
with the prefix ``Isaac-`` to make it easier to search for them in the registry. The name of the with the prefix ``Isaac-`` to make it easier to search for them in the registry. The name of the
...@@ -96,54 +96,22 @@ configuration is loaded using the :meth:`omni.isaac.lab_tasks.utils.parse_env_cf ...@@ -96,54 +96,22 @@ configuration is loaded using the :meth:`omni.isaac.lab_tasks.utils.parse_env_cf
It is then passed to the :meth:`gymnasium.make` function to create the environment instance. It is then passed to the :meth:`gymnasium.make` function to create the environment instance.
The configuration entry point can be both a YAML file or a python configuration class. The configuration entry point can be both a YAML file or a python configuration class.
Direct Environemtns Direct Environments
^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
For direct-based environments, the following shows the registration call for the cartpole environment For direct-based environments, the environment registration follows a similar pattern. Instead of
in the ``omni.isaac.lab_tasks.direct.cartpole`` sub-package: registering the environment's entry point as the :class:`~omni.isaac.lab.envs.ManagerBasedRLEnv` class,
we register the environment's entry point as the implementation class of the environment.
Additionally, we add the suffix ``-Direct`` to the environment name to differentiate it from the
manager-based environments.
.. code-block:: python As an example, the following shows the registration call for the cartpole environment in the
``omni.isaac.lab_tasks.direct.cartpole`` sub-package:
import gymnasium as gym .. literalinclude:: ../../../../source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cartpole/__init__.py
:language: python
from . import agents :lines: 10-31
from .cartpole_env import CartpoleEnv, CartpoleEnvCfg :emphasize-lines: 5, 12, 13, 16
##
# Register Gym environments.
##
gym.register(
id="Isaac-Cartpole-Direct-v0",
entry_point="omni.isaac.lab_tasks.direct.cartpole:CartpoleEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": CartpoleEnvCfg,
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
"rsl_rl_cfg_entry_point": agents.rsl_rl_ppo_cfg.CartpolePPORunnerCfg,
"skrl_cfg_entry_point": f"{agents.__name__}:skrl_ppo_cfg.yaml",
"sb3_cfg_entry_point": f"{agents.__name__}:sb3_ppo_cfg.yaml",
},
)
The ``id`` argument is the name of the environment. As a convention, we name all the environments
with the prefix ``Isaac-`` to make it easier to search for them in the registry.
For direct environments, we also add the suffix ``-Direct``. The name of the
environment is typically followed by the name of the task, and then the name of the robot.
For instance, for legged locomotion with ANYmal C on flat terrain, the environment is called
``Isaac-Velocity-Flat-Anymal-C-Direct-v0``. The version number ``v<N>`` is typically used to specify different
variations of the same environment. Otherwise, the names of the environments can become too long
and difficult to read.
The ``entry_point`` argument is the entry point to the environment class. The entry point is a string
of the form ``<module>:<class>``. In the case of the cartpole environment, the entry point is
``omni.isaac.lab_tasks.direct.cartpole:CartpoleEnv``. The entry point is used to import the environment class
when creating the environment instance.
The ``env_cfg_entry_point`` argument specifies the default configuration for the environment. The default
configuration is loaded using the :meth:`omni.isaac.lab_tasks.utils.parse_env_cfg` function.
It is then passed to the :meth:`gymnasium.make` function to create the environment instance.
The configuration entry point can be both a YAML file or a python configuration class.
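The ``<module>:<class>`` entry point form described above can be illustrated with the standard library alone. The sketch below is not how ``gymnasium`` resolves entry points internally; ``load_entry_point`` is a hypothetical helper, and a standard-library class is used as the target so the example runs without Isaac Lab installed.

```python
import importlib


def load_entry_point(entry_point: str):
    """Resolve a ``<module>:<attribute>`` string to the object it names.

    Hypothetical helper for illustration only.
    """
    module_name, _, attr_name = entry_point.partition(":")
    if not module_name or not attr_name:
        raise ValueError(f"Expected '<module>:<class>', got: {entry_point!r}")
    # Import the module lazily, only when the entry point is resolved.
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in target from the standard library instead of an environment class.
resolved_cls = load_entry_point("collections:OrderedDict")
instance = resolved_cls(a=1)
```

Keeping the entry point as a string means the module holding the environment class is only imported when the environment is actually created.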
Creating the environment Creating the environment
...@@ -181,7 +149,7 @@ Now that we have gone through the code, let's run the script and see the result: ...@@ -181,7 +149,7 @@ Now that we have gone through the code, let's run the script and see the result:
./isaaclab.sh -p source/standalone/environments/random_agent.py --task Isaac-Cartpole-v0 --num_envs 32 ./isaaclab.sh -p source/standalone/environments/random_agent.py --task Isaac-Cartpole-v0 --num_envs 32
This should open a stage with everything similar to the previous :ref:`tutorial-create-rl-env` tutorial. This should open a stage with everything similar to the :ref:`tutorial-create-manager-rl-env` tutorial.
To stop the simulation, you can either close the window, or press ``Ctrl+C`` in the terminal. To stop the simulation, you can either close the window, or press ``Ctrl+C`` in the terminal.
In addition, you can also change the simulation device from GPU to CPU by adding the ``--cpu`` flag: In addition, you can also change the simulation device from GPU to CPU by adding the ``--cpu`` flag:
......
[package] [package]
# Note: Semantic Versioning is used: https://semver.org/ # Note: Semantic Versioning is used: https://semver.org/
version = "0.18.0" version = "0.18.1"
# Description # Description
title = "Isaac Lab framework for Robot Learning" title = "Isaac Lab framework for Robot Learning"
......
Changelog Changelog
--------- ---------
0.18.1 (2024-06-25)
~~~~~~~~~~~~~~~~~~~
Changed
^^^^^^^
* Ensured that the parity between each class and its configuration class is explicitly visible in the :mod:`omni.isaac.lab.envs`
module. This makes it easier to follow where definitions are located and how they are related. This should not be
a breaking change as the classes are still accessible through the same module.
0.18.0 (2024-06-13) 0.18.0 (2024-06-13)
~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~
Fixed Fixed
^^^^^ ^^^^^
...@@ -23,8 +34,8 @@ Changed ...@@ -23,8 +34,8 @@ Changed
Fixed Fixed
^^^^^ ^^^^^
* Fixed the orientation reset logic in :func:`omni.isaac.lab.envs.mdp.events.reset_root_state_uniform` to make it relative to the default orientation. * Fixed the orientation reset logic in :func:`omni.isaac.lab.envs.mdp.events.reset_root_state_uniform` to make it relative to
Earlier, the position was sampled relative to the default and the orientation not. the default orientation. Earlier, the position was sampled relative to the default and the orientation not.
0.17.12 (2024-06-13) 0.17.12 (2024-06-13)
......
...@@ -11,28 +11,35 @@ observations and executes the actions provided by the agent. However, the ...@@ -11,28 +11,35 @@ observations and executes the actions provided by the agent. However, the
environment can also provide additional information such as the current environment can also provide additional information such as the current
reward, done flag, and information about the current episode. reward, done flag, and information about the current episode.
Based on these, there are two types of environments: There are two types of environment designing workflows:
* :class:`ManagerBasedEnv`: The manager-based workflow base environment which * **Manager-based**: The environment is decomposed into individual components (or managers)
only provides the agent with the for different aspects (such as computing observations, applying actions, and applying
current observations and executes the actions provided by the agent. randomization. The users mainly configure the managers and the environment coordinates the
* :class:`ManagerBasedRLEnv`: The manager-based workflow RL task environment which managers and calls their functions.
besides the functionality of * **Direct**: The user implements all the necessary functionality directly into a single class
the base environment also provides additional Markov Decision Process (MDP) without the need for additional managers.
related information such as the current reward, done flag, and information.
In addition, RL task environments can use the direct workflow implementation: Based on these workflows, there are the following environment classes:
* :class:`ManagerBasedEnv`: The manager-based workflow base environment which only provides the
agent with the current observations and executes the actions provided by the agent.
* :class:`ManagerBasedRLEnv`: The manager-based workflow RL task environment which besides the
functionality of the base environment also provides additional Markov Decision Process (MDP)
related information such as the current reward, done flag, and information.
* :class:`DirectRLEnv`: The direct workflow RL task environment which provides implementations for
implementing scene setup, computing dones, performing resets, and computing reward and observation.
* :class:`DirectRLEnv`: The direct workflow RL task environment which provides implementations For more information about the workflow design patterns, see the `Task Design Workflows`_ section.
for implementing scene setup, computing dones, performing resets, and computing
reward and observation.
.. _`Task Design Workflows`: https://isaac-sim.github.io/IsaacLab/source/features/task_workflows.html
""" """
from . import mdp, ui from . import mdp, ui
from .base_env_cfg import ManagerBasedEnvCfg, ViewerCfg from .common import VecEnvObs, VecEnvStepReturn, ViewerCfg
from .direct_rl_env import DirectRLEnv from .direct_rl_env import DirectRLEnv
from .direct_rl_env_cfg import DirectRLEnvCfg
from .manager_based_env import ManagerBasedEnv from .manager_based_env import ManagerBasedEnv
from .manager_based_env_cfg import ManagerBasedEnvCfg
from .manager_based_rl_env import ManagerBasedRLEnv from .manager_based_rl_env import ManagerBasedRLEnv
from .rl_env_cfg import DirectRLEnvCfg, ManagerBasedRLEnvCfg from .manager_based_rl_env_cfg import ManagerBasedRLEnvCfg
from .types import VecEnvObs, VecEnvStepReturn
...@@ -6,7 +6,61 @@ ...@@ -6,7 +6,61 @@
from __future__ import annotations from __future__ import annotations
import torch import torch
from typing import Dict from typing import Dict, Literal
from omni.isaac.lab.utils import configclass
##
# Configuration.
##
@configclass
class ViewerCfg:
"""Configuration of the scene viewport camera."""
eye: tuple[float, float, float] = (7.5, 7.5, 7.5)
"""Initial camera position (in m). Default is (7.5, 7.5, 7.5)."""
lookat: tuple[float, float, float] = (0.0, 0.0, 0.0)
"""Initial camera target position (in m). Default is (0.0, 0.0, 0.0)."""
cam_prim_path: str = "/OmniverseKit_Persp"
"""The camera prim path to record images from. Default is "/OmniverseKit_Persp",
which is the default camera in the viewport.
"""
resolution: tuple[int, int] = (1280, 720)
"""The resolution (width, height) of the camera specified using :attr:`cam_prim_path`.
Default is (1280, 720).
"""
origin_type: Literal["world", "env", "asset_root"] = "world"
"""The frame in which the camera position (eye) and target (lookat) are defined in. Default is "world".
Available options are:
* ``"world"``: The origin of the world.
* ``"env"``: The origin of the environment defined by :attr:`env_index`.
* ``"asset_root"``: The center of the asset defined by :attr:`asset_name` in environment :attr:`env_index`.
"""
env_index: int = 0
"""The environment index for frame origin. Default is 0.
This quantity is only effective if :attr:`origin_type` is set to "env" or "asset_root".
"""
asset_name: str | None = None
"""The asset name in the interactive scene for the frame origin. Default is None.
This quantity is only effective if :attr:`origin_type` is set to "asset_root".
"""
##
# Types.
##
VecEnvObs = Dict[str, torch.Tensor | Dict[str, torch.Tensor]] VecEnvObs = Dict[str, torch.Tensor | Dict[str, torch.Tensor]]
"""Observation returned by the environment. """Observation returned by the environment.
...@@ -31,7 +85,7 @@ Note: ...@@ -31,7 +85,7 @@ Note:
""" """
VecEnvStepReturn = tuple[VecEnvObs, torch.Tensor, torch.Tensor, torch.Tensor, Dict] VecEnvStepReturn = tuple[VecEnvObs, torch.Tensor, torch.Tensor, torch.Tensor, dict]
"""The environment signals processed at the end of each step. """The environment signals processed at the end of each step.
The tuple contains batched information for each sub-environment. The information is stored in the following order: The tuple contains batched information for each sub-environment. The information is stored in the following order:
......
...@@ -8,88 +8,23 @@ from dataclasses import MISSING ...@@ -8,88 +8,23 @@ from dataclasses import MISSING
from omni.isaac.lab.scene import InteractiveSceneCfg from omni.isaac.lab.scene import InteractiveSceneCfg
from omni.isaac.lab.sim import SimulationCfg from omni.isaac.lab.sim import SimulationCfg
from omni.isaac.lab.utils import configclass from omni.isaac.lab.utils import configclass
from omni.isaac.lab.utils.noise.noise_cfg import NoiseModelCfg from omni.isaac.lab.utils.noise import NoiseModelCfg
from .base_env_cfg import ManagerBasedEnvCfg, ViewerCfg from .common import ViewerCfg
from .ui import BaseEnvWindow, ManagerBasedRLEnvWindow from .ui import BaseEnvWindow
@configclass @configclass
class ManagerBasedRLEnvCfg(ManagerBasedEnvCfg): class DirectRLEnvCfg:
"""Configuration for a reinforcement learning environment with the manager-based workflow.""" """Configuration for an RL environment defined with the direct workflow.
# ui settings Please refer to the :class:`omni.isaac.lab.envs.direct_rl_env.DirectRLEnv` class for more details.
ui_window_class_type: type | None = ManagerBasedRLEnvWindow
# general settings
is_finite_horizon: bool = False
"""Whether the learning task is treated as a finite or infinite horizon problem for the agent.
Defaults to False, which means the task is treated as an infinite horizon problem.
This flag handles the subtleties of finite and infinite horizon tasks:
* **Finite horizon**: no penalty or bootstrapping value is required by the agent for
running out of time. However, the environment still needs to terminate the episode after the
time limit is reached.
* **Infinite horizon**: the agent needs to bootstrap the value of the state at the end of the episode.
This is done by sending a time-limit (or truncated) done signal to the agent, which triggers this
bootstrapping calculation.
If True, then the environment is treated as a finite horizon problem and no time-out (or truncated) done signal
is sent to the agent. If False, then the environment is treated as an infinite horizon problem and a time-out
(or truncated) done signal is sent to the agent.
Note:
The base :class:`ManagerBasedRLEnv` class does not use this flag directly. It is used by the environment
wrappers to determine what type of done signal to send to the corresponding learning agent.
"""
episode_length_s: float = MISSING
"""Duration of an episode (in seconds).
Based on the decimation rate and physics time step, the episode length is calculated as:
.. code-block:: python
episode_length_steps = ceil(episode_length_s / (decimation_rate * physics_time_step))
For example, if the decimation rate is 10, the physics time step is 0.01, and the episode length is 10 seconds,
then the episode length in steps is 100.
"""
# environment settings
rewards: object = MISSING
"""Reward settings.
Please refer to the :class:`omni.isaac.lab.managers.RewardManager` class for more details.
""" """
terminations: object = MISSING
"""Termination settings.
Please refer to the :class:`omni.isaac.lab.managers.TerminationManager` class for more details.
"""
curriculum: object = MISSING
"""Curriculum settings.
Please refer to the :class:`omni.isaac.lab.managers.CurriculumManager` class for more details.
"""
commands: object = MISSING
"""Command settings.
Please refer to the :class:`omni.isaac.lab.managers.CommandManager` class for more details.
"""
@configclass
class DirectRLEnvCfg(ManagerBasedEnvCfg):
"""Configuration for a reinforcement learning environment with the direct workflow."""
# simulation settings # simulation settings
viewer: ViewerCfg = ViewerCfg() viewer: ViewerCfg = ViewerCfg()
"""Viewer configuration. Default is ViewerCfg().""" """Viewer configuration. Default is ViewerCfg()."""
sim: SimulationCfg = SimulationCfg() sim: SimulationCfg = SimulationCfg()
"""Physics simulation configuration. Default is SimulationCfg().""" """Physics simulation configuration. Default is SimulationCfg()."""
...@@ -113,14 +48,6 @@ class DirectRLEnvCfg(ManagerBasedEnvCfg): ...@@ -113,14 +48,6 @@ class DirectRLEnvCfg(ManagerBasedEnvCfg):
This means that the control action is updated every 10 simulation steps. This means that the control action is updated every 10 simulation steps.
""" """
# environment settings
scene: InteractiveSceneCfg = MISSING
"""Scene settings.
Please refer to the :class:`omni.isaac.lab.scene.InteractiveSceneCfg` class for more details.
"""
# general settings
is_finite_horizon: bool = False is_finite_horizon: bool = False
"""Whether the learning task is treated as a finite or infinite horizon problem for the agent. """Whether the learning task is treated as a finite or infinite horizon problem for the agent.
Defaults to False, which means the task is treated as an infinite horizon problem. Defaults to False, which means the task is treated as an infinite horizon problem.
...@@ -156,29 +83,39 @@ class DirectRLEnvCfg(ManagerBasedEnvCfg): ...@@ -156,29 +83,39 @@ class DirectRLEnvCfg(ManagerBasedEnvCfg):
then the episode length in steps is 100. then the episode length in steps is 100.
""" """
# environment settings
scene: InteractiveSceneCfg = MISSING
"""Scene settings.
Please refer to the :class:`omni.isaac.lab.scene.InteractiveSceneCfg` class for more details.
"""
events: object = None
"""Event settings. Defaults to None, in which case no events are applied through the event manager.
Please refer to the :class:`omni.isaac.lab.managers.EventManager` class for more details.
"""
num_observations: int = MISSING num_observations: int = MISSING
"""The size of the observation for each environment.""" """The dimension of the observation space from each environment instance."""
num_states: int = 0 num_states: int = 0
"""The size of the state-space for each environment. Default is 0. """The dimension of the state-space from each environment instance. Default is 0, which means no state-space is defined.
This is used for asymmetric actor-critic and defines the observation space for the critic. This is useful for asymmetric actor-critic and defines the observation space for the critic.
""" """
num_actions: int = MISSING observation_noise_model: NoiseModelCfg | None = None
"""The size of the action space for each environment.""" """The noise model to apply to the computed observations from the environment. Default is None, which means no noise is added.
events: object = None Please refer to the :class:`omni.isaac.lab.utils.noise.NoiseModel` class for more details.
"""Settings for specifying domain randomization terms during training.
Please refer to the :class:`omni.isaac.lab.managers.EventManager` class for more details.
""" """
num_actions: int = MISSING
"""The dimension of the action space for each environment."""
action_noise_model: NoiseModelCfg | None = None action_noise_model: NoiseModelCfg | None = None
"""Settings for adding noise to the action buffer. """The noise model applied to the actions provided to the environment. Default is None, which means no noise is added.
Please refer to the :class:`omni.isaac.lab.utils.noise.NoiseModel` class for more details.
"""
observation_noise_model: NoiseModelCfg | None = None Please refer to the :class:`omni.isaac.lab.utils.noise.NoiseModel` class for more details.
"""Settings for adding noise to the observation buffer.
Please refer to the :class:`omni.isaac.lab.utils.noise.NoiseModel` class for more details.
""" """
...@@ -12,13 +12,13 @@ from typing import Any ...@@ -12,13 +12,13 @@ from typing import Any
import carb import carb
import omni.isaac.core.utils.torch as torch_utils import omni.isaac.core.utils.torch as torch_utils
from omni.isaac.lab.envs.types import VecEnvObs
from omni.isaac.lab.managers import ActionManager, EventManager, ObservationManager from omni.isaac.lab.managers import ActionManager, EventManager, ObservationManager
from omni.isaac.lab.scene import InteractiveScene from omni.isaac.lab.scene import InteractiveScene
from omni.isaac.lab.sim import SimulationContext from omni.isaac.lab.sim import SimulationContext
from omni.isaac.lab.utils.timer import Timer from omni.isaac.lab.utils.timer import Timer
from .base_env_cfg import ManagerBasedEnvCfg from .common import VecEnvObs
from .manager_based_env_cfg import ManagerBasedEnvCfg
from .ui import ViewportCameraController from .ui import ViewportCameraController
...@@ -251,7 +251,7 @@ class ManagerBasedEnv: ...@@ -251,7 +251,7 @@ class ManagerBasedEnv:
The environment steps forward at a fixed time-step, while the physics simulation is The environment steps forward at a fixed time-step, while the physics simulation is
decimated at a lower time-step. This is to ensure that the simulation is stable. These two decimated at a lower time-step. This is to ensure that the simulation is stable. These two
time-steps can be configured independently using the :attr:`ManagerBasedEnvCfg.decimation` (number of time-steps can be configured independently using the :attr:`ManagerBasedEnvCfg.decimation` (number of
simulation steps per environment step) and the :attr:`ManagerBasedEnvCfg.physics_dt` (physics time-step). simulation steps per environment step) and the :attr:`ManagerBasedEnvCfg.sim.dt` (physics time-step).
Based on these parameters, the environment time-step is computed as the product of the two. Based on these parameters, the environment time-step is computed as the product of the two.
Args: Args:
......
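The docstring above says the environment time-step is the product of the decimation and the physics time-step. A minimal sketch of that arithmetic in plain Python follows; the helper name `compute_step_dt` and the sample values are assumptions for illustration and are not part of the Isaac Lab API.

```python
def compute_step_dt(decimation: int, sim_dt: float) -> float:
    """Return the environment step duration (in seconds).

    Hypothetical helper: `decimation` stands in for the configured number of
    simulation steps per environment step, `sim_dt` for the physics time-step.
    """
    if decimation < 1:
        raise ValueError("decimation must be a positive integer")
    return decimation * sim_dt


# Example values (assumed): 4 physics steps of 5 ms per environment step.
step_dt = compute_step_dt(decimation=4, sim_dt=0.005)
print(f"environment step: {step_dt:.3f} s")
```

With these values the agent's action would be held for four physics steps, so the control loop runs at a quarter of the physics rate.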
...@@ -10,7 +10,6 @@ configuring the environment instances, viewer settings, and simulation parameter ...@@ -10,7 +10,6 @@ configuring the environment instances, viewer settings, and simulation parameter
""" """
from dataclasses import MISSING from dataclasses import MISSING
from typing import Literal
import omni.isaac.lab.envs.mdp as mdp import omni.isaac.lab.envs.mdp as mdp
from omni.isaac.lab.managers import EventTermCfg as EventTerm from omni.isaac.lab.managers import EventTermCfg as EventTerm
...@@ -18,52 +17,10 @@ from omni.isaac.lab.scene import InteractiveSceneCfg ...@@ -18,52 +17,10 @@ from omni.isaac.lab.scene import InteractiveSceneCfg
from omni.isaac.lab.sim import SimulationCfg from omni.isaac.lab.sim import SimulationCfg
from omni.isaac.lab.utils import configclass from omni.isaac.lab.utils import configclass
from .common import ViewerCfg
from .ui import BaseEnvWindow from .ui import BaseEnvWindow
@configclass
class ViewerCfg:
"""Configuration of the scene viewport camera."""
eye: tuple[float, float, float] = (7.5, 7.5, 7.5)
"""Initial camera position (in m). Default is (7.5, 7.5, 7.5)."""
lookat: tuple[float, float, float] = (0.0, 0.0, 0.0)
"""Initial camera target position (in m). Default is (0.0, 0.0, 0.0)."""
cam_prim_path: str = "/OmniverseKit_Persp"
"""The camera prim path to record images from. Default is "/OmniverseKit_Persp",
which is the default camera in the viewport.
"""
resolution: tuple[int, int] = (1280, 720)
"""The resolution (width, height) of the camera specified using :attr:`cam_prim_path`.
Default is (1280, 720).
"""
origin_type: Literal["world", "env", "asset_root"] = "world"
"""The frame in which the camera position (eye) and target (lookat) are defined in. Default is "world".
Available options are:
* ``"world"``: The origin of the world.
* ``"env"``: The origin of the environment defined by :attr:`env_index`.
* ``"asset_root"``: The center of the asset defined by :attr:`asset_name` in environment :attr:`env_index`.
"""
env_index: int = 0
"""The environment index for frame origin. Default is 0.
This quantity is only effective if :attr:`origin` is set to "env" or "asset_root".
"""
asset_name: str | None = None
"""The asset name in the interactive scene for the frame origin. Default is None.
This quantity is only effective if :attr:`origin` is set to "asset_root".
"""
@configclass @configclass
class DefaultEventManagerCfg: class DefaultEventManagerCfg:
"""Configuration of the default event manager. """Configuration of the default event manager.
...@@ -82,6 +39,7 @@ class ManagerBasedEnvCfg: ...@@ -82,6 +39,7 @@ class ManagerBasedEnvCfg:
# simulation settings # simulation settings
viewer: ViewerCfg = ViewerCfg() viewer: ViewerCfg = ViewerCfg()
"""Viewer configuration. Default is ViewerCfg().""" """Viewer configuration. Default is ViewerCfg()."""
sim: SimulationCfg = SimulationCfg() sim: SimulationCfg = SimulationCfg()
"""Physics simulation configuration. Default is SimulationCfg().""" """Physics simulation configuration. Default is SimulationCfg()."""
......
...@@ -17,9 +17,9 @@ from omni.isaac.version import get_version ...@@ -17,9 +17,9 @@ from omni.isaac.version import get_version
from omni.isaac.lab.managers import CommandManager, CurriculumManager, RewardManager, TerminationManager from omni.isaac.lab.managers import CommandManager, CurriculumManager, RewardManager, TerminationManager
from .common import VecEnvStepReturn
from .manager_based_env import ManagerBasedEnv from .manager_based_env import ManagerBasedEnv
from .rl_env_cfg import ManagerBasedRLEnvCfg from .manager_based_rl_env_cfg import ManagerBasedRLEnvCfg
from .types import VecEnvStepReturn
class ManagerBasedRLEnv(ManagerBasedEnv, gym.Env): class ManagerBasedRLEnv(ManagerBasedEnv, gym.Env):
...@@ -84,9 +84,11 @@ class ManagerBasedRLEnv(ManagerBasedEnv, gym.Env): ...@@ -84,9 +84,11 @@ class ManagerBasedRLEnv(ManagerBasedEnv, gym.Env):
# setup the action and observation spaces for Gym # setup the action and observation spaces for Gym
self._configure_gym_env_spaces() self._configure_gym_env_spaces()
# perform events at the start of the simulation # perform events at the start of the simulation
if "startup" in self.event_manager.available_modes: if "startup" in self.event_manager.available_modes:
self.event_manager.apply(mode="startup") self.event_manager.apply(mode="startup")
# print the environment information # print the environment information
print("[INFO]: Completed setting up the environment...") print("[INFO]: Completed setting up the environment...")
......
# Copyright (c) 2022-2024, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from dataclasses import MISSING
from omni.isaac.lab.utils import configclass
from .manager_based_env_cfg import ManagerBasedEnvCfg
from .ui import ManagerBasedRLEnvWindow
@configclass
class ManagerBasedRLEnvCfg(ManagerBasedEnvCfg):
"""Configuration for a reinforcement learning environment with the manager-based workflow."""
# ui settings
ui_window_class_type: type | None = ManagerBasedRLEnvWindow
# general settings
is_finite_horizon: bool = False
"""Whether the learning task is treated as a finite or infinite horizon problem for the agent.
Defaults to False, which means the task is treated as an infinite horizon problem.
This flag handles the subtleties of finite and infinite horizon tasks:
* **Finite horizon**: no penalty or bootstrapping value is required by the agent for
running out of time. However, the environment still needs to terminate the episode after the
time limit is reached.
* **Infinite horizon**: the agent needs to bootstrap the value of the state at the end of the episode.
This is done by sending a time-limit (or truncated) done signal to the agent, which triggers this
bootstrapping calculation.
If True, then the environment is treated as a finite horizon problem and no time-out (or truncated) done signal
is sent to the agent. If False, then the environment is treated as an infinite horizon problem and a time-out
(or truncated) done signal is sent to the agent.
Note:
The base :class:`ManagerBasedRLEnv` class does not use this flag directly. It is used by the environment
wrappers to determine what type of done signal to send to the corresponding learning agent.
"""
episode_length_s: float = MISSING
"""Duration of an episode (in seconds).
Based on the decimation rate and physics time step, the episode length is calculated as:
.. code-block:: python
episode_length_steps = ceil(episode_length_s / (decimation_rate * physics_time_step))
For example, if the decimation rate is 10, the physics time step is 0.01, and the episode length is 10 seconds,
then the episode length in steps is 100.
"""
# environment settings
rewards: object = MISSING
"""Reward settings.
Please refer to the :class:`omni.isaac.lab.managers.RewardManager` class for more details.
"""
terminations: object = MISSING
"""Termination settings.
Please refer to the :class:`omni.isaac.lab.managers.TerminationManager` class for more details.
"""
curriculum: object = MISSING
"""Curriculum settings.
Please refer to the :class:`omni.isaac.lab.managers.CurriculumManager` class for more details.
"""
commands: object = MISSING
"""Command settings.
Please refer to the :class:`omni.isaac.lab.managers.CommandManager` class for more details.
"""
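The `episode_length_s` docstring in this file gives the episode length in steps as `ceil(episode_length_s / (decimation_rate * physics_time_step))`. The sketch below reproduces that formula and the docstring's worked example in standalone Python; the function name `episode_length_steps` is hypothetical and only mirrors the formula, not how the environment class computes it internally.

```python
import math


def episode_length_steps(episode_length_s: float, decimation: int, sim_dt: float) -> int:
    """Number of environment steps in one episode, per the docstring formula."""
    # Duration of one environment step: decimation * physics time-step.
    step_dt = decimation * sim_dt
    # Round up so that the episode covers at least `episode_length_s` seconds.
    return math.ceil(episode_length_s / step_dt)


# Values from the docstring: decimation 10, physics time-step 0.01 s, 10 s episode.
steps = episode_length_steps(episode_length_s=10.0, decimation=10, sim_dt=0.01)
print(steps)  # 100
```

Because the result is rounded up, an episode length that is not an exact multiple of the step duration runs for one extra step rather than ending early.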
...@@ -46,7 +46,7 @@ Added ...@@ -46,7 +46,7 @@ Added
Changed Changed
^^^^^^^ ^^^^^^^
* Set default device for RSL RL and SB3 configs to "cuda:0". * Changed the default device for RSL RL and SB3 configs to "cuda:0".
0.7.3 (2024-05-21) 0.7.3 (2024-05-21)
~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
...@@ -54,7 +54,7 @@ Changed ...@@ -54,7 +54,7 @@ Changed
Added Added
^^^^^ ^^^^^
* Introduce ``--max_iterations`` argument to training scripts for specifying number of training iterations. * Introduced ``--max_iterations`` argument to training scripts for specifying number of training iterations.
0.7.2 (2024-05-13) 0.7.2 (2024-05-13)
~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
...@@ -62,7 +62,8 @@ Added ...@@ -62,7 +62,8 @@ Added
Added Added
^^^^^ ^^^^^
* Add Shadow Hand environments: ``Isaac-Shadow-Hand-Direct-v0``, ``Isaac-Shadow-Hand-OpenAI-FF-Direct-v0``, ``Isaac-Shadow-Hand-OpenAI-LSTM-Direct-v0``. * Added Shadow Hand environments: ``Isaac-Shadow-Hand-Direct-v0``, ``Isaac-Shadow-Hand-OpenAI-FF-Direct-v0``,
and ``Isaac-Shadow-Hand-OpenAI-LSTM-Direct-v0``.
0.7.1 (2024-05-09) 0.7.1 (2024-05-09)
...@@ -80,7 +81,9 @@ Added ...@@ -80,7 +81,9 @@ Added
Changed Changed
^^^^^^^ ^^^^^^^
* Renamed all references of ``BaseEnv``, ``RLTaskEnv``, and ``OIGEEnv`` to :class:`omni.isaac.lab.envs.ManagerBasedEnv`, :class:`omni.isaac.lab.envs.ManagerBasedRLEnv`, and :class:`omni.isaac.lab.envs.DirectRLEnv`. * Renamed all references of ``BaseEnv``, ``RLTaskEnv``, and ``OIGEEnv`` to
:class:`omni.isaac.lab.envs.ManagerBasedEnv`, :class:`omni.isaac.lab.envs.ManagerBasedRLEnv`,
and :class:`omni.isaac.lab.envs.DirectRLEnv` respectively.
* Split environments into ``manager_based`` and ``direct`` folders. * Split environments into ``manager_based`` and ``direct`` folders.
Added Added
......