Commit 6d17b95e authored by Michael Gussert, committed by GitHub

Adds documentation for Newton integration (#3271)

Adds a new section in the docs for experimental features.
As the first experimental feature of Isaac Lab, we are also including
instructions for running a feature branch of Isaac Lab with Newton. This
change adds the initial set of documentation for early Newton support
in Isaac Lab.

## Checklist

- [ ] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------
Co-authored-by: Gavriel State <gavrielstate@gmail.com>
Co-authored-by: Milad-Rakhsha <miladrakhsha@gmail.com>
Co-authored-by: Milad Rakhsha <mrakhsha@nvidia.com>
Co-authored-by: Kelly Guo <kellyg@nvidia.com>
parent 3a1a65bd
@@ -109,6 +109,7 @@ Table of Contents
    source/overview/showroom
    source/overview/simple_agents
 .. toctree::
    :maxdepth: 2
    :caption: Features
@@ -119,6 +120,14 @@ Table of Contents
    source/features/ray
    source/features/reproducibility
+.. toctree::
+   :maxdepth: 3
+   :caption: Experimental Features
+
+   source/experimental-features/bleeding-edge
+   source/experimental-features/newton-physics-integration/index
 .. toctree::
    :maxdepth: 1
    :caption: Resources
@@ -100,7 +100,7 @@
 @article{makoviychuk2021isaac,
   title={Isaac gym: High performance gpu-based physics simulation for robot learning},
-  author={Makoviychuk, Viktor and Wawrzyniak, Lukasz and Guo, Yunrong and Lu, Michelle and Storey, Kier and Macklin, Miles and Hoeller, David and Rudin, Nikita and Allshire, Arthur and Handa, Ankur and others},
+  author={Makoviychuk, Viktor and Wawrzyniak, Lukasz and Guo, Yunrong and Lu, Michelle and Storey, Kier and Macklin, Miles and Hoeller, David and Rudin, Nikita and Allshire, Arthur and Handa, Ankur and State, Gavriel},
   journal={arXiv preprint arXiv:2108.10470},
   year={2021}
 }
@@ -108,21 +108,21 @@
 @article{handa2022dextreme,
   title={DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality},
-  author={Handa, Ankur and Allshire, Arthur and Makoviychuk, Viktor and Petrenko, Aleksei and Singh, Ritvik and Liu, Jingzhou and Makoviichuk, Denys and Van Wyk, Karl and Zhurkevich, Alexander and Sundaralingam, Balakumar and others},
+  author={Handa, Ankur and Allshire, Arthur and Makoviychuk, Viktor and Petrenko, Aleksei and Singh, Ritvik and Liu, Jingzhou and Makoviichuk, Denys and Van Wyk, Karl and Zhurkevich, Alexander and Sundaralingam, Balakumar and Narang, Yashraj and Lafleche, Jean-Francois and Fox, Dieter and State, Gavriel},
   journal={arXiv preprint arXiv:2210.13702},
   year={2022}
 }
 @article{narang2022factory,
   title={Factory: Fast contact for robotic assembly},
-  author={Narang, Yashraj and Storey, Kier and Akinola, Iretiayo and Macklin, Miles and Reist, Philipp and Wawrzyniak, Lukasz and Guo, Yunrong and Moravanszky, Adam and State, Gavriel and Lu, Michelle and others},
+  author={Narang, Yashraj and Storey, Kier and Akinola, Iretiayo and Macklin, Miles and Reist, Philipp and Wawrzyniak, Lukasz and Guo, Yunrong and Moravanszky, Adam and State, Gavriel and Lu, Michelle and Handa, Ankur and Fox, Dieter},
   journal={arXiv preprint arXiv:2205.03532},
   year={2022}
 }
 @inproceedings{allshire2022transferring,
   title={Transferring dexterous manipulation from gpu simulation to a remote real-world trifinger},
-  author={Allshire, Arthur and MittaI, Mayank and Lodaya, Varun and Makoviychuk, Viktor and Makoviichuk, Denys and Widmaier, Felix and W{\"u}thrich, Manuel and Bauer, Stefan and Handa, Ankur and Garg, Animesh},
+  author={Allshire, Arthur and Mittal, Mayank and Lodaya, Varun and Makoviychuk, Viktor and Makoviichuk, Denys and Widmaier, Felix and W{\"u}thrich, Manuel and Bauer, Stefan and Handa, Ankur and Garg, Animesh},
   booktitle={2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
   pages={11802--11809},
   year={2022},
Welcome to the bleeding edge!
=============================

Isaac Lab is open source because our intention is to grow a community of open collaboration for robotic simulation.
We believe that robust tools are crucial for the future of robotics.

Sometimes new features may require extensive changes to the internal structure of Isaac Lab.
Directly integrating such features before they are complete and without feedback from the full community could cause serious issues for users caught unaware.

To address this, some major features will be released as Experimental Feature Branches.
This way, the community can experiment with and contribute to the feature before it's fully integrated, reducing the likelihood of being derailed by new and unexpected errors.

Newton Physics Integration
===========================

`Newton <https://newton-physics.github.io/newton/guide/overview.html>`_ is a GPU-accelerated, extensible, and differentiable physics simulation engine designed for robotics, research,
and advanced simulation workflows. Built on top of `NVIDIA Warp <https://nvidia.github.io/warp/>`_ and integrating MuJoCo Warp, Newton provides high-performance
simulation, modern Python APIs, and a flexible architecture for both users and developers.

Newton is an open-source, community-driven project with contributions from NVIDIA, Google DeepMind, and Disney Research,
managed through the Linux Foundation.

This `experimental feature branch <https://isaac-sim.github.io/IsaacLab/main/source/experimental-features/index.html>`_ of Isaac Lab provides an initial integration with the Newton Physics Engine, and is
under active development. Many features are not yet supported, and only a limited set of classic RL and flat terrain locomotion
reinforcement learning examples are included at the moment.

Both this Isaac Lab integration branch and Newton itself are under heavy development. We intend to support additional
features for other reinforcement learning and imitation learning workflows in the future, but the above tasks should be
a good lens through which to understand how Newton integration works in Isaac Lab.

We have validated Newton simulation against PhysX by transferring learned policies from Newton to PhysX and vice versa.
We have also successfully deployed a Newton-trained locomotion policy to a G1 robot. Please see :ref:`here <sim2real>` for more information.

Newton can support `multiple solvers <https://newton-physics.github.io/newton/api/newton_solvers.html>`_ for handling different types of physics simulation, but for the moment, the Isaac
Lab integration focuses primarily on the MuJoCo-Warp solver.

Future updates of this branch and Newton should include both ongoing improvements in performance as well as integration
with additional solvers.

Note that this branch does not include support for the PhysX physics engine - only Newton is supported. We are considering
several possible paths to continue to support PhysX within Isaac Lab, and feedback from users about their needs around that would be appreciated.

During the early development phase of both Newton and this Isaac Lab integration, you are likely to encounter breaking
changes as well as limited documentation. We do not expect to be able to provide official support or debugging assistance
until the framework has reached an official release. We appreciate your understanding and patience as we work to deliver a robust and polished framework!

.. toctree::
   :maxdepth: 2
   :titlesonly:

   installation
   training-environments
   newton-visualizer
   limitations-and-known-bugs
   solver-transitioning
   sim-to-sim
   sim-to-real

Installation
============

Installing the Newton physics integration branch requires three things:

1) Isaac Sim 5.0
2) The ``feature/newton`` branch of Isaac Lab
3) Ubuntu 22.04 or 24.04 (Windows will be supported soon)

To begin, verify the version of Isaac Sim by checking the title of the window created when launching the simulation app. Alternatively, you can
find more explicit version information under the ``Help -> About`` menu within the app.
If your version is older than 5.0, you must first `update or reinstall Isaac Sim <https://docs.isaacsim.omniverse.nvidia.com/latest/installation/quick-install.html>`_ before
you can proceed further.

Next, open a terminal in the root directory of your local copy of the Isaac Lab repository.
Make sure you are on the ``feature/newton`` branch by running the following command:

.. code-block:: bash

    git checkout feature/newton

Below, we provide instructions for installing Isaac Sim through pip or binary.

Pip Installation
----------------

We recommend using conda for managing your Python environments. Conda can be downloaded and installed from `here <https://docs.conda.io/en/latest/miniconda.html>`_.

Create a new conda environment:

.. code-block:: bash

    conda create -n env_isaaclab python=3.11

Activate the environment:

.. code-block:: bash

    conda activate env_isaaclab

Install the correct version of torch and torchvision:

.. code-block:: bash

    pip install torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/cu128

Install Isaac Sim 5.0:

.. code-block:: bash

    pip install "isaacsim[all,extscache]==5.0.0" --extra-index-url https://pypi.nvidia.com

Install Isaac Lab extensions and dependencies:

.. code-block:: bash

    ./isaaclab.sh -i

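After the pip installation, it can help to confirm that the environment picked up the versions pinned above. The following is a minimal sketch using only the Python standard library. The helper ``check_versions`` and the ``EXPECTED`` table are illustrative and not part of Isaac Lab, and the distribution names are assumed to match the ``pip install`` commands in this section.

.. code-block:: python

    import sys
    from importlib import metadata

    # Versions pinned by the pip commands in this section.
    EXPECTED = {"torch": "2.7.0", "torchvision": "0.22.0", "isaacsim": "5.0.0"}


    def check_versions(expected):
        """Map each distribution name to (installed_version_or_None, matches_expected)."""
        report = {}
        for name, wanted in expected.items():
            try:
                found = metadata.version(name)
            except metadata.PackageNotFoundError:
                found = None
            # Prefix match so local build tags such as "+cu128" are still accepted.
            report[name] = (found, found is not None and found.startswith(wanted))
        return report


    if __name__ == "__main__":
        print("python", sys.version.split()[0], "(3.11 expected)")
        for name, (found, ok) in check_versions(EXPECTED).items():
            print(f"{name}: {found or 'not installed'} [{'ok' if ok else 'check'}]")

Run the script with the Python interpreter of the ``env_isaaclab`` environment. A package reported as ``not installed`` or ``check`` suggests that one of the steps above should be repeated.
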
Binary Installation
-------------------

Follow the Isaac Sim `documentation <https://docs.isaacsim.omniverse.nvidia.com/latest/installation/install_workstation.html>`_ to install Isaac Sim 5.0 binaries.

Enter the Isaac Lab directory:

.. code-block:: bash

    cd IsaacLab

Add a symbolic link to the Isaac Sim installation:

.. code-block:: bash

    ln -s path_to_isaac_sim _isaac_sim

Install Isaac Lab extensions and dependencies:

.. code-block:: bash

    ./isaaclab.sh -i

Testing the Installation
------------------------

To verify that the installation was successful, run the following command from the root directory of your Isaac Lab repository:

.. code-block:: bash

    ./isaaclab.sh -p scripts/environments/zero_agent.py --task Isaac-Cartpole-Direct-v0 --num_envs 128

Limitations
===========

During the early development phase of both Newton and this Isaac Lab integration,
you are likely to encounter breaking changes as well as limited documentation.
We do not expect to be able to provide support or debugging assistance until the framework has reached an official release.

Here is a non-exhaustive list of capabilities currently supported in the Newton experimental feature branch, grouped by extension:

* isaaclab:

  * Articulation API
  * Contact Sensor
  * Direct & Manager single agent workflows
  * Omniverse Kit visualizer
  * Newton visualizer

* isaaclab_assets:

  * Anymal-D
  * Unitree H1 & G1
  * Toy examples

    * Cartpole
    * Ant
    * Humanoid

* isaaclab_tasks:

  * Direct:

    * Cartpole
    * Ant
    * Humanoid

  * Manager based:

    * Locomotion (velocity flat terrain)

      * Anymal-D
      * Unitree G1
      * Unitree H1

Capabilities beyond the above are not currently available.
We expect to support APIs related to rigid bodies soon in order to unlock manipulation-based environments.

Newton Visualizer
=================

Newton includes its own built-in visualizer to enable a fast and lightweight way to view the results of simulation.
Many additional features are planned for this system, including the ability to view the results of
training remotely through a web browser. To enable the Newton Visualizer, use the ``--newton_visualizer`` command-line option.

The Newton Visualizer is not capable of or intended to provide camera sensor data for robots being trained. It is solely
intended as a development debugging and visualization tool.
It also currently only supports visualization of collision shapes, not visual shapes.

Both the Omniverse RTX renderer and the Newton Visualizer can be run in parallel, or the Omniverse UI and RTX renderer
can be disabled using the ``--headless`` option.

For example, when training the Cartpole environment, we might disable the Omniverse UI
and RTX renderer using the ``--headless`` option and enable the Newton Visualizer instead, as follows:

.. code-block:: shell

    ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-Direct-v0 --num_envs 4096 --headless --newton_visualizer

In general, we do not recommend using the Omniverse UI while training, so that training runs as fast as possible.
The Newton Visualizer has less of a performance penalty while running, and we aim to bring that overhead even lower in the future.

To run the Omniverse UI and the Newton Visualizer at the same time, for example when running inference with a
lower number of environments, omit the ``--headless`` option while still adding the ``--newton_visualizer`` option, as follows:

.. code-block:: shell

    ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task Isaac-Cartpole-Direct-v0 --num_envs 128 --checkpoint logs/rsl_rl/cartpole_direct/2025-08-21_15-45-30/model_299.pt --newton_visualizer

These options are available across all the learning frameworks.

For more information about the Newton Visualizer, please refer to the `Newton documentation <https://newton-physics.github.io/newton/guide/visualization.html>`_.

.. _sim2real:

Sim-to-Real Policy Transfer
===========================

Deploying policies from simulation to real robots involves important nuances that must be addressed.
This section provides a high-level guide for training policies that can be deployed on a real Unitree G1 robot.
The key challenge is that not all observations available in simulation can be directly measured by real robot sensors.
This means RL-trained policies cannot be directly deployed unless they use only sensor-available observations. For example, real robot IMU sensors measure angular velocity and linear acceleration, but they cannot directly measure linear velocity. Therefore, if a policy relies on base linear velocity during training, this information must be removed before real robot deployment.

Requirements
~~~~~~~~~~~~

We assume that policies from this workflow are first verified through sim-to-sim transfer before real robot deployment.
Please see :ref:`here <sim2sim>` for more information.

Overview
--------

This section demonstrates a sim-to-real workflow using teacher–student distillation for the Unitree G1
velocity-tracking task with the Newton backend.

The teacher–student distillation workflow consists of three stages:

1. Train a teacher policy with privileged observations that are not available in real-world sensors.
2. Distill a student policy that excludes privileged terms (e.g., root linear velocity) by behavior cloning from the teacher policy.
3. Fine-tune the student policy with RL using only real-sensor observations.

The teacher and student observation groups are implemented in the velocity task configuration. See the following source for details:

- Teacher observations: ``PolicyCfg(ObsGroup)`` in `velocity_env_cfg.py <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/velocity/velocity_env_cfg.py>`__
- Student observations: ``StudentPolicyCfg(ObsGroup)`` in `velocity_env_cfg.py <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/velocity/velocity_env_cfg.py>`__

1. Train the teacher policy
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Train the teacher policy for the G1 velocity task using the Newton backend. The task ID is ``Isaac-Velocity-Flat-G1-v1``.

.. code-block:: bash

    ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task=Isaac-Velocity-Flat-G1-v1 --num_envs=4096 --headless

The teacher policy includes privileged observations (e.g., root linear velocity) defined in ``PolicyCfg(ObsGroup)``.

2. Distill the student policy (remove privileged terms)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

During distillation, the student policy learns to mimic the teacher through behavior cloning by minimizing the mean squared error
between their actions: :math:`\mathrm{loss} = \mathrm{MSE}(\pi_{teacher}(O_{teacher}), \pi_{student}(O_{student}))`.

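To make the behavior-cloning loss concrete, here is a minimal sketch in plain Python. The function name ``behavior_cloning_loss`` and the action values are hypothetical; the actual training code operates on batched tensors, so this only illustrates the arithmetic.

.. code-block:: python

    def behavior_cloning_loss(teacher_actions, student_actions):
        """Mean squared error between teacher and student action vectors."""
        if not teacher_actions or len(teacher_actions) != len(student_actions):
            raise ValueError("action vectors must be non-empty and of equal length")
        squared_errors = [(t - s) ** 2 for t, s in zip(teacher_actions, student_actions)]
        return sum(squared_errors) / len(squared_errors)


    # Hypothetical actions for one environment step (three joints).
    teacher_actions = [0.5, -0.2, 0.1]  # computed from the teacher observations
    student_actions = [0.4, 0.0, 0.1]   # computed from the student observations
    loss = behavior_cloning_loss(teacher_actions, student_actions)
    print(f"distillation loss: {loss:.4f}")

The loss is zero only when the student reproduces the teacher's actions exactly, even though the student sees fewer observation terms.
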
The student policy only uses observations available from real sensors (see ``StudentPolicyCfg(ObsGroup)``
in `velocity_env_cfg.py <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/velocity/velocity_env_cfg.py>`__).
Specifically: **Root angular velocity** and **Projected gravity** come from the IMU sensor, **Joint positions and velocities** come from joint encoders, and **Actions** are the joint torques applied by the controller.

Run the student distillation task ``Velocity-G1-Distillation-v1`` using ``--load_run`` and ``--checkpoint`` to specify the teacher policy you want to distill from.

.. code-block:: bash

    ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task=Velocity-G1-Distillation-v1 --num_envs=4096 --headless --load_run 2025-08-13_23-53-28 --checkpoint model_1499.pt

.. note::

    Use the correct ``--load_run`` and ``--checkpoint`` to ensure you distill from the intended teacher policy.

3. Fine-tune the student policy with RL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fine-tune the distilled student policy using RL with the ``Velocity-G1-Student-Finetune-v1`` task.
Use ``--load_run`` and ``--checkpoint`` to initialize from the distilled policy.

.. code-block:: bash

    ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task=Velocity-G1-Student-Finetune-v1 --num_envs=4096 --headless --load_run 2025-08-20_16-06-52_distillation --checkpoint model_1499.pt

This starts from the distilled student policy and improves it further with RL training.

.. note::

    Make sure ``--load_run`` and ``--checkpoint`` point to the correct initial policy (usually the latest checkpoint from the distillation step).

You can replay the student policy via:

.. code-block:: bash

    ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task=Velocity-G1-Student-Finetune-v1 --num_envs=32

This exports the policy as ``.pt`` and ``.onnx`` files in the run's export directory, ready for real robot deployment.

.. _sim2sim:

Sim-to-Sim Policy Transfer
==========================

This section provides examples of sim-to-sim policy transfer using the Newton backend. Sim-to-sim transfer is an essential step before real robot deployment because it verifies that policies work across different simulators. Policies that pass sim-to-sim verification are much more likely to succeed on real robots.

Overview
--------

This guide shows how to run a PhysX-trained policy on the Newton backend. While the method works for any robot and physics engine, it has only been tested with Unitree G1, Unitree H1, and ANYmal-D robots using PhysX-trained policies.

PhysX-trained policies expect joints and links in a specific order determined by how PhysX parses the robot model. However, Newton may parse the same robot with different joint and link ordering.

In the future, we plan to solve this using a **robot schema** that standardizes joint and link ordering across different backends.

Currently, we solve this by remapping observations and actions using joint mappings defined in YAML files. These files specify joint names in both PhysX order (source) and Newton order (target). During policy execution, we use this mapping to reorder observations and actions so they work correctly with Newton.

What you need
~~~~~~~~~~~~~

- A policy checkpoint trained with PhysX (RSL-RL).
- A joint mapping YAML for your robot under ``scripts/newton_sim2sim/mappings/``.
- The provided player script: ``scripts/newton_sim2sim/rsl_rl_transfer.py``.

To add a new robot, create a YAML file with two lists where each joint name appears exactly once in both:

.. code-block:: yaml

    # Example structure
    source_joint_names:  # PhysX joint order
      - joint_1
      - joint_2
      # ...
    target_joint_names:  # Newton joint order
      - joint_1
      - joint_2
      # ...

The script automatically computes the necessary mappings for locomotion tasks.

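As a rough illustration of what such a mapping involves, the sketch below builds index permutations from two name lists and uses them to reorder joint values. It is not the implementation used by ``rsl_rl_transfer.py``; the helper names and the three-joint robot are hypothetical.

.. code-block:: python

    def build_index_map(from_names, to_names):
        """Return indices such that reordered[i] == values[index_map[i]]."""
        position = {name: i for i, name in enumerate(from_names)}
        return [position[name] for name in to_names]


    def reorder(values, index_map):
        """Apply a permutation produced by build_index_map."""
        return [values[i] for i in index_map]


    # Hypothetical joint orderings for the same robot in the two backends.
    physx_order = ["hip", "knee", "ankle"]   # source: what the policy was trained on
    newton_order = ["knee", "ankle", "hip"]  # target: what the simulator provides

    newton_to_physx = build_index_map(newton_order, physx_order)
    physx_to_newton = build_index_map(physx_order, newton_order)

    # Joint positions read from Newton are reordered before the policy sees them.
    newton_joint_pos = [0.2, 0.3, 0.1]  # knee, ankle, hip
    policy_input = reorder(newton_joint_pos, newton_to_physx)

    # Actions produced in PhysX order are reordered before being applied in Newton.
    policy_actions = [1.0, 2.0, 3.0]  # hip, knee, ankle
    newton_actions = reorder(policy_actions, physx_to_newton)

    print(policy_input, newton_actions)

Two permutations are needed because observations flow from the simulator to the policy, while actions flow in the opposite direction.
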
How to run
~~~~~~~~~~

Use this command template to run a PhysX-trained policy with Newton:

.. code-block:: bash

    ./isaaclab.sh -p scripts/newton_sim2sim/rsl_rl_transfer.py \
        --task=<TASK_ID> \
        --num_envs=32 \
        --checkpoint <PATH_TO_PHYSX_CHECKPOINT> \
        --policy_transfer_file <PATH_TO_MAPPING_YAML>

Here are examples for different robots:

1. Unitree G1

   .. code-block:: bash

       ./isaaclab.sh -p scripts/newton_sim2sim/rsl_rl_transfer.py \
           --task=Isaac-Velocity-Flat-G1-v0 \
           --num_envs=32 \
           --checkpoint <PATH_TO_PHYSX_CHECKPOINT> \
           --policy_transfer_file scripts/newton_sim2sim/mappings/sim2sim_g1.yaml

2. Unitree H1

   .. code-block:: bash

       ./isaaclab.sh -p scripts/newton_sim2sim/rsl_rl_transfer.py \
           --task=Isaac-Velocity-Flat-H1-v0 \
           --num_envs=32 \
           --checkpoint <PATH_TO_PHYSX_CHECKPOINT> \
           --policy_transfer_file scripts/newton_sim2sim/mappings/sim2sim_h1.yaml

3. ANYmal-D

   .. code-block:: bash

       ./isaaclab.sh -p scripts/newton_sim2sim/rsl_rl_transfer.py \
           --task=Isaac-Velocity-Flat-Anymal-D-v0 \
           --num_envs=32 \
           --checkpoint <PATH_TO_PHYSX_CHECKPOINT> \
           --policy_transfer_file scripts/newton_sim2sim/mappings/sim2sim_anymal_d.yaml

Notes and limitations
~~~~~~~~~~~~~~~~~~~~~

- This transfer method has only been tested with Unitree G1, Unitree H1, and ANYmal-D using PhysX-trained policies.
- The observation remapping assumes a locomotion layout with base observations followed by joint observations. For different observation layouts, you'll need to modify ``scripts/newton_sim2sim/policy_mapping.py``.
- When adding new robots or backends, make sure both source and target have identical joint names, and that the YAML lists reflect how each backend orders these joints.

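Before running a transfer with a new mapping file, it may be worth checking the two lists against the requirements above. The following is a minimal sketch, assuming the lists have already been loaded from the YAML file into Python lists; ``validate_joint_lists`` is a hypothetical helper, not part of the provided scripts.

.. code-block:: python

    def validate_joint_lists(source_names, target_names):
        """Return a list of human-readable problems; an empty list means the lists agree."""
        problems = []
        for label, names in (("source", source_names), ("target", target_names)):
            repeated = sorted({name for name in names if names.count(name) > 1})
            if repeated:
                problems.append(f"{label} list repeats: {repeated}")
        only_in_source = sorted(set(source_names) - set(target_names))
        only_in_target = sorted(set(target_names) - set(source_names))
        if only_in_source:
            problems.append(f"missing from target: {only_in_source}")
        if only_in_target:
            problems.append(f"missing from source: {only_in_target}")
        return problems


    # Hypothetical lists with one typo in the target ordering.
    source = ["hip", "knee", "ankle"]
    target = ["knee", "ankel", "hip"]
    for problem in validate_joint_lists(source, target):
        print(problem)

A check like this catches misspelled or duplicated joint names early, before they surface as a lookup error or a silently misordered policy input.
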
Solver Transitioning
====================

Transitioning to the Newton physics engine introduces new physics solvers that handle simulation using different numerical approaches.
While Newton supports several different solvers, our initial focus for Isaac Lab is on using the MuJoCo-Warp solver from Google DeepMind.

The way the physics scene itself is defined does not change - we continue to use USD as the primary way to set basic parameters of objects and robots in the scene,
and for current environments, the exact same USD files used for the PhysX-based Isaac Lab are used.
In the future, that may change, as new USD schemas are under development that capture additional physics parameters.

What does require change is the way that some solver-specific settings are configured.
Tuning these parameters can have a significant impact on both simulation performance and behaviour.

For now, we will show an example of setting these parameters to help provide a feel for these changes.
Note that the :class:`~isaaclab.sim.NewtonCfg` replaces the :class:`~isaaclab.sim.PhysxCfg` and is used to set everything related to the physical simulation parameters except for the ``dt``:

.. code-block:: python

    # SimulationCfg import path assumed to match mainline Isaac Lab.
    from isaaclab.sim import SimulationCfg
    from isaaclab.sim._impl.newton_manager_cfg import NewtonCfg
    from isaaclab.sim._impl.solvers_cfg import MJWarpSolverCfg

    # Control decimation; normally defined on the environment config (example value).
    decimation = 2

    # Solver-specific settings for the MuJoCo Warp solver.
    solver_cfg = MJWarpSolverCfg(
        nefc_per_env=35,
        ls_iterations=10,
        cone="pyramidal",
        ls_parallel=True,
        impratio=1,
    )

    # Newton-level settings that wrap the solver configuration.
    newton_cfg = NewtonCfg(
        solver_cfg=solver_cfg,
        num_substeps=1,
        debug_mode=False,
    )

    # Everything except ``dt`` is configured through ``newton_cfg``.
    sim: SimulationCfg = SimulationCfg(dt=1 / 120, render_interval=decimation, newton_cfg=newton_cfg)

Here is a very brief explanation of some of the key parameters above:

* ``nefc_per_env``: This is the size of the constraint buffer we want MuJoCo Warp to
  pre-allocate for a given environment. A large value will slow down the simulation,
  while too small a value may lead to some contacts being missed.

* ``ls_iterations``: The number of line searches performed by the MuJoCo Warp solver.
  Line searches are used to find an optimal step size, and for each solver step,
  at most ``ls_iterations`` line searches will be performed. Keeping this number low
  is important for performance. This number is also an upper bound when
  ``ls_parallel`` is not set.

* ``cone``: This parameter provides a choice between pyramidal and elliptic
  approximations for the friction cone used in contact handling. Please see the
  MuJoCo documentation for additional information on contact:
  https://mujoco.readthedocs.io/en/stable/computation/index.html#contact

* ``ls_parallel``: This switches line searches from iterative to parallel execution.
  Enabling ``ls_parallel`` provides a performance boost, but at the cost of some
  simulation stability. To ensure good simulation behaviour when enabled, a higher
  ``ls_iterations`` setting is required. Usually an increase of approximately 50%
  over the ``ls_iterations`` setting used when ``ls_parallel`` is disabled works best.

* ``impratio``: This is the frictional-to-normal constraint impedance ratio that
  enables finer-grained control of the significance of the tangential forces
  compared to the normal forces. Larger values signify more emphasis on harder
  frictional constraints to avoid slip. More on how to tune this parameter (and
  ``cone``) can be found in the MuJoCo documentation here:
  https://mujoco.readthedocs.io/en/stable/XMLreference.html#option-impratio

* ``num_substeps``: The number of substeps to perform when running the simulation.
  Setting this to a number larger than one allows the simulation to be decimated
  without requiring Isaac Lab to process data between two substeps. This can be
  of value when using implicit actuators, for example.

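The rules of thumb in the list above can be turned into a small worked example. This is only a sketch: both helper functions are hypothetical, the 50% factor is the approximate guideline quoted for ``ls_parallel``, and the substep calculation assumes that ``dt`` is divided evenly between substeps.

.. code-block:: python

    import math


    def suggest_parallel_ls_iterations(serial_ls_iterations, increase=0.5):
        """Scale a serial ``ls_iterations`` value by roughly 50% for ``ls_parallel=True``."""
        return math.ceil(serial_ls_iterations * (1.0 + increase))


    def step_durations(dt, decimation, num_substeps):
        """Return the time spans implied by the settings, in seconds."""
        return {
            "solver_substep": dt / num_substeps,  # assumption: dt is split evenly
            "physics_step": dt,
            "policy_step": dt * decimation,  # one policy action every `decimation` steps
        }


    print(suggest_parallel_ls_iterations(10))
    print(step_durations(dt=1 / 120, decimation=2, num_substeps=1))

Treat the suggested values as starting points only; the right settings depend on the robot and the task, and should be confirmed by watching simulation behaviour.
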
A more detailed transition guide covering the full set of available parameters and describing tuning approaches will follow in an upcoming release.
Training Environments
======================

To run training, we follow the standard Isaac Lab workflow. If you are new to Isaac Lab, we recommend that you review the `Quickstart Guide here <https://isaac-sim.github.io/IsaacLab/main/source/setup/quickstart.html#>`_.

The currently supported tasks are as follows:

* Isaac-Cartpole-Direct-v0
* Isaac-Ant-Direct-v0
* Isaac-Humanoid-Direct-v0
* Isaac-Velocity-Flat-Anymal-D-v0
* Isaac-Velocity-Flat-G1-v0
* Isaac-Velocity-Flat-G1-v1 (Sim-to-Real tested)
* Isaac-Velocity-Flat-H1-v0

To launch an environment and check that it loads as expected, we can start by trying it out with zero actions sent to its actuators.
This can be done as follows, where ``TASK_NAME`` is the name of the task you’d like to run, and ``NUM_ENVS`` is the number of instances of the task that you’d like to create.

.. code-block:: shell

    ./isaaclab.sh -p scripts/environments/zero_agent.py --task TASK_NAME --num_envs NUM_ENVS

For Cartpole with 128 instances, it would look like this:

.. code-block:: shell

    ./isaaclab.sh -p scripts/environments/zero_agent.py --task Isaac-Cartpole-Direct-v0 --num_envs 128

To run the same environment with random actions, we can use a different script:

.. code-block:: shell

    ./isaaclab.sh -p scripts/environments/random_agent.py --task Isaac-Cartpole-Direct-v0 --num_envs 128

To train the environment, we provide hooks to different RL frameworks. See the `Reinforcement Learning Scripts documentation <https://isaac-sim.github.io/IsaacLab/main/source/overview/reinforcement-learning/rl_existing_scripts.html>`_ for more information.

Here are some examples of how to run training with several different RL frameworks. Note that we are explicitly setting the number of environments to
4096 to benefit more from GPU parallelization. We also disable the Omniverse UI visualization to train the environment as quickly as possible by using the ``--headless`` option.

.. code-block:: shell

    ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Cartpole-Direct-v0 --num_envs 4096 --headless

.. code-block:: shell

    ./isaaclab.sh -p scripts/reinforcement_learning/skrl/train.py --task Isaac-Cartpole-Direct-v0 --num_envs 4096 --headless

.. code-block:: shell

    ./isaaclab.sh -p scripts/reinforcement_learning/rl_games/train.py --task Isaac-Cartpole-Direct-v0 --num_envs 4096 --headless

Once a policy is trained, we can visualize it by using the play scripts. But first, we need to find the checkpoint of the trained policy. Typically, these are stored under:
``logs/NAME_OF_RL_FRAMEWORK/TASK_NAME/DATE``.

For instance, with our rsl_rl example it could look like this:
``logs/rsl_rl/cartpole_direct/2025-08-21_15-45-30/model_299.pt``

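Locating the most recent checkpoint by hand can be tedious after many runs. The sketch below searches the directory layout described above using only the standard library. The helper ``latest_checkpoint`` is hypothetical, and the ``model_<iteration>.pt`` naming is assumed from the rsl_rl example; other frameworks may name their checkpoints differently.

.. code-block:: python

    import re
    from pathlib import Path


    def latest_checkpoint(log_root, framework, experiment):
        """Return the highest-iteration checkpoint of the newest run, or None."""
        experiment_dir = Path(log_root) / framework / experiment
        if not experiment_dir.is_dir():
            return None
        # Run folders start with a timestamp, so lexicographic order is chronological.
        runs = sorted(path for path in experiment_dir.iterdir() if path.is_dir())
        if not runs:
            return None

        def iteration(path):
            match = re.fullmatch(r"model_(\d+)\.pt", path.name)
            return int(match.group(1)) if match else -1

        # Compare iteration numbers numerically so model_299 ranks above model_50.
        checkpoints = [path for path in runs[-1].iterdir() if iteration(path) >= 0]
        return max(checkpoints, key=iteration, default=None)


    print(latest_checkpoint("logs", "rsl_rl", "cartpole_direct"))

The returned path can be passed to the ``--checkpoint`` option of the play script. The function returns ``None`` when no matching run or checkpoint exists.
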
To run this policy, we can use the following command. Note that we reduced the number of environments and removed the ``--headless`` option so that we can see our policy in action!

.. code-block:: shell

    ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task Isaac-Cartpole-Direct-v0 --num_envs 128 --checkpoint logs/rsl_rl/cartpole_direct/2025-08-21_15-45-30/model_299.pt

The same approach applies to all other frameworks.

Note that not all environments are supported in all frameworks. For example, several of the locomotion environments are only supported in the rsl_rl framework.