Commit 48531021 authored by oahmednv, committed by Kelly Guo

Adds a tutorial for policy inference in a prebuilt USD scene (#231)

# Description

This PR adds a tutorial that shows how to run inference with a trained
policy in a prebuilt USD scene.
It includes a script that runs inference for the
`Isaac-Velocity-Rough-H1-v0` environment in a warehouse scene and an
accompanying tutorial that walks through the script implementation.


## Type of change

- This change requires a documentation update

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

<!--
As you go through the checklist above, you can mark something as done by
putting an x character in it

For example,
- [x] I have done this task
- [ ] I have not done this task
-->

---------
Co-authored-by: Kelly Guo <kellyguo123@hotmail.com>
parent 8ff0b78a
......@@ -31,6 +31,7 @@ Guidelines for modifications:
* Nikita Rudin
* Pascal Roth
* Sheikh Dawood
* Ossama Ahmed
## Contributors
......
.. _tutorial-policy-inference-in-usd:
Policy Inference in USD Environment
===================================
.. currentmodule:: isaaclab
Having learnt how to modify a task in :ref:`tutorial-modify-direct-rl-env`, we will now look at how to run a trained policy in a prebuilt USD scene.
In this tutorial, we will use the RSL RL library and the trained policy from the Humanoid Rough Terrain ``Isaac-Velocity-Rough-H1-v0`` task in a simple warehouse USD scene.
The Tutorial Code
~~~~~~~~~~~~~~~~~
For this tutorial, we use the trained policy's checkpoint exported as a JIT (TorchScript) file, which is a self-contained
version of the policy that can be loaded without the training code.
The ``H1RoughEnvCfg_PLAY`` cfg encapsulates the configuration values of the inference environment, including the assets to
be instantiated.
To use a prebuilt USD environment instead of the terrain generator specified in that cfg, we make the
following changes to the config before passing it to the ``ManagerBasedRLEnv``.

.. dropdown:: Code for policy_inference_in_usd.py
   :icon: code

   .. literalinclude:: ../../../../scripts/tutorials/03_envs/policy_inference_in_usd.py
      :language: python
      :linenos:
      :emphasize-lines: 60-69

Note that we have set the device to ``cpu`` and disabled the use of Fabric for inference.
This is because, when simulating a small number of environments, CPU simulation can often be faster than GPU simulation.
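
The overrides described above can be gathered into one small helper. The following is a minimal sketch, not the tutorial script itself: ``apply_usd_scene_overrides`` is a hypothetical name, and the ``SimpleNamespace`` objects are stand-ins for ``H1RoughEnvCfg_PLAY`` and ``TerrainImporterCfg`` so that the snippet runs without Isaac Sim. The attribute names mirror the ones the script sets, and the ``usd_path`` value is a placeholder string.

```python
from types import SimpleNamespace


def apply_usd_scene_overrides(env_cfg, terrain_cfg, device="cpu", use_fabric=False):
    """Adapt a play config for single-robot inference in a prebuilt USD scene.

    Hypothetical helper: it sets the same attributes as the tutorial script.
    """
    env_cfg.scene.num_envs = 1           # one robot is enough for inference
    env_cfg.curriculum = None            # disable the curriculum, as the script does
    env_cfg.scene.terrain = terrain_cfg  # replace the terrain generator with the USD scene
    env_cfg.sim.device = device          # the tutorial script uses "cpu"
    env_cfg.sim.use_fabric = use_fabric  # the tutorial script disables Fabric
    return env_cfg


# Stand-ins for the real config classes (assumption: only these attributes matter here).
env_cfg = SimpleNamespace(
    scene=SimpleNamespace(num_envs=50, terrain="terrain-generator"),
    curriculum=SimpleNamespace(),
    sim=SimpleNamespace(device="cuda:0", use_fabric=True),
)
terrain_cfg = SimpleNamespace(
    prim_path="/World/ground",
    terrain_type="usd",
    usd_path="ISAAC_NUCLEUS_DIR/Environments/Simple_Warehouse/warehouse.usd",  # placeholder
)

apply_usd_scene_overrides(env_cfg, terrain_cfg)
print(env_cfg.scene.num_envs, env_cfg.sim.device, env_cfg.sim.use_fabric)
```

With the real classes, the same function could be called with an ``H1RoughEnvCfg_PLAY()`` instance and a ``TerrainImporterCfg`` before constructing ``ManagerBasedRLEnv``. Keeping ``device`` and ``use_fabric`` as parameters makes it easy to switch back to GPU simulation when running many environments.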
The Code Execution
~~~~~~~~~~~~~~~~~~
First, we need to train the ``Isaac-Velocity-Rough-H1-v0`` task by running the following:

.. code-block:: bash

   ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Velocity-Rough-H1-v0 --headless

When the training is finished, we can visualize the result with the following command.
To stop the simulation, you can either close the window, or press ``Ctrl+C`` in the terminal
where you started the simulation.

.. code-block:: bash

   ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task Isaac-Velocity-Rough-H1-v0 --num_envs 64 --checkpoint logs/rsl_rl/h1_rough/EXPERIMENT_NAME/POLICY_FILE.pt

After running the play script, the policy will be exported to JIT and ONNX files under the experiment logs directory.
Note that not all learning libraries support exporting the policy to a JIT or ONNX file.
For libraries that don't currently support this functionality, please refer to that library's ``play.py`` script
to learn how to initialize the policy.
We can then load the warehouse asset and run inference on the H1 robot using the exported jit policy.

.. code-block:: bash

   ./isaaclab.sh -p scripts/tutorials/03_envs/policy_inference_in_usd.py --checkpoint logs/rsl_rl/h1_rough/EXPERIMENT_NAME/exported/policy.pt

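
The relationship between the training checkpoint passed to ``play.py`` and the exported policy passed to the tutorial script can be written down as a small path helper. This is only a sketch: ``exported_policy_path`` is a hypothetical name, and it assumes the layout that appears in the commands of this tutorial, with an ``exported/policy.pt`` file next to the checkpoints of a run.

```python
from pathlib import Path


def exported_policy_path(checkpoint, filename="policy.pt"):
    """Return the assumed location of the exported JIT policy for a checkpoint.

    Assumes the layout used in this tutorial: <run dir>/exported/policy.pt,
    where <run dir> is the directory holding the training checkpoints.
    """
    run_dir = Path(checkpoint).parent
    return run_dir / "exported" / filename


path = exported_policy_path("logs/rsl_rl/h1_rough/EXPERIMENT_NAME/POLICY_FILE.pt")
print(path.as_posix())
```

Checking ``path.is_file()`` before launching the simulator is a cheap way to catch a wrong experiment name or a run for which ``play.py`` has not been executed yet.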

.. figure:: ../../_static/tutorials/tutorial_policy_inference_in_usd.jpg
   :align: center
   :figwidth: 100%
   :alt: result of policy inference for the Isaac-Velocity-Rough-H1-v0 task in a warehouse scene

In this tutorial, we learnt how to make minor modifications to an existing environment config to run policy inference in a prebuilt USD environment.
......@@ -76,6 +76,7 @@ different aspects of the framework to create a simulation environment for agent
03_envs/register_rl_env_gym
03_envs/run_rl_training
03_envs/modify_direct_rl_env
03_envs/policy_inference_in_usd
Integrating Sensors
-------------------
......
# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""
This script demonstrates policy inference in a prebuilt USD environment.
In this example, we use a locomotion policy to control the H1 robot. The robot was trained
using Isaac-Velocity-Rough-H1-v0. The robot is commanded to move forward at a constant velocity.
.. code-block:: bash

    # Run the script
    ./isaaclab.sh -p scripts/tutorials/03_envs/policy_inference_in_usd.py --checkpoint /path/to/jit/checkpoint.pt

"""
"""Launch Isaac Sim Simulator first."""
import argparse
from isaaclab.app import AppLauncher
# add argparse arguments
parser = argparse.ArgumentParser(description="Tutorial on inferencing a policy on an H1 robot in a warehouse.")
parser.add_argument("--checkpoint", type=str, help="Path to model checkpoint exported as jit.", required=True)
# append AppLauncher cli args
AppLauncher.add_app_launcher_args(parser)
# parse the arguments
args_cli = parser.parse_args()
# launch omniverse app
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app
"""Rest everything follows."""
import io
import os
import torch
import omni
from isaaclab.envs import ManagerBasedRLEnv
from isaaclab.terrains import TerrainImporterCfg
from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR
from isaaclab_tasks.manager_based.locomotion.velocity.config.h1.rough_env_cfg import H1RoughEnvCfg_PLAY
def main():
    """Main function."""
    # load the trained jit policy
    policy_path = os.path.abspath(args_cli.checkpoint)
    file_content = omni.client.read_file(policy_path)[2]
    file = io.BytesIO(memoryview(file_content).tobytes())
    policy = torch.jit.load(file)
    env_cfg = H1RoughEnvCfg_PLAY()
    env_cfg.scene.num_envs = 1
    env_cfg.curriculum = None
    env_cfg.scene.terrain = TerrainImporterCfg(
        prim_path="/World/ground",
        terrain_type="usd",
        usd_path=f"{ISAAC_NUCLEUS_DIR}/Environments/Simple_Warehouse/warehouse.usd",
    )
    env_cfg.sim.device = "cpu"
    env_cfg.sim.use_fabric = False
    env = ManagerBasedRLEnv(cfg=env_cfg)
    obs, _ = env.reset()
    while simulation_app.is_running():
        action = policy(obs["policy"])  # run inference
        obs, _, _, _, _ = env.step(action)


if __name__ == "__main__":
    main()
    simulation_app.close()