Commit 8ff0b78a authored by Toni-SM's avatar Toni-SM Committed by Kelly Guo

Adds humanoid AMP tasks for direct workflow (#227)

# Description

This change adds 3 additional Humanoid AMP tasks:

- Isaac-Humanoid-AMP-Dance-Direct-v0
- Isaac-Humanoid-AMP-Run-Direct-v0
- Isaac-Humanoid-AMP-Walk-Direct-v0

In addition, SKRL dependency is updated from 1.3 to 1.4.

## Type of change

<!-- As you go through the list, delete the ones that are not
applicable. -->

- New feature (non-breaking change which adds functionality)
- This change requires a documentation update

## Screenshots

Please attach before and after screenshots of the change if applicable.

<!--
Example:

| Before | After |
| ------ | ----- |
| _gif/png before_ | _gif/png after_ |

To upload images to a PR -- simply drag and drop an image while in edit
mode and it should upload the image directly. You can then paste that
source into the above before/after sections.
-->

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

<!--
As you go through the checklist above, you can mark something as done by
putting an x character in it

For example,
- [x] I have done this task
- [ ] I have not done this task
-->

---------
Signed-off-by: 's avatarpeterd-NV <peterd@nvidia.com>
Signed-off-by: 's avatarKelly Guo <kellyg@nvidia.com>
Signed-off-by: 's avatarKelly Guo <kellyguo123@hotmail.com>
Signed-off-by: 's avatarAshwin Varghese Kuruttukulam <123109010+ashwinvkNV@users.noreply.github.com>
Co-authored-by: 's avatarpeterd-NV <peterd@nvidia.com>
Co-authored-by: 's avatarCY Chen <cyc@nvidia.com>
Co-authored-by: 's avataroahmednv <oahmed@Nvidia.com>
Co-authored-by: 's avatarKelly Guo <kellyg@nvidia.com>
Co-authored-by: 's avatarKelly Guo <kellyguo123@hotmail.com>
Co-authored-by: 's avatarrwiltz <165190220+rwiltz@users.noreply.github.com>
Co-authored-by: 's avatarnv-cupright <92540563+nv-cupright@users.noreply.github.com>
Co-authored-by: 's avatarAlexander Poddubny <143108850+nv-apoddubny@users.noreply.github.com>
Co-authored-by: 's avatarchengronglai <chengrongl@nvidia.com>
Co-authored-by: 's avatarDavid Hoeller <dhoeller@nvidia.com>
Co-authored-by: 's avatarmatthewtrepte <mtrepte@nvidia.com>
Co-authored-by: 's avatarAshwin Varghese Kuruttukulam <123109010+ashwinvkNV@users.noreply.github.com>
Co-authored-by: 's avatarKarsten Patzwaldt <kpatzwaldt@nvidia.com>
parent aecf9afd
...@@ -306,16 +306,25 @@ Others ...@@ -306,16 +306,25 @@ Others
.. table:: .. table::
:widths: 33 37 30 :widths: 33 37 30
+----------------+---------------------+-----------------------------------------------------------------------------+ +----------------+---------------------------+-----------------------------------------------------------------------------+
| World | Environment ID | Description | | World | Environment ID | Description |
+================+=====================+=============================================================================+ +================+===========================+=============================================================================+
| |quadcopter| | |quadcopter-link| | Fly and hover the Crazyflie copter at a goal point by applying thrust. | | |quadcopter| | |quadcopter-link| | Fly and hover the Crazyflie copter at a goal point by applying thrust. |
+----------------+---------------------+-----------------------------------------------------------------------------+ +----------------+---------------------------+-----------------------------------------------------------------------------+
| |humanoid_amp| | |humanoid_amp_dance-link| | Move a humanoid robot by imitating different pre-recorded human animations |
| | | (Adversarial Motion Priors). |
| | |humanoid_amp_run-link| | |
| | | |
| | |humanoid_amp_walk-link| | |
+----------------+---------------------------+-----------------------------------------------------------------------------+
.. |quadcopter-link| replace:: `Isaac-Quadcopter-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/quadcopter/quadcopter_env.py>`__ .. |quadcopter-link| replace:: `Isaac-Quadcopter-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/quadcopter/quadcopter_env.py>`__
.. |humanoid_amp_dance-link| replace:: `Isaac-Humanoid-AMP-Dance-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env_cfg.py>`__
.. |humanoid_amp_run-link| replace:: `Isaac-Humanoid-AMP-Run-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env_cfg.py>`__
.. |humanoid_amp_walk-link| replace:: `Isaac-Humanoid-AMP-Walk-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env_cfg.py>`__
.. |quadcopter| image:: ../_static/tasks/others/quadcopter.jpg .. |quadcopter| image:: ../_static/tasks/others/quadcopter.jpg
.. |humanoid_amp| image:: ../_static/tasks/others/humanoid_amp.jpg
Multi-agent Multi-agent
...@@ -385,11 +394,15 @@ Comprehensive List of Environments ...@@ -385,11 +394,15 @@ Comprehensive List of Environments
* - Isaac-Cart-Double-Pendulum-Direct-v0 * - Isaac-Cart-Double-Pendulum-Direct-v0
- -
- Direct - Direct
- **rl_games** (PPO), **skrl** (IPPO, MAPPO, PPO) - **rl_games** (PPO), **skrl** (IPPO, PPO, MAPPO)
* - Isaac-Cartpole-Depth-Camera-Direct-v0 * - Isaac-Cartpole-Depth-Camera-Direct-v0
- -
- Direct - Direct
- **rl_games** (PPO), **skrl** (PPO) - **rl_games** (PPO), **skrl** (PPO)
* - Isaac-Cartpole-Depth-v0
-
- Manager Based
- **rl_games** (PPO)
* - Isaac-Cartpole-Direct-v0 * - Isaac-Cartpole-Direct-v0
- -
- Direct - Direct
...@@ -398,6 +411,18 @@ Comprehensive List of Environments ...@@ -398,6 +411,18 @@ Comprehensive List of Environments
- -
- Direct - Direct
- **rl_games** (PPO), **skrl** (PPO) - **rl_games** (PPO), **skrl** (PPO)
* - Isaac-Cartpole-RGB-ResNet18-v0
-
- Manager Based
- **rl_games** (PPO)
* - Isaac-Cartpole-RGB-TheiaTiny-v0
-
- Manager Based
- **rl_games** (PPO)
* - Isaac-Cartpole-RGB-v0
-
- Manager Based
- **rl_games** (PPO)
* - Isaac-Cartpole-v0 * - Isaac-Cartpole-v0
- -
- Manager Based - Manager Based
...@@ -418,6 +443,18 @@ Comprehensive List of Environments ...@@ -418,6 +443,18 @@ Comprehensive List of Environments
- -
- Direct - Direct
- **rl_games** (PPO), **rsl_rl** (PPO), **skrl** (PPO) - **rl_games** (PPO), **rsl_rl** (PPO), **skrl** (PPO)
* - Isaac-Humanoid-AMP-Dance-Direct-v0
-
- Direct
- **skrl** (AMP)
* - Isaac-Humanoid-AMP-Run-Direct-v0
-
- Direct
- **skrl** (AMP)
* - Isaac-Humanoid-AMP-Walk-Direct-v0
-
- Direct
- **skrl** (AMP)
* - Isaac-Humanoid-Direct-v0 * - Isaac-Humanoid-Direct-v0
- -
- Direct - Direct
...@@ -437,7 +474,11 @@ Comprehensive List of Environments ...@@ -437,7 +474,11 @@ Comprehensive List of Environments
* - Isaac-Lift-Cube-Franka-v0 * - Isaac-Lift-Cube-Franka-v0
- Isaac-Lift-Cube-Franka-Play-v0 - Isaac-Lift-Cube-Franka-Play-v0
- Manager Based - Manager Based
- **rsl_rl** (PPO), **skrl** (PPO), **rl_games** (PPO) - **rsl_rl** (PPO), **skrl** (PPO), **rl_games** (PPO), **sb3** (PPO)
* - Isaac-Lift-Teddy-Bear-Franka-IK-Abs-v0
-
- Manager Based
-
* - Isaac-Navigation-Flat-Anymal-C-v0 * - Isaac-Navigation-Flat-Anymal-C-v0
- Isaac-Navigation-Flat-Anymal-C-Play-v0 - Isaac-Navigation-Flat-Anymal-C-Play-v0
- Manager Based - Manager Based
...@@ -509,7 +550,23 @@ Comprehensive List of Environments ...@@ -509,7 +550,23 @@ Comprehensive List of Environments
* - Isaac-Shadow-Hand-Over-Direct-v0 * - Isaac-Shadow-Hand-Over-Direct-v0
- -
- Direct - Direct
- **rl_games** (PPO), **skrl** (IPPO, MAPPO, PPO) - **rl_games** (PPO), **skrl** (IPPO, PPO, MAPPO)
* - Isaac-Stack-Cube-Franka-IK-Rel-v0
-
- Manager Based
-
* - Isaac-Stack-Cube-Franka-v0
-
- Manager Based
-
* - Isaac-Stack-Cube-Instance-Randomize-Franka-IK-Rel-v0
-
- Manager Based
-
* - Isaac-Stack-Cube-Instance-Randomize-Franka-v0
-
- Manager Based
-
* - Isaac-Velocity-Flat-Anymal-B-v0 * - Isaac-Velocity-Flat-Anymal-B-v0
- Isaac-Velocity-Flat-Anymal-B-Play-v0 - Isaac-Velocity-Flat-Anymal-B-Play-v0
- Manager Based - Manager Based
......
...@@ -42,9 +42,10 @@ parser.add_argument( ...@@ -42,9 +42,10 @@ parser.add_argument(
"--algorithm", "--algorithm",
type=str, type=str,
default="PPO", default="PPO",
choices=["PPO", "IPPO", "MAPPO"], choices=["AMP", "PPO", "IPPO", "MAPPO"],
help="The RL algorithm used for training the skrl agent.", help="The RL algorithm used for training the skrl agent.",
) )
parser.add_argument("--real-time", action="store_true", default=False, help="Run in real-time, if possible.")
# append AppLauncher cli args # append AppLauncher cli args
AppLauncher.add_app_launcher_args(parser) AppLauncher.add_app_launcher_args(parser)
...@@ -61,13 +62,14 @@ simulation_app = app_launcher.app ...@@ -61,13 +62,14 @@ simulation_app = app_launcher.app
import gymnasium as gym import gymnasium as gym
import os import os
import time
import torch import torch
import skrl import skrl
from packaging import version from packaging import version
# check for minimum supported skrl version # check for minimum supported skrl version
SKRL_VERSION = "1.3.0" SKRL_VERSION = "1.4.0"
if version.parse(skrl.__version__) < version.parse(SKRL_VERSION): if version.parse(skrl.__version__) < version.parse(SKRL_VERSION):
skrl.logger.error( skrl.logger.error(
f"Unsupported skrl version: {skrl.__version__}. " f"Unsupported skrl version: {skrl.__version__}. "
...@@ -133,6 +135,12 @@ def main(): ...@@ -133,6 +135,12 @@ def main():
if isinstance(env.unwrapped, DirectMARLEnv) and algorithm in ["ppo"]: if isinstance(env.unwrapped, DirectMARLEnv) and algorithm in ["ppo"]:
env = multi_agent_to_single_agent(env) env = multi_agent_to_single_agent(env)
# get environment (physics) dt for real-time evaluation
try:
dt = env.physics_dt
except AttributeError:
dt = env.unwrapped.physics_dt
# wrap for video recording # wrap for video recording
if args_cli.video: if args_cli.video:
video_kwargs = { video_kwargs = {
...@@ -165,18 +173,26 @@ def main(): ...@@ -165,18 +173,26 @@ def main():
timestep = 0 timestep = 0
# simulate environment # simulate environment
while simulation_app.is_running(): while simulation_app.is_running():
start_time = time.time()
# run everything in inference mode # run everything in inference mode
with torch.inference_mode(): with torch.inference_mode():
# agent stepping # agent stepping
actions = runner.agent.act(obs, timestep=0, timesteps=0)[0] outputs = runner.agent.act(obs, timestep=0, timesteps=0)
actions = outputs[-1].get("mean_actions", outputs[0])
# env stepping # env stepping
obs, _, _, _, _ = env.step(actions) obs, _, _, _, _ = env.step(actions)
if args_cli.video: if args_cli.video:
timestep += 1 timestep += 1
# Exit the play loop after recording one video # exit the play loop after recording one video
if timestep == args_cli.video_length: if timestep == args_cli.video_length:
break break
# time delay for real-time evaluation
sleep_time = dt - (time.time() - start_time)
if args_cli.real_time and sleep_time > 0:
time.sleep(sleep_time)
# close the simulator # close the simulator
env.close() env.close()
......
...@@ -40,7 +40,7 @@ parser.add_argument( ...@@ -40,7 +40,7 @@ parser.add_argument(
"--algorithm", "--algorithm",
type=str, type=str,
default="PPO", default="PPO",
choices=["PPO", "IPPO", "MAPPO"], choices=["AMP", "PPO", "IPPO", "MAPPO"],
help="The RL algorithm used for training the skrl agent.", help="The RL algorithm used for training the skrl agent.",
) )
...@@ -70,7 +70,7 @@ import skrl ...@@ -70,7 +70,7 @@ import skrl
from packaging import version from packaging import version
# check for minimum supported skrl version # check for minimum supported skrl version
SKRL_VERSION = "1.3.0" SKRL_VERSION = "1.4.0"
if version.parse(skrl.__version__) < version.parse(SKRL_VERSION): if version.parse(skrl.__version__) < version.parse(SKRL_VERSION):
skrl.logger.error( skrl.logger.error(
f"Unsupported skrl version: {skrl.__version__}. " f"Unsupported skrl version: {skrl.__version__}. "
......
[package] [package]
# Semantic Versioning is used: https://semver.org/ # Semantic Versioning is used: https://semver.org/
version = "0.2.0" version = "0.2.1"
# Description # Description
title = "Isaac Lab Assets" title = "Isaac Lab Assets"
......
Changelog Changelog
--------- ---------
0.2.1 (2025-01-14)
~~~~~~~~~~~~~~~~~~
Added
^^^^^
* Added configuration for the Humanoid-28 robot.
0.2.0 (2024-12-27) 0.2.0 (2024-12-27)
~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~
......
...@@ -14,6 +14,7 @@ from .cart_double_pendulum import * ...@@ -14,6 +14,7 @@ from .cart_double_pendulum import *
from .cartpole import * from .cartpole import *
from .franka import * from .franka import *
from .humanoid import * from .humanoid import *
from .humanoid_28 import *
from .kinova import * from .kinova import *
from .quadcopter import * from .quadcopter import *
from .ridgeback_franka import * from .ridgeback_franka import *
......
# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""Configuration for the 28-DOFs Mujoco Humanoid robot."""
from __future__ import annotations
import isaaclab.sim as sim_utils
from isaaclab.actuators import ImplicitActuatorCfg
from isaaclab.assets import ArticulationCfg
from isaaclab.utils.assets import ISAACLAB_NUCLEUS_DIR
##
# Configuration
##
HUMANOID_28_CFG = ArticulationCfg(
prim_path="{ENV_REGEX_NS}/Robot",
spawn=sim_utils.UsdFileCfg(
usd_path=f"{ISAACLAB_NUCLEUS_DIR}/Robots/Classic/Humanoid28/humanoid_28.usd",
rigid_props=sim_utils.RigidBodyPropertiesCfg(
disable_gravity=None,
max_depenetration_velocity=10.0,
enable_gyroscopic_forces=True,
),
articulation_props=sim_utils.ArticulationRootPropertiesCfg(
enabled_self_collisions=True,
solver_position_iteration_count=4,
solver_velocity_iteration_count=0,
sleep_threshold=0.005,
stabilization_threshold=0.001,
),
copy_from_source=False,
),
init_state=ArticulationCfg.InitialStateCfg(
pos=(0.0, 0.0, 0.8),
joint_pos={".*": 0.0},
),
actuators={
"body": ImplicitActuatorCfg(
joint_names_expr=[".*"],
stiffness=None,
damping=None,
),
},
)
"""Configuration for the 28-DOFs Mujoco Humanoid robot."""
...@@ -42,7 +42,7 @@ PYTORCH_INDEX_URL = ["https://download.pytorch.org/whl/cu118"] ...@@ -42,7 +42,7 @@ PYTORCH_INDEX_URL = ["https://download.pytorch.org/whl/cu118"]
# Extra dependencies for RL agents # Extra dependencies for RL agents
EXTRAS_REQUIRE = { EXTRAS_REQUIRE = {
"sb3": ["stable-baselines3>=2.1"], "sb3": ["stable-baselines3>=2.1"],
"skrl": ["skrl>=1.3.0"], "skrl": ["skrl>=1.4.0"],
"rl-games": ["rl-games==1.6.1", "gym"], # rl-games still needs gym :( "rl-games": ["rl-games==1.6.1", "gym"], # rl-games still needs gym :(
"rsl-rl": ["rsl-rl@git+https://github.com/leggedrobotics/rsl_rl.git"], "rsl-rl": ["rsl-rl@git+https://github.com/leggedrobotics/rsl_rl.git"],
} }
......
[package] [package]
# Note: Semantic Versioning is used: https://semver.org/ # Note: Semantic Versioning is used: https://semver.org/
version = "0.10.21" version = "0.10.22"
# Description # Description
title = "Isaac Lab Environments" title = "Isaac Lab Environments"
......
Changelog Changelog
--------- ---------
0.10.22 (2025-01-14)
~~~~~~~~~~~~~~~~~~~
Added
^^^^^
* Added ``Isaac-Humanoid-AMP-Dance-Direct-v0``, ``Isaac-Humanoid-AMP-Run-Direct-v0`` and ``Isaac-Humanoid-AMP-Walk-Direct-v0``
environments as a direct RL env that implements the Humanoid AMP task.
0.10.21 (2025-01-03) 0.10.21 (2025-01-03)
~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~
......
# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""
AMP Humanoid locomotion environment.
"""
import gymnasium as gym
from . import agents
##
# Register Gym environments.
##
gym.register(
id="Isaac-Humanoid-AMP-Dance-Direct-v0",
entry_point=f"{__name__}.humanoid_amp_env:HumanoidAmpEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.humanoid_amp_env_cfg:HumanoidAmpDanceEnvCfg",
"skrl_amp_cfg_entry_point": f"{agents.__name__}:skrl_dance_amp_cfg.yaml",
},
)
gym.register(
id="Isaac-Humanoid-AMP-Run-Direct-v0",
entry_point=f"{__name__}.humanoid_amp_env:HumanoidAmpEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.humanoid_amp_env_cfg:HumanoidAmpRunEnvCfg",
"skrl_amp_cfg_entry_point": f"{agents.__name__}:skrl_run_amp_cfg.yaml",
},
)
gym.register(
id="Isaac-Humanoid-AMP-Walk-Direct-v0",
entry_point=f"{__name__}.humanoid_amp_env:HumanoidAmpEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.humanoid_amp_env_cfg:HumanoidAmpWalkEnvCfg",
"skrl_amp_cfg_entry_point": f"{agents.__name__}:skrl_walk_amp_cfg.yaml",
},
)
# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
seed: 42
# Models are instantiated using skrl's model instantiator utility
# https://skrl.readthedocs.io/en/latest/api/utils/model_instantiators.html
models:
separate: True
policy: # see gaussian_model parameters
class: GaussianMixin
clip_actions: False
clip_log_std: True
min_log_std: -20.0
max_log_std: 2.0
initial_log_std: -2.9
fixed_log_std: True
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ACTIONS
value: # see deterministic_model parameters
class: DeterministicMixin
clip_actions: False
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ONE
discriminator: # see deterministic_model parameters
class: DeterministicMixin
clip_actions: False
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ONE
# Rollout memory
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
memory:
class: RandomMemory
memory_size: -1 # automatically determined (same as agent:rollouts)
# AMP memory (reference motion dataset)
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
motion_dataset:
class: RandomMemory
memory_size: 200000
# AMP memory (preventing discriminator overfitting)
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
reply_buffer:
class: RandomMemory
memory_size: 1000000
# AMP agent configuration (field names are from AMP_DEFAULT_CONFIG)
# https://skrl.readthedocs.io/en/latest/api/agents/amp.html
agent:
class: AMP
rollouts: 16
learning_epochs: 6
mini_batches: 2
discount_factor: 0.99
lambda: 0.95
learning_rate: 5.0e-05
learning_rate_scheduler: null
learning_rate_scheduler_kwargs: null
state_preprocessor: RunningStandardScaler
state_preprocessor_kwargs: null
value_preprocessor: RunningStandardScaler
value_preprocessor_kwargs: null
amp_state_preprocessor: RunningStandardScaler
amp_state_preprocessor_kwargs: null
random_timesteps: 0
learning_starts: 0
grad_norm_clip: 0.0
ratio_clip: 0.2
value_clip: 0.2
clip_predicted_values: True
entropy_loss_scale: 0.0
value_loss_scale: 2.5
discriminator_loss_scale: 5.0
amp_batch_size: 512
task_reward_weight: 0.0
style_reward_weight: 1.0
discriminator_batch_size: 4096
discriminator_reward_scale: 2.0
discriminator_logit_regularization_scale: 0.05
discriminator_gradient_penalty_scale: 5.0
discriminator_weight_decay_scale: 1.0e-04
# rewards_shaper_scale: 1.0
time_limit_bootstrap: False
# logging and checkpoint
experiment:
directory: "humanoid_amp_dance"
experiment_name: ""
write_interval: auto
checkpoint_interval: auto
# Sequential trainer
# https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html
trainer:
class: SequentialTrainer
timesteps: 80000
environment_info: log
seed: 42
# Models are instantiated using skrl's model instantiator utility
# https://skrl.readthedocs.io/en/latest/api/utils/model_instantiators.html
models:
separate: True
policy: # see gaussian_model parameters
class: GaussianMixin
clip_actions: False
clip_log_std: True
min_log_std: -20.0
max_log_std: 2.0
initial_log_std: -2.9
fixed_log_std: True
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ACTIONS
value: # see deterministic_model parameters
class: DeterministicMixin
clip_actions: False
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ONE
discriminator: # see deterministic_model parameters
class: DeterministicMixin
clip_actions: False
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ONE
# Rollout memory
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
memory:
class: RandomMemory
memory_size: -1 # automatically determined (same as agent:rollouts)
# AMP memory (reference motion dataset)
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
motion_dataset:
class: RandomMemory
memory_size: 200000
# AMP memory (preventing discriminator overfitting)
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
reply_buffer:
class: RandomMemory
memory_size: 1000000
# AMP agent configuration (field names are from AMP_DEFAULT_CONFIG)
# https://skrl.readthedocs.io/en/latest/api/agents/amp.html
agent:
class: AMP
rollouts: 16
learning_epochs: 6
mini_batches: 2
discount_factor: 0.99
lambda: 0.95
learning_rate: 5.0e-05
learning_rate_scheduler: null
learning_rate_scheduler_kwargs: null
state_preprocessor: RunningStandardScaler
state_preprocessor_kwargs: null
value_preprocessor: RunningStandardScaler
value_preprocessor_kwargs: null
amp_state_preprocessor: RunningStandardScaler
amp_state_preprocessor_kwargs: null
random_timesteps: 0
learning_starts: 0
grad_norm_clip: 0.0
ratio_clip: 0.2
value_clip: 0.2
clip_predicted_values: True
entropy_loss_scale: 0.0
value_loss_scale: 2.5
discriminator_loss_scale: 5.0
amp_batch_size: 512
task_reward_weight: 0.0
style_reward_weight: 1.0
discriminator_batch_size: 4096
discriminator_reward_scale: 2.0
discriminator_logit_regularization_scale: 0.05
discriminator_gradient_penalty_scale: 5.0
discriminator_weight_decay_scale: 1.0e-04
# rewards_shaper_scale: 1.0
time_limit_bootstrap: False
# logging and checkpoint
experiment:
directory: "humanoid_amp_run"
experiment_name: ""
write_interval: auto
checkpoint_interval: auto
# Sequential trainer
# https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html
trainer:
class: SequentialTrainer
timesteps: 80000
environment_info: log
seed: 42
# Models are instantiated using skrl's model instantiator utility
# https://skrl.readthedocs.io/en/latest/api/utils/model_instantiators.html
models:
separate: True
policy: # see gaussian_model parameters
class: GaussianMixin
clip_actions: False
clip_log_std: True
min_log_std: -20.0
max_log_std: 2.0
initial_log_std: -2.9
fixed_log_std: True
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ACTIONS
value: # see deterministic_model parameters
class: DeterministicMixin
clip_actions: False
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ONE
discriminator: # see deterministic_model parameters
class: DeterministicMixin
clip_actions: False
network:
- name: net
input: STATES
layers: [1024, 512]
activations: relu
output: ONE
# Rollout memory
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
memory:
class: RandomMemory
memory_size: -1 # automatically determined (same as agent:rollouts)
# AMP memory (reference motion dataset)
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
motion_dataset:
class: RandomMemory
memory_size: 200000
# AMP memory (preventing discriminator overfitting)
# https://skrl.readthedocs.io/en/latest/api/memories/random.html
reply_buffer:
class: RandomMemory
memory_size: 1000000
# AMP agent configuration (field names are from AMP_DEFAULT_CONFIG)
# https://skrl.readthedocs.io/en/latest/api/agents/amp.html
agent:
class: AMP
rollouts: 16
learning_epochs: 6
mini_batches: 2
discount_factor: 0.99
lambda: 0.95
learning_rate: 5.0e-05
learning_rate_scheduler: null
learning_rate_scheduler_kwargs: null
state_preprocessor: RunningStandardScaler
state_preprocessor_kwargs: null
value_preprocessor: RunningStandardScaler
value_preprocessor_kwargs: null
amp_state_preprocessor: RunningStandardScaler
amp_state_preprocessor_kwargs: null
random_timesteps: 0
learning_starts: 0
grad_norm_clip: 0.0
ratio_clip: 0.2
value_clip: 0.2
clip_predicted_values: True
entropy_loss_scale: 0.0
value_loss_scale: 2.5
discriminator_loss_scale: 5.0
amp_batch_size: 512
task_reward_weight: 0.0
style_reward_weight: 1.0
discriminator_batch_size: 4096
discriminator_reward_scale: 2.0
discriminator_logit_regularization_scale: 0.05
discriminator_gradient_penalty_scale: 5.0
discriminator_weight_decay_scale: 1.0e-04
# rewards_shaper_scale: 1.0
time_limit_bootstrap: False
# logging and checkpoint
experiment:
directory: "humanoid_amp_walk"
experiment_name: ""
write_interval: auto
checkpoint_interval: auto
# Sequential trainer
# https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html
trainer:
class: SequentialTrainer
timesteps: 80000
environment_info: log
# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
import os
from dataclasses import MISSING
from isaaclab_assets import HUMANOID_28_CFG
from isaaclab.actuators import ImplicitActuatorCfg
from isaaclab.assets import ArticulationCfg
from isaaclab.envs import DirectRLEnvCfg
from isaaclab.scene import InteractiveSceneCfg
from isaaclab.sim import PhysxCfg, SimulationCfg
from isaaclab.utils import configclass
MOTIONS_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "motions")
@configclass
class HumanoidAmpEnvCfg(DirectRLEnvCfg):
"""Humanoid AMP environment config (base class)."""
# env
episode_length_s = 10.0
decimation = 2
# spaces
observation_space = 81
action_space = 28
state_space = 0
num_amp_observations = 2
amp_observation_space = 81
early_termination = True
termination_height = 0.5
motion_file: str = MISSING
reference_body = "torso"
reset_strategy = "random" # default, random, random-start
"""Strategy to be followed when resetting each environment (humanoid's pose and joint states).
* default: pose and joint states are set to the initial state of the asset.
* random: pose and joint states are set by sampling motions at random, uniform times.
* random-start: pose and joint states are set by sampling motion at the start (time zero).
"""
# simulation
sim: SimulationCfg = SimulationCfg(
dt=1 / 60,
render_interval=decimation,
physx=PhysxCfg(
gpu_found_lost_pairs_capacity=2**23,
gpu_total_aggregate_pairs_capacity=2**23,
),
)
# scene
scene: InteractiveSceneCfg = InteractiveSceneCfg(num_envs=4096, env_spacing=10.0, replicate_physics=True)
# robot
robot: ArticulationCfg = HUMANOID_28_CFG.replace(prim_path="/World/envs/env_.*/Robot").replace(
actuators={
"body": ImplicitActuatorCfg(
joint_names_expr=[".*"],
velocity_limit=100.0,
stiffness=None,
damping=None,
),
},
)
@configclass
class HumanoidAmpDanceEnvCfg(HumanoidAmpEnvCfg):
motion_file = os.path.join(MOTIONS_DIR, "humanoid_dance.npz")
@configclass
class HumanoidAmpRunEnvCfg(HumanoidAmpEnvCfg):
motion_file = os.path.join(MOTIONS_DIR, "humanoid_run.npz")
@configclass
class HumanoidAmpWalkEnvCfg(HumanoidAmpEnvCfg):
motion_file = os.path.join(MOTIONS_DIR, "humanoid_walk.npz")
# Motion files
The motion files are in NumPy-file format that contains data from the skeleton DOFs and bodies that perform the motion.
The data (accessed by key) is described in the following table, where:
* `N` is the number of motion frames recorded
* `D` is the number of skeleton DOFs
* `B` is the number of skeleton bodies
| Key | Dtype | Shape | Description |
| --- | ---- | ----- | ----------- |
| `fps` | int64 | () | FPS at which motion was sampled |
| `dof_names` | unicode string | (D,) | Skeleton DOF names |
| `body_names` | unicode string | (B,) | Skeleton body names |
| `dof_positions` | float32 | (N, D) | Skeleton DOF positions |
| `dof_velocities` | float32 | (N, D) | Skeleton DOF velocities |
| `body_positions` | float32 | (N, B, 3) | Skeleton body positions |
| `body_rotations` | float32 | (N, B, 4) | Skeleton body rotations (as `wxyz` quaternion) |
| `body_linear_velocities` | float32 | (N, B, 3) | Skeleton body linear velocities |
| `body_angular_velocities` | float32 | (N, B, 3) | Skeleton body angular velocities |
## Motion visualization
The `motion_viewer.py` file allows to visualize the skeleton motion recorded in a motion file.
Open an terminal in the `motions` folder and run the following command.
```bash
python motion_viewer.py --file MOTION_FILE_NAME.npz
```
See `python motion_viewer.py --help` for available arguments.
# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""
AMP Motion Loader and motion files.
"""
from .motion_loader import MotionLoader
# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
import matplotlib
import matplotlib.animation
import matplotlib.pyplot as plt
import numpy as np
import torch
import mpl_toolkits.mplot3d # noqa: F401
from motion_loader import MotionLoader
class MotionViewer:
"""
Helper class to visualize motion data from NumPy-file format.
"""
def __init__(self, motion_file: str, device: torch.device | str = "cpu", render_scene: bool = False) -> None:
"""Load a motion file and initialize the internal variables.
Args:
motion_file: Motion file path to load.
device: The device to which to load the data.
render_scene: Whether the scene (space occupied by the skeleton during movement)
is rendered instead of a reduced view of the skeleton.
Raises:
AssertionError: If the specified motion file doesn't exist.
"""
self._figure = None
self._figure_axes = None
self._render_scene = render_scene
# load motions
self._motion_loader = MotionLoader(motion_file=motion_file, device=device)
self._num_frames = self._motion_loader.num_frames
self._current_frame = 0
self._body_positions = self._motion_loader.body_positions.cpu().numpy()
print("\nBody")
for i, name in enumerate(self._motion_loader.body_names):
minimum = np.min(self._body_positions[:, i], axis=0).round(decimals=2)
maximum = np.max(self._body_positions[:, i], axis=0).round(decimals=2)
print(f" |-- [{name}] minimum position: {minimum}, maximum position: {maximum}")
def _drawing_callback(self, frame: int) -> None:
"""Drawing callback called each frame"""
# get current motion frame
# get data
vertices = self._body_positions[self._current_frame]
# draw skeleton state
self._figure_axes.clear()
self._figure_axes.scatter(*vertices.T, color="black", depthshade=False)
# adjust exes according to motion view
# - scene
if self._render_scene:
# compute axes limits
minimum = np.min(self._body_positions.reshape(-1, 3), axis=0)
maximum = np.max(self._body_positions.reshape(-1, 3), axis=0)
center = 0.5 * (maximum + minimum)
diff = 0.75 * (maximum - minimum)
# - skeleton
else:
# compute axes limits
minimum = np.min(vertices, axis=0)
maximum = np.max(vertices, axis=0)
center = 0.5 * (maximum + minimum)
diff = np.array([0.75 * np.max(maximum - minimum).item()] * 3)
# scale view
self._figure_axes.set_xlim((center[0] - diff[0], center[0] + diff[0]))
self._figure_axes.set_ylim((center[1] - diff[1], center[1] + diff[1]))
self._figure_axes.set_zlim((center[2] - diff[2], center[2] + diff[2]))
self._figure_axes.set_box_aspect(aspect=diff / diff[0])
# plot ground plane
x, y = np.meshgrid([center[0] - diff[0], center[0] + diff[0]], [center[1] - diff[1], center[1] + diff[1]])
self._figure_axes.plot_surface(x, y, np.zeros_like(x), color="green", alpha=0.2)
# print metadata
self._figure_axes.set_xlabel("X")
self._figure_axes.set_ylabel("Y")
self._figure_axes.set_zlabel("Z")
self._figure_axes.set_title(f"frame: {self._current_frame}/{self._num_frames}")
# increase frame counter
self._current_frame += 1
if self._current_frame >= self._num_frames:
self._current_frame = 0
def show(self) -> None:
"""Show motion"""
# create a 3D figure
self._figure = plt.figure()
self._figure_axes = self._figure.add_subplot(projection="3d")
# matplotlib animation (the instance must live as long as the animation will run)
self._animation = matplotlib.animation.FuncAnimation(
fig=self._figure,
func=self._drawing_callback,
frames=self._num_frames,
interval=1000 * self._motion_loader.dt,
)
plt.show()
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--file", type=str, required=True, help="Motion file")
parser.add_argument(
"--render-scene",
action="store_true",
default=False,
help=(
"Whether the scene (space occupied by the skeleton during movement) is rendered instead of a reduced view"
" of the skeleton."
),
)
parser.add_argument("--matplotlib-backend", type=str, default="TkAgg", help="Matplotlib interactive backend")
args, _ = parser.parse_known_args()
# https://matplotlib.org/stable/users/explain/figure/backends.html#interactive-backends
matplotlib.use(args.matplotlib_backend)
viewer = MotionViewer(args.file, render_scene=args.render_scene)
viewer.show()
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment