Adds humanoid AMP tasks for direct workflow (#227)

# Description

This change adds 3 additional Humanoid AMP tasks:

- Isaac-Humanoid-AMP-Dance-Direct-v0
- Isaac-Humanoid-AMP-Run-Direct-v0
- Isaac-Humanoid-AMP-Walk-Direct-v0

In addition, SKRL dependency is updated from 1.3 to 1.4.

## Type of change

- New feature (non-breaking change which adds functionality)
- This change requires a documentation update

## Screenshots

Please attach before and after screenshots of the change if applicable.

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

---------

Signed-off-by: peterd-NV <peterd@nvidia.com>
Signed-off-by: Kelly Guo <kellyg@nvidia.com>
Signed-off-by: Kelly Guo <kellyguo123@hotmail.com>
Signed-off-by: Ashwin Varghese Kuruttukulam <123109010+ashwinvkNV@users.noreply.github.com>
Co-authored-by: peterd-NV <peterd@nvidia.com>
Co-authored-by: CY Chen <cyc@nvidia.com>
Co-authored-by: oahmednv <oahmed@Nvidia.com>
Co-authored-by: Kelly Guo <kellyg@nvidia.com>
Co-authored-by: Kelly Guo <kellyguo123@hotmail.com>
Co-authored-by: rwiltz <165190220+rwiltz@users.noreply.github.com>
Co-authored-by: nv-cupright <92540563+nv-cupright@users.noreply.github.com>
Co-authored-by: Alexander Poddubny <143108850+nv-apoddubny@users.noreply.github.com>
Co-authored-by: chengronglai <chengrongl@nvidia.com>
Co-authored-by: David Hoeller <dhoeller@nvidia.com>
Co-authored-by: matthewtrepte <mtrepte@nvidia.com>
Co-authored-by: Ashwin Varghese Kuruttukulam <123109010+ashwinvkNV@users.noreply.github.com>
Co-authored-by: Karsten Patzwaldt <kpatzwaldt@nvidia.com>

Commit 8ff0b78a authored Jan 17, 2025 by Toni-SM Committed by Kelly Guo Jan 30, 2025
25 changed files
--- a/docs/source/_static/tasks/others/humanoid_amp.jpg
+++ b/docs/source/_static/tasks/others/humanoid_amp.jpg
--- a/docs/source/overview/environments.rst
+++ b/docs/source/overview/environments.rst
@@ -306,16 +306,25 @@ Others
 .. table::
    :widths: 33 37 30

-    +----------------+---------------------+-----------------------------------------------------------------------------+
-    | World          | Environment ID      | Description                                                                 |
-    +================+=====================+=============================================================================+
-    | |quadcopter|   | |quadcopter-link|   | Fly and hover the Crazyflie copter at a goal point by applying thrust.      |
-    +----------------+---------------------+-----------------------------------------------------------------------------+
+    +----------------+---------------------------+-----------------------------------------------------------------------------+
+    | World          | Environment ID            | Description                                                                 |
+    +================+===========================+=============================================================================+
+    | |quadcopter|   | |quadcopter-link|         | Fly and hover the Crazyflie copter at a goal point by applying thrust.      |
+    +----------------+---------------------------+-----------------------------------------------------------------------------+
+    | |humanoid_amp| | |humanoid_amp_dance-link| | Move a humanoid robot by imitating different pre-recorded human animations  |
+    |                |                           | (Adversarial Motion Priors).                                                |
+    |                | |humanoid_amp_run-link|   |                                                                             |
+    |                |                           |                                                                             |
+    |                | |humanoid_amp_walk-link|  |                                                                             |
+    +----------------+---------------------------+-----------------------------------------------------------------------------+

 .. |quadcopter-link| replace:: `Isaac-Quadcopter-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/quadcopter/quadcopter_env.py>`__
-
+.. |humanoid_amp_dance-link| replace:: `Isaac-Humanoid-AMP-Dance-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env_cfg.py>`__
+.. |humanoid_amp_run-link| replace:: `Isaac-Humanoid-AMP-Run-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env_cfg.py>`__
+.. |humanoid_amp_walk-link| replace:: `Isaac-Humanoid-AMP-Walk-Direct-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env_cfg.py>`__

 .. |quadcopter| image:: ../_static/tasks/others/quadcopter.jpg
+.. |humanoid_amp| image:: ../_static/tasks/others/humanoid_amp.jpg


 Multi-agent
@@ -385,11 +394,15 @@ Comprehensive List of Environments
    * - Isaac-Cart-Double-Pendulum-Direct-v0
      -
      - Direct
-      - **rl_games** (PPO), **skrl** (IPPO, MAPPO, PPO)
+      - **rl_games** (PPO), **skrl** (IPPO, PPO, MAPPO)
    * - Isaac-Cartpole-Depth-Camera-Direct-v0
      -
      - Direct
      - **rl_games** (PPO), **skrl** (PPO)
+    * - Isaac-Cartpole-Depth-v0
+      -
+      - Manager Based
+      - **rl_games** (PPO)
    * - Isaac-Cartpole-Direct-v0
      -
      - Direct
@@ -398,6 +411,18 @@ Comprehensive List of Environments
      -
      - Direct
      - **rl_games** (PPO), **skrl** (PPO)
+    * - Isaac-Cartpole-RGB-ResNet18-v0
+      -
+      - Manager Based
+      - **rl_games** (PPO)
+    * - Isaac-Cartpole-RGB-TheiaTiny-v0
+      -
+      - Manager Based
+      - **rl_games** (PPO)
+    * - Isaac-Cartpole-RGB-v0
+      -
+      - Manager Based
+      - **rl_games** (PPO)
    * - Isaac-Cartpole-v0
      -
      - Manager Based
@@ -418,6 +443,18 @@ Comprehensive List of Environments
      -
      - Direct
      - **rl_games** (PPO), **rsl_rl** (PPO), **skrl** (PPO)
+    * - Isaac-Humanoid-AMP-Dance-Direct-v0
+      -
+      - Direct
+      - **skrl** (AMP)
+    * - Isaac-Humanoid-AMP-Run-Direct-v0
+      -
+      - Direct
+      - **skrl** (AMP)
+    * - Isaac-Humanoid-AMP-Walk-Direct-v0
+      -
+      - Direct
+      - **skrl** (AMP)
    * - Isaac-Humanoid-Direct-v0
      -
      - Direct
@@ -437,7 +474,11 @@ Comprehensive List of Environments
    * - Isaac-Lift-Cube-Franka-v0
      - Isaac-Lift-Cube-Franka-Play-v0
      - Manager Based
-      - **rsl_rl** (PPO), **skrl** (PPO), **rl_games** (PPO)
+      - **rsl_rl** (PPO), **skrl** (PPO), **rl_games** (PPO), **sb3** (PPO)
+    * - Isaac-Lift-Teddy-Bear-Franka-IK-Abs-v0
+      -
+      - Manager Based
+      -
    * - Isaac-Navigation-Flat-Anymal-C-v0
      - Isaac-Navigation-Flat-Anymal-C-Play-v0
      - Manager Based
@@ -509,7 +550,23 @@ Comprehensive List of Environments
    * - Isaac-Shadow-Hand-Over-Direct-v0
      -
      - Direct
-      - **rl_games** (PPO), **skrl** (IPPO, MAPPO, PPO)
+      - **rl_games** (PPO), **skrl** (IPPO, PPO, MAPPO)
+    * - Isaac-Stack-Cube-Franka-IK-Rel-v0
+      -
+      - Manager Based
+      -
+    * - Isaac-Stack-Cube-Franka-v0
+      -
+      - Manager Based
+      -
+    * - Isaac-Stack-Cube-Instance-Randomize-Franka-IK-Rel-v0
+      -
+      - Manager Based
+      -
+    * - Isaac-Stack-Cube-Instance-Randomize-Franka-v0
+      -
+      - Manager Based
+      -
    * - Isaac-Velocity-Flat-Anymal-B-v0
      - Isaac-Velocity-Flat-Anymal-B-Play-v0
      - Manager Based

--- a/scripts/reinforcement_learning/skrl/play.py
+++ b/scripts/reinforcement_learning/skrl/play.py
@@ -42,9 +42,10 @@ parser.add_argument(
    "--algorithm",
    type=str,
    default="PPO",
-    choices=["PPO", "IPPO", "MAPPO"],
+    choices=["AMP", "PPO", "IPPO", "MAPPO"],
    help="The RL algorithm used for training the skrl agent.",
 )
+parser.add_argument("--real-time", action="store_true", default=False, help="Run in real-time, if possible.")

 # append AppLauncher cli args
 AppLauncher.add_app_launcher_args(parser)
@@ -61,13 +62,14 @@ simulation_app = app_launcher.app

 import gymnasium as gym
 import os
+import time
 import torch

 import skrl
 from packaging import version

 # check for minimum supported skrl version
-SKRL_VERSION = "1.3.0"
+SKRL_VERSION = "1.4.0"
 if version.parse(skrl.__version__) < version.parse(SKRL_VERSION):
    skrl.logger.error(
        f"Unsupported skrl version: {skrl.__version__}. "
@@ -133,6 +135,12 @@ def main():
    if isinstance(env.unwrapped, DirectMARLEnv) and algorithm in ["ppo"]:
        env = multi_agent_to_single_agent(env)

+    # get environment (physics) dt for real-time evaluation
+    try:
+        dt = env.physics_dt
+    except AttributeError:
+        dt = env.unwrapped.physics_dt
+
    # wrap for video recording
    if args_cli.video:
        video_kwargs = {
@@ -165,18 +173,26 @@ def main():
    timestep = 0
    # simulate environment
    while simulation_app.is_running():
+        start_time = time.time()
+
        # run everything in inference mode
        with torch.inference_mode():
            # agent stepping
-            actions = runner.agent.act(obs, timestep=0, timesteps=0)[0]
+            outputs = runner.agent.act(obs, timestep=0, timesteps=0)
+            actions = outputs[-1].get("mean_actions", outputs[0])
            # env stepping
            obs, _, _, _, _ = env.step(actions)
        if args_cli.video:
            timestep += 1
-            # Exit the play loop after recording one video
+            # exit the play loop after recording one video
            if timestep == args_cli.video_length:
                break

+        # time delay for real-time evaluation
+        sleep_time = dt - (time.time() - start_time)
+        if args_cli.real_time and sleep_time > 0:
+            time.sleep(sleep_time)
+
    # close the simulator
    env.close()
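
The real-time delay added to the play loop above can be illustrated in isolation. The following is a minimal sketch, not the script itself: `run_paced` and `step_fn` are hypothetical names, and `step_fn` merely stands in for the agent and environment stepping. When a step takes longer than `dt`, no sleep happens and the loop simply falls behind, which matches the "if possible" wording of the `--real-time` help text. Which period to pass as `dt` depends on how much simulated time one `env.step` call advances; if one call covers several physics steps (the AMP config in this commit sets `decimation = 2`), the matching period would presumably be the physics dt times the decimation. That is a reading of the config, not something the diff states.

```python
import time


def run_paced(step_fn, num_steps, dt, real_time=True):
    """Call step_fn num_steps times; with real_time, each iteration lasts at least dt seconds."""
    start = time.perf_counter()
    for _ in range(num_steps):
        step_start = time.perf_counter()
        step_fn()  # hypothetical stand-in for agent.act(...) followed by env.step(...)
        # sleep only for the part of the period that the step did not already consume
        sleep_time = dt - (time.perf_counter() - step_start)
        if real_time and sleep_time > 0:
            time.sleep(sleep_time)
    return time.perf_counter() - start


# five no-op steps paced at 10 ms each take roughly 50 ms in total
elapsed = run_paced(lambda: None, num_steps=5, dt=0.01)
# without pacing the same loop finishes almost immediately
elapsed_unpaced = run_paced(lambda: None, num_steps=5, dt=0.01, real_time=False)
print(f"paced: {elapsed:.3f} s, unpaced: {elapsed_unpaced:.6f} s")
```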


--- a/scripts/reinforcement_learning/skrl/train.py
+++ b/scripts/reinforcement_learning/skrl/train.py
@@ -40,7 +40,7 @@ parser.add_argument(
    "--algorithm",
    type=str,
    default="PPO",
-    choices=["PPO", "IPPO", "MAPPO"],
+    choices=["AMP", "PPO", "IPPO", "MAPPO"],
    help="The RL algorithm used for training the skrl agent.",
 )

@@ -70,7 +70,7 @@ import skrl
 from packaging import version

 # check for minimum supported skrl version
-SKRL_VERSION = "1.3.0"
+SKRL_VERSION = "1.4.0"
 if version.parse(skrl.__version__) < version.parse(SKRL_VERSION):
    skrl.logger.error(
        f"Unsupported skrl version: {skrl.__version__}. "

--- a/source/isaaclab_assets/config/extension.toml
+++ b/source/isaaclab_assets/config/extension.toml
 [package]
 # Semantic Versioning is used: https://semver.org/
-version = "0.2.0"
+version = "0.2.1"

 # Description
 title =  "Isaac Lab Assets"

--- a/source/isaaclab_assets/docs/CHANGELOG.rst
+++ b/source/isaaclab_assets/docs/CHANGELOG.rst
 Changelog
 ---------

+0.2.1 (2025-01-14)
+~~~~~~~~~~~~~~~~~~
+
+Added
+^^^^^
+
+* Added configuration for the Humanoid-28 robot.
+
+
 0.2.0 (2024-12-27)
 ~~~~~~~~~~~~~~~~~~


--- a/source/isaaclab_assets/isaaclab_assets/robots/__init__.py
+++ b/source/isaaclab_assets/isaaclab_assets/robots/__init__.py
@@ -14,6 +14,7 @@ from .cart_double_pendulum import *
 from .cartpole import *
 from .franka import *
 from .humanoid import *
+from .humanoid_28 import *
 from .kinova import *
 from .quadcopter import *
 from .ridgeback_franka import *

--- a/source/isaaclab_assets/isaaclab_assets/robots/humanoid_28.py
+++ b/source/isaaclab_assets/isaaclab_assets/robots/humanoid_28.py
+# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""Configuration for the 28-DOFs Mujoco Humanoid robot."""
+
+from __future__ import annotations
+
+import isaaclab.sim as sim_utils
+from isaaclab.actuators import ImplicitActuatorCfg
+from isaaclab.assets import ArticulationCfg
+from isaaclab.utils.assets import ISAACLAB_NUCLEUS_DIR
+
+##
+# Configuration
+##
+
+HUMANOID_28_CFG = ArticulationCfg(
+    prim_path="{ENV_REGEX_NS}/Robot",
+    spawn=sim_utils.UsdFileCfg(
+        usd_path=f"{ISAACLAB_NUCLEUS_DIR}/Robots/Classic/Humanoid28/humanoid_28.usd",
+        rigid_props=sim_utils.RigidBodyPropertiesCfg(
+            disable_gravity=None,
+            max_depenetration_velocity=10.0,
+            enable_gyroscopic_forces=True,
+        ),
+        articulation_props=sim_utils.ArticulationRootPropertiesCfg(
+            enabled_self_collisions=True,
+            solver_position_iteration_count=4,
+            solver_velocity_iteration_count=0,
+            sleep_threshold=0.005,
+            stabilization_threshold=0.001,
+        ),
+        copy_from_source=False,
+    ),
+    init_state=ArticulationCfg.InitialStateCfg(
+        pos=(0.0, 0.0, 0.8),
+        joint_pos={".*": 0.0},
+    ),
+    actuators={
+        "body": ImplicitActuatorCfg(
+            joint_names_expr=[".*"],
+            stiffness=None,
+            damping=None,
+        ),
+    },
+)
+"""Configuration for the 28-DOFs Mujoco Humanoid robot."""
--- a/source/isaaclab_rl/setup.py
+++ b/source/isaaclab_rl/setup.py
@@ -42,7 +42,7 @@ PYTORCH_INDEX_URL = ["https://download.pytorch.org/whl/cu118"]
 # Extra dependencies for RL agents
 EXTRAS_REQUIRE = {
    "sb3": ["stable-baselines3>=2.1"],
-    "skrl": ["skrl>=1.3.0"],
+    "skrl": ["skrl>=1.4.0"],
    "rl-games": ["rl-games==1.6.1", "gym"],  # rl-games still needs gym :(
    "rsl-rl": ["rsl-rl@git+https://github.com/leggedrobotics/rsl_rl.git"],
 }

--- a/source/isaaclab_tasks/config/extension.toml
+++ b/source/isaaclab_tasks/config/extension.toml
 [package]

 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.10.21"
+version = "0.10.22"

 # Description
 title = "Isaac Lab Environments"

--- a/source/isaaclab_tasks/docs/CHANGELOG.rst
+++ b/source/isaaclab_tasks/docs/CHANGELOG.rst
 Changelog
 ---------

+0.10.22 (2025-01-14)
+~~~~~~~~~~~~~~~~~~~~
+
+Added
+^^^^^
+
+* Added the ``Isaac-Humanoid-AMP-Dance-Direct-v0``, ``Isaac-Humanoid-AMP-Run-Direct-v0`` and ``Isaac-Humanoid-AMP-Walk-Direct-v0``
+  environments, direct RL environments that implement the Humanoid AMP task.
+
+
 0.10.21 (2025-01-03)
 ~~~~~~~~~~~~~~~~~~~~


--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/__init__.py
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/__init__.py
+# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""
+AMP Humanoid locomotion environment.
+"""
+
+import gymnasium as gym
+
+from . import agents
+
+##
+# Register Gym environments.
+##
+
+gym.register(
+    id="Isaac-Humanoid-AMP-Dance-Direct-v0",
+    entry_point=f"{__name__}.humanoid_amp_env:HumanoidAmpEnv",
+    disable_env_checker=True,
+    kwargs={
+        "env_cfg_entry_point": f"{__name__}.humanoid_amp_env_cfg:HumanoidAmpDanceEnvCfg",
+        "skrl_amp_cfg_entry_point": f"{agents.__name__}:skrl_dance_amp_cfg.yaml",
+    },
+)
+
+gym.register(
+    id="Isaac-Humanoid-AMP-Run-Direct-v0",
+    entry_point=f"{__name__}.humanoid_amp_env:HumanoidAmpEnv",
+    disable_env_checker=True,
+    kwargs={
+        "env_cfg_entry_point": f"{__name__}.humanoid_amp_env_cfg:HumanoidAmpRunEnvCfg",
+        "skrl_amp_cfg_entry_point": f"{agents.__name__}:skrl_run_amp_cfg.yaml",
+    },
+)
+
+gym.register(
+    id="Isaac-Humanoid-AMP-Walk-Direct-v0",
+    entry_point=f"{__name__}.humanoid_amp_env:HumanoidAmpEnv",
+    disable_env_checker=True,
+    kwargs={
+        "env_cfg_entry_point": f"{__name__}.humanoid_amp_env_cfg:HumanoidAmpWalkEnvCfg",
+        "skrl_amp_cfg_entry_point": f"{agents.__name__}:skrl_walk_amp_cfg.yaml",
+    },
+)
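
The three registrations above differ only in the motion name. As a rough sketch of that pattern (not the code in this commit), the same IDs and entry points can be generated from one template. `amp_task_spec` and `PACKAGE` are hypothetical names; `PACKAGE` is assumed to equal the module's `__name__`, inferred from the file path in the diff header.

```python
# Assumed value of __name__ inside humanoid_amp/__init__.py (inferred from the file path).
PACKAGE = "isaaclab_tasks.direct.humanoid_amp"


def amp_task_spec(motion: str) -> dict:
    """Build the registration arguments for one motion name (hypothetical helper)."""
    name = motion.capitalize()
    return {
        "id": f"Isaac-Humanoid-AMP-{name}-Direct-v0",
        "entry_point": f"{PACKAGE}.humanoid_amp_env:HumanoidAmpEnv",
        "disable_env_checker": True,
        "kwargs": {
            "env_cfg_entry_point": f"{PACKAGE}.humanoid_amp_env_cfg:HumanoidAmp{name}EnvCfg",
            "skrl_amp_cfg_entry_point": f"{PACKAGE}.agents:skrl_{motion}_amp_cfg.yaml",
        },
    }


specs = [amp_task_spec(motion) for motion in ("dance", "run", "walk")]
for spec in specs:
    print(spec["id"], "->", spec["kwargs"]["env_cfg_entry_point"])
```

Each dictionary could then be passed on as `gym.register(**spec)`. The explicit form used in the commit is easier to search for, so the loop is a matter of taste rather than an improvement.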
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/__init__.py
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/__init__.py
+# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/skrl_dance_amp_cfg.yaml
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/skrl_dance_amp_cfg.yaml
+seed: 42
+
+
+# Models are instantiated using skrl's model instantiator utility
+# https://skrl.readthedocs.io/en/latest/api/utils/model_instantiators.html
+models:
+  separate: True
+  policy:  # see gaussian_model parameters
+    class: GaussianMixin
+    clip_actions: False
+    clip_log_std: True
+    min_log_std: -20.0
+    max_log_std: 2.0
+    initial_log_std: -2.9
+    fixed_log_std: True
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ACTIONS
+  value:  # see deterministic_model parameters
+    class: DeterministicMixin
+    clip_actions: False
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ONE
+  discriminator:  # see deterministic_model parameters
+    class: DeterministicMixin
+    clip_actions: False
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ONE
+
+
+# Rollout memory
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+memory:
+  class: RandomMemory
+  memory_size: -1  # automatically determined (same as agent:rollouts)
+
+# AMP memory (reference motion dataset)
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+motion_dataset:
+  class: RandomMemory
+  memory_size: 200000
+
+# AMP memory (preventing discriminator overfitting)
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+reply_buffer:
+  class: RandomMemory
+  memory_size: 1000000
+
+
+# AMP agent configuration (field names are from AMP_DEFAULT_CONFIG)
+# https://skrl.readthedocs.io/en/latest/api/agents/amp.html
+agent:
+  class: AMP
+  rollouts: 16
+  learning_epochs: 6
+  mini_batches: 2
+  discount_factor: 0.99
+  lambda: 0.95
+  learning_rate: 5.0e-05
+  learning_rate_scheduler: null
+  learning_rate_scheduler_kwargs: null
+  state_preprocessor: RunningStandardScaler
+  state_preprocessor_kwargs: null
+  value_preprocessor: RunningStandardScaler
+  value_preprocessor_kwargs: null
+  amp_state_preprocessor: RunningStandardScaler
+  amp_state_preprocessor_kwargs: null
+  random_timesteps: 0
+  learning_starts: 0
+  grad_norm_clip: 0.0
+  ratio_clip: 0.2
+  value_clip: 0.2
+  clip_predicted_values: True
+  entropy_loss_scale: 0.0
+  value_loss_scale: 2.5
+  discriminator_loss_scale: 5.0
+  amp_batch_size: 512
+  task_reward_weight: 0.0
+  style_reward_weight: 1.0
+  discriminator_batch_size: 4096
+  discriminator_reward_scale: 2.0
+  discriminator_logit_regularization_scale: 0.05
+  discriminator_gradient_penalty_scale: 5.0
+  discriminator_weight_decay_scale: 1.0e-04
+  # rewards_shaper_scale: 1.0
+  time_limit_bootstrap: False
+  # logging and checkpoint
+  experiment:
+    directory: "humanoid_amp_dance"
+    experiment_name: ""
+    write_interval: auto
+    checkpoint_interval: auto
+
+
+# Sequential trainer
+# https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html
+trainer:
+  class: SequentialTrainer
+  timesteps: 80000
+  environment_info: log
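
To put the numbers in this config into perspective, here is a back-of-the-envelope calculation. It rests on assumptions that the YAML does not spell out: every trainer timestep steps all parallel environments once, the agent updates once every `rollouts` timesteps, and the rollout memory is split evenly into `mini_batches` parts. `num_envs` is the scene default from `HumanoidAmpEnvCfg` in this commit and can be overridden at launch.

```python
# Values copied from the YAML above; num_envs is the scene default in HumanoidAmpEnvCfg.
rollouts = 16
mini_batches = 2
learning_epochs = 6
trainer_timesteps = 80_000
num_envs = 4096

# transitions collected between two consecutive updates (assumes one sample per env per timestep)
samples_per_update = rollouts * num_envs
# size of each mini-batch if the rollout memory is split evenly
mini_batch_size = samples_per_update // mini_batches
# number of update phases over the whole run
policy_updates = trainer_timesteps // rollouts
# gradient steps taken in each update phase
gradient_steps_per_update = learning_epochs * mini_batches

print(samples_per_update, mini_batch_size, policy_updates, gradient_steps_per_update)
```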
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/skrl_run_amp_cfg.yaml
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/skrl_run_amp_cfg.yaml
+seed: 42
+
+
+# Models are instantiated using skrl's model instantiator utility
+# https://skrl.readthedocs.io/en/latest/api/utils/model_instantiators.html
+models:
+  separate: True
+  policy:  # see gaussian_model parameters
+    class: GaussianMixin
+    clip_actions: False
+    clip_log_std: True
+    min_log_std: -20.0
+    max_log_std: 2.0
+    initial_log_std: -2.9
+    fixed_log_std: True
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ACTIONS
+  value:  # see deterministic_model parameters
+    class: DeterministicMixin
+    clip_actions: False
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ONE
+  discriminator:  # see deterministic_model parameters
+    class: DeterministicMixin
+    clip_actions: False
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ONE
+
+
+# Rollout memory
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+memory:
+  class: RandomMemory
+  memory_size: -1  # automatically determined (same as agent:rollouts)
+
+# AMP memory (reference motion dataset)
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+motion_dataset:
+  class: RandomMemory
+  memory_size: 200000
+
+# AMP memory (preventing discriminator overfitting)
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+reply_buffer:
+  class: RandomMemory
+  memory_size: 1000000
+
+
+# AMP agent configuration (field names are from AMP_DEFAULT_CONFIG)
+# https://skrl.readthedocs.io/en/latest/api/agents/amp.html
+agent:
+  class: AMP
+  rollouts: 16
+  learning_epochs: 6
+  mini_batches: 2
+  discount_factor: 0.99
+  lambda: 0.95
+  learning_rate: 5.0e-05
+  learning_rate_scheduler: null
+  learning_rate_scheduler_kwargs: null
+  state_preprocessor: RunningStandardScaler
+  state_preprocessor_kwargs: null
+  value_preprocessor: RunningStandardScaler
+  value_preprocessor_kwargs: null
+  amp_state_preprocessor: RunningStandardScaler
+  amp_state_preprocessor_kwargs: null
+  random_timesteps: 0
+  learning_starts: 0
+  grad_norm_clip: 0.0
+  ratio_clip: 0.2
+  value_clip: 0.2
+  clip_predicted_values: True
+  entropy_loss_scale: 0.0
+  value_loss_scale: 2.5
+  discriminator_loss_scale: 5.0
+  amp_batch_size: 512
+  task_reward_weight: 0.0
+  style_reward_weight: 1.0
+  discriminator_batch_size: 4096
+  discriminator_reward_scale: 2.0
+  discriminator_logit_regularization_scale: 0.05
+  discriminator_gradient_penalty_scale: 5.0
+  discriminator_weight_decay_scale: 1.0e-04
+  # rewards_shaper_scale: 1.0
+  time_limit_bootstrap: False
+  # logging and checkpoint
+  experiment:
+    directory: "humanoid_amp_run"
+    experiment_name: ""
+    write_interval: auto
+    checkpoint_interval: auto
+
+
+# Sequential trainer
+# https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html
+trainer:
+  class: SequentialTrainer
+  timesteps: 80000
+  environment_info: log
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/skrl_walk_amp_cfg.yaml
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/skrl_walk_amp_cfg.yaml
+seed: 42
+
+
+# Models are instantiated using skrl's model instantiator utility
+# https://skrl.readthedocs.io/en/latest/api/utils/model_instantiators.html
+models:
+  separate: True
+  policy:  # see gaussian_model parameters
+    class: GaussianMixin
+    clip_actions: False
+    clip_log_std: True
+    min_log_std: -20.0
+    max_log_std: 2.0
+    initial_log_std: -2.9
+    fixed_log_std: True
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ACTIONS
+  value:  # see deterministic_model parameters
+    class: DeterministicMixin
+    clip_actions: False
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ONE
+  discriminator:  # see deterministic_model parameters
+    class: DeterministicMixin
+    clip_actions: False
+    network:
+      - name: net
+        input: STATES
+        layers: [1024, 512]
+        activations: relu
+    output: ONE
+
+
+# Rollout memory
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+memory:
+  class: RandomMemory
+  memory_size: -1  # automatically determined (same as agent:rollouts)
+
+# AMP memory (reference motion dataset)
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+motion_dataset:
+  class: RandomMemory
+  memory_size: 200000
+
+# AMP memory (preventing discriminator overfitting)
+# https://skrl.readthedocs.io/en/latest/api/memories/random.html
+reply_buffer:
+  class: RandomMemory
+  memory_size: 1000000
+
+
+# AMP agent configuration (field names are from AMP_DEFAULT_CONFIG)
+# https://skrl.readthedocs.io/en/latest/api/agents/amp.html
+agent:
+  class: AMP
+  rollouts: 16
+  learning_epochs: 6
+  mini_batches: 2
+  discount_factor: 0.99
+  lambda: 0.95
+  learning_rate: 5.0e-05
+  learning_rate_scheduler: null
+  learning_rate_scheduler_kwargs: null
+  state_preprocessor: RunningStandardScaler
+  state_preprocessor_kwargs: null
+  value_preprocessor: RunningStandardScaler
+  value_preprocessor_kwargs: null
+  amp_state_preprocessor: RunningStandardScaler
+  amp_state_preprocessor_kwargs: null
+  random_timesteps: 0
+  learning_starts: 0
+  grad_norm_clip: 0.0
+  ratio_clip: 0.2
+  value_clip: 0.2
+  clip_predicted_values: True
+  entropy_loss_scale: 0.0
+  value_loss_scale: 2.5
+  discriminator_loss_scale: 5.0
+  amp_batch_size: 512
+  task_reward_weight: 0.0
+  style_reward_weight: 1.0
+  discriminator_batch_size: 4096
+  discriminator_reward_scale: 2.0
+  discriminator_logit_regularization_scale: 0.05
+  discriminator_gradient_penalty_scale: 5.0
+  discriminator_weight_decay_scale: 1.0e-04
+  # rewards_shaper_scale: 1.0
+  time_limit_bootstrap: False
+  # logging and checkpoint
+  experiment:
+    directory: "humanoid_amp_walk"
+    experiment_name: ""
+    write_interval: auto
+    checkpoint_interval: auto
+
+
+# Sequential trainer
+# https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html
+trainer:
+  class: SequentialTrainer
+  timesteps: 80000
+  environment_info: log
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env.py
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env.py
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env_cfg.py
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/humanoid_amp_env_cfg.py
+# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+from __future__ import annotations
+
+import os
+from dataclasses import MISSING
+
+from isaaclab_assets import HUMANOID_28_CFG
+
+from isaaclab.actuators import ImplicitActuatorCfg
+from isaaclab.assets import ArticulationCfg
+from isaaclab.envs import DirectRLEnvCfg
+from isaaclab.scene import InteractiveSceneCfg
+from isaaclab.sim import PhysxCfg, SimulationCfg
+from isaaclab.utils import configclass
+
+MOTIONS_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "motions")
+
+
+@configclass
+class HumanoidAmpEnvCfg(DirectRLEnvCfg):
+    """Humanoid AMP environment config (base class)."""
+
+    # env
+    episode_length_s = 10.0
+    decimation = 2
+
+    # spaces
+    observation_space = 81
+    action_space = 28
+    state_space = 0
+    num_amp_observations = 2
+    amp_observation_space = 81
+
+    early_termination = True
+    termination_height = 0.5
+
+    motion_file: str = MISSING
+    reference_body = "torso"
+    reset_strategy = "random"  # default, random, random-start
+    """Strategy to be followed when resetting each environment (humanoid's pose and joint states).
+
+    * default: pose and joint states are set to the initial state of the asset.
+    * random: pose and joint states are set by sampling motions at random, uniform times.
+    * random-start: pose and joint states are set by sampling motion at the start (time zero).
+    """
+
+    # simulation
+    sim: SimulationCfg = SimulationCfg(
+        dt=1 / 60,
+        render_interval=decimation,
+        physx=PhysxCfg(
+            gpu_found_lost_pairs_capacity=2**23,
+            gpu_total_aggregate_pairs_capacity=2**23,
+        ),
+    )
+
+    # scene
+    scene: InteractiveSceneCfg = InteractiveSceneCfg(num_envs=4096, env_spacing=10.0, replicate_physics=True)
+
+    # robot
+    robot: ArticulationCfg = HUMANOID_28_CFG.replace(prim_path="/World/envs/env_.*/Robot").replace(
+        actuators={
+            "body": ImplicitActuatorCfg(
+                joint_names_expr=[".*"],
+                velocity_limit=100.0,
+                stiffness=None,
+                damping=None,
+            ),
+        },
+    )
+
+
+@configclass
+class HumanoidAmpDanceEnvCfg(HumanoidAmpEnvCfg):
+    motion_file = os.path.join(MOTIONS_DIR, "humanoid_dance.npz")
+
+
+@configclass
+class HumanoidAmpRunEnvCfg(HumanoidAmpEnvCfg):
+    motion_file = os.path.join(MOTIONS_DIR, "humanoid_run.npz")
+
+
+@configclass
+class HumanoidAmpWalkEnvCfg(HumanoidAmpEnvCfg):
+    motion_file = os.path.join(MOTIONS_DIR, "humanoid_walk.npz")
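
The timing values in `HumanoidAmpEnvCfg` combine as follows. This is a small worked calculation, assuming that one environment step spans `decimation` physics steps and that the episode length in steps is the episode duration divided by the step period, rounded up. The stacked AMP observation size on the last lines is likewise an assumption, since `humanoid_amp_env.py` is not shown in this diff. Exact fractions avoid floating-point rounding at the ceiling.

```python
import math
from fractions import Fraction

# Values copied from HumanoidAmpEnvCfg above.
sim_dt = Fraction(1, 60)   # physics step in seconds
decimation = 2             # physics steps per environment step
episode_length_s = 10      # episode duration in seconds

step_dt = sim_dt * decimation                              # period of one environment step
control_rate_hz = 1 / step_dt                              # environment steps per second
max_episode_steps = math.ceil(episode_length_s / step_dt)  # steps until time-out

# Assumption: the AMP observation stacks num_amp_observations frames of amp_observation_space values.
num_amp_observations = 2
amp_observation_space = 81
amp_observation_size = num_amp_observations * amp_observation_space

print(step_dt, control_rate_hz, max_episode_steps, amp_observation_size)
```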
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/README.md
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/README.md
+# Motion files
+
+The motion files are stored in NumPy's `.npz` format and contain data for the skeleton DOFs and bodies that perform the motion.
+
+The data (accessed by key) is described in the following table, where:
+
+* `N` is the number of motion frames recorded
+* `D` is the number of skeleton DOFs
+* `B` is the number of skeleton bodies
+
+| Key | Dtype | Shape | Description |
+| --- | ---- | ----- | ----------- |
+| `fps` | int64 | () | FPS at which motion was sampled |
+| `dof_names` | unicode string | (D,) | Skeleton DOF names |
+| `body_names` | unicode string | (B,) | Skeleton body names |
+| `dof_positions` | float32 | (N, D) | Skeleton DOF positions |
+| `dof_velocities` | float32 | (N, D) | Skeleton DOF velocities |
+| `body_positions` | float32 | (N, B, 3) | Skeleton body positions |
+| `body_rotations` | float32 | (N, B, 4) | Skeleton body rotations (as `wxyz` quaternion) |
+| `body_linear_velocities` | float32 | (N, B, 3) | Skeleton body linear velocities |
+| `body_angular_velocities` | float32 | (N, B, 3) | Skeleton body angular velocities |
+
+## Motion visualization
+
+The `motion_viewer.py` script visualizes the skeleton motion recorded in a motion file.
+
+Open a terminal in the `motions` folder and run the following command:
+
+```bash
+python motion_viewer.py --file MOTION_FILE_NAME.npz
+```
+
+See `python motion_viewer.py --help` for available arguments.
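The shape column of the README table above can be turned into a quick consistency check. The sketch below works on a plain dictionary of shapes so that it runs without a motion file; the function names `expected_shapes` and `find_shape_mismatches` and the frame, DOF and body counts are made up for illustration.

```python
def expected_shapes(n: int, d: int, b: int) -> dict[str, tuple[int, ...]]:
    """Expected array shape per key for N frames, D DOFs and B bodies (from the README table)."""
    return {
        "fps": (),
        "dof_names": (d,),
        "body_names": (b,),
        "dof_positions": (n, d),
        "dof_velocities": (n, d),
        "body_positions": (n, b, 3),
        "body_rotations": (n, b, 4),  # wxyz quaternion
        "body_linear_velocities": (n, b, 3),
        "body_angular_velocities": (n, b, 3),
    }


def find_shape_mismatches(actual: dict[str, tuple[int, ...]]) -> list[str]:
    """Return the keys whose shape is missing or inconsistent with the table."""
    # infer N, D and B from two reference arrays
    n, d = actual["dof_positions"]
    b = actual["body_positions"][1]
    expected = expected_shapes(n, d, b)
    return [key for key, shape in expected.items() if actual.get(key) != shape]


# synthetic shapes for a 120-frame motion with 28 DOFs and 15 bodies (illustrative numbers)
shapes = expected_shapes(n=120, d=28, b=15)
shapes["body_rotations"] = (120, 15, 3)  # deliberately wrong: quaternions need 4 components
mismatches = find_shape_mismatches(shapes)
```

For a real file, the shapes can be collected with NumPy via `data = np.load(path)` followed by `{key: data[key].shape for key in data.files}`.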
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/__init__.py
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/__init__.py
+# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+"""
+AMP Motion Loader and motion files.
+"""
+
+from .motion_loader import MotionLoader
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/humanoid_dance.npz
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/humanoid_dance.npz
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/humanoid_run.npz
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/humanoid_run.npz
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/humanoid_walk.npz
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/humanoid_walk.npz
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/motion_loader.py
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/motion_loader.py
--- a/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/motion_viewer.py
+++ b/source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/motions/motion_viewer.py
+# Copyright (c) 2022-2025, The Isaac Lab Project Developers.
+# All rights reserved.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+
+import matplotlib
+import matplotlib.animation
+import matplotlib.pyplot as plt
+import numpy as np
+import torch
+
+import mpl_toolkits.mplot3d  # noqa: F401
+from motion_loader import MotionLoader
+
+
+class MotionViewer:
+    """
+    Helper class to visualize motion data from NumPy-file format.
+    """
+
+    def __init__(self, motion_file: str, device: torch.device | str = "cpu", render_scene: bool = False) -> None:
+        """Load a motion file and initialize the internal variables.
+
+        Args:
+            motion_file: Motion file path to load.
+            device: The device to which to load the data.
+            render_scene: Whether the scene (space occupied by the skeleton during movement)
+                is rendered instead of a reduced view of the skeleton.
+
+        Raises:
+            AssertionError: If the specified motion file doesn't exist.
+        """
+        self._figure = None
+        self._figure_axes = None
+        self._render_scene = render_scene
+
+        # load motions
+        self._motion_loader = MotionLoader(motion_file=motion_file, device=device)
+
+        self._num_frames = self._motion_loader.num_frames
+        self._current_frame = 0
+        self._body_positions = self._motion_loader.body_positions.cpu().numpy()
+
+        print("\nBody")
+        for i, name in enumerate(self._motion_loader.body_names):
+            minimum = np.min(self._body_positions[:, i], axis=0).round(decimals=2)
+            maximum = np.max(self._body_positions[:, i], axis=0).round(decimals=2)
+            print(f"  |-- [{name}] minimum position: {minimum}, maximum position: {maximum}")
+
+    def _drawing_callback(self, frame: int) -> None:
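+        """Draw the skeleton state for the current frame and advance the frame counter.
+
+        The ``frame`` argument supplied by matplotlib is unused; the internal frame counter is used instead.
+        """
+        # get body positions for the current motion frame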
+        vertices = self._body_positions[self._current_frame]
+        # draw skeleton state
+        self._figure_axes.clear()
+        self._figure_axes.scatter(*vertices.T, color="black", depthshade=False)
+        # adjust axes according to motion view
+        # - scene
+        if self._render_scene:
+            # compute axes limits
+            minimum = np.min(self._body_positions.reshape(-1, 3), axis=0)
+            maximum = np.max(self._body_positions.reshape(-1, 3), axis=0)
+            center = 0.5 * (maximum + minimum)
+            diff = 0.75 * (maximum - minimum)
+        # - skeleton
+        else:
+            # compute axes limits
+            minimum = np.min(vertices, axis=0)
+            maximum = np.max(vertices, axis=0)
+            center = 0.5 * (maximum + minimum)
+            diff = np.array([0.75 * np.max(maximum - minimum).item()] * 3)
+        # scale view
+        self._figure_axes.set_xlim((center[0] - diff[0], center[0] + diff[0]))
+        self._figure_axes.set_ylim((center[1] - diff[1], center[1] + diff[1]))
+        self._figure_axes.set_zlim((center[2] - diff[2], center[2] + diff[2]))
+        self._figure_axes.set_box_aspect(aspect=diff / diff[0])
+        # plot ground plane
+        x, y = np.meshgrid([center[0] - diff[0], center[0] + diff[0]], [center[1] - diff[1], center[1] + diff[1]])
+        self._figure_axes.plot_surface(x, y, np.zeros_like(x), color="green", alpha=0.2)
+        # print metadata
+        self._figure_axes.set_xlabel("X")
+        self._figure_axes.set_ylabel("Y")
+        self._figure_axes.set_zlabel("Z")
+        self._figure_axes.set_title(f"frame: {self._current_frame}/{self._num_frames}")
+        # increase frame counter
+        self._current_frame += 1
+        if self._current_frame >= self._num_frames:
+            self._current_frame = 0
+
+    def show(self) -> None:
+        """Show motion"""
+        # create a 3D figure
+        self._figure = plt.figure()
+        self._figure_axes = self._figure.add_subplot(projection="3d")
+        # matplotlib animation (the instance must live as long as the animation will run)
+        self._animation = matplotlib.animation.FuncAnimation(
+            fig=self._figure,
+            func=self._drawing_callback,
+            frames=self._num_frames,
+            interval=1000 * self._motion_loader.dt,
+        )
+        plt.show()
+
+
+if __name__ == "__main__":
+    import argparse
+
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--file", type=str, required=True, help="Motion file")
+    parser.add_argument(
+        "--render-scene",
+        action="store_true",
+        default=False,
+        help=(
+            "Whether the scene (space occupied by the skeleton during movement) is rendered instead of a reduced view"
+            " of the skeleton."
+        ),
+    )
+    parser.add_argument("--matplotlib-backend", type=str, default="TkAgg", help="Matplotlib interactive backend")
+    args, _ = parser.parse_known_args()
+
+    # https://matplotlib.org/stable/users/explain/figure/backends.html#interactive-backends
+    matplotlib.use(args.matplotlib_backend)
+
+    viewer = MotionViewer(args.file, render_scene=args.render_scene)
+    viewer.show()
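The two view modes in `_drawing_callback` differ in how the axis limits are derived. The scene view keeps a per-axis half-extent computed over all frames, while the skeleton view uses one cubic half-extent taken from the largest axis of the current frame. The sketch below reproduces that arithmetic in plain Python; the helper name `view_limits` and the sample points are assumptions for illustration.

```python
Point = tuple[float, float, float]


def view_limits(points: list[Point], cubic: bool) -> tuple[list[float], list[float]]:
    """Return the view center and the half-extent per axis for a set of 3D points."""
    minimum = [min(p[i] for p in points) for i in range(3)]
    maximum = [max(p[i] for p in points) for i in range(3)]
    center = [0.5 * (hi + lo) for lo, hi in zip(minimum, maximum)]
    extent = [hi - lo for lo, hi in zip(minimum, maximum)]
    if cubic:
        # skeleton view: same half-extent on every axis, taken from the largest one
        half = [0.75 * max(extent)] * 3
    else:
        # scene view: keep the proportions of the occupied space
        half = [0.75 * e for e in extent]
    return center, half


points = [(0.0, 0.0, 0.0), (2.0, 1.0, 4.0)]
center, cubic_half = view_limits(points, cubic=True)
_, scene_half = view_limits(points, cubic=False)
# the axis limits are then (center[i] - half[i], center[i] + half[i])
```

The factor 0.75 leaves a margin of half the extent around the points, since the plotted range on each axis is 1.5 times the extent.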