Unverified Commit c7dde1b7 authored by ooctipus's avatar ooctipus Committed by GitHub

Adds dexterous lift and reorientation manipulation environments (#3378)

# Description

This PR provides remake and extension to orginal environment
kuka-allegro-reorientation implemented in paper:
DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with
Population Based Training
(https://arxiv.org/abs/2305.12127)
[Aleksei
Petrenko](https://arxiv.org/search/cs?searchtype=author&query=Petrenko,+A),
[Arthur
Allshire](https://arxiv.org/search/cs?searchtype=author&query=Allshire,+A),
[Gavriel
State](https://arxiv.org/search/cs?searchtype=author&query=State,+G),
[Ankur
Handa](https://arxiv.org/search/cs?searchtype=author&query=Handa,+A),
[Viktor
Makoviychuk](https://arxiv.org/search/cs?searchtype=author&query=Makoviychuk,+V)

and another environment kuka-allegro-lift implemented in paper:
Visuomotor Policies to Grasp Anything with Dexterous Hands
(https://dextrah-rgb.github.io/)
[Ritvik Singh](https://www.ritvik-singh.com/), [Arthur
Allshire](https://allshire.org/), [Ankur
Handa](https://ankurhanda.github.io/), [Nathan
Ratliff](https://www.nathanratliff.com/), [Karl Van
Wyk](https://scholar.google.com/citations?user=TCYAoF8AAAAJ&hl=en)


Though this is a remake, this remake ends up differs quite a lot in
environment details for reasons like:
1. Simplify reward structure,
2. Unify environment implemtation,
3. Standarize mdp,
4. Utilizes manager-based API

That in my opinion, makes environment study and extension more
accessible, and analyzable. For example you can train lift policy first
then continuing the checkpoint in reorientation environment, since they
share the observation space. : ))

It is a best to consider this a very careful re-interpretation rather
than exact execution to migrate them to IsaacLab

Here is the training curve if you just train with
`./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task
Isaac-Dexsuite-Kuka-Allegro-Lift-v0 --num_envs 8192 --headless`

`./isaaclab.sh -p -m torch.distributed.run --nnodes=1 --nproc_per_node=4
scripts/reinforcement_learning/rsl_rl/train.py --task
Isaac-Dexsuite-Kuka-Allegro-Reorient-v0 --num_envs 40960 --headless
--distributed`

lift training ~ 4 hours
reorientation training ~ 2 days

Note that it requires a order of magnitude more data and time for
reorientation to converge compare to lift under almost identical setup

training curve(screen captured from Wandb) - reward, 
Cyan: reorient, Purple: Lift
<img width="1487" height="780" alt="Screenshot from 2025-09-07 22-58-13"
src="https://github.com/user-attachments/assets/bfa911de-4fee-4c0d-b39c-e9c33fae28f4"
/>


video results 
lift

![cone_lift](https://github.com/user-attachments/assets/e626eadb-b281-4ec9-af16-57f626fcc6aa)

![fat_capsule_lift](https://github.com/user-attachments/assets/cde57d4c-ceb2-40ab-88dd-44320da689c5)

reorient

![cube_reorient](https://github.com/user-attachments/assets/752809cb-ea19-4701-b124-20c1909e4566)

![rod_reorient](https://github.com/user-attachments/assets/f009605a-d93c-491c-b124-ff08606c63ec)


Memo:
I really enjoy working on this remake, and hopefully for whoever plan to
play and extend on this remake find it helpful and similarily joyful as
I did. I will be very excited to see what you got : ))

Octi


CAUTION:
Do Not Merge until the asset is uploaded to S3 bucket!

Fixes # (issue)
<!-- As you go through the list, delete the ones that are not
applicable. -->

- New feature (non-breaking change which adds functionality)

## Screenshots

Please attach before and after screenshots of the change if applicable.

<!--
Example:

| Before | After |
| ------ | ----- |
| _gif/png before_ | _gif/png after_ |

To upload images to a PR -- simply drag and drop an image while in edit
mode and it should upload the image directly. You can then paste that
source into the above before/after sections.
-->

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

<!--
As you go through the checklist above, you can mark something as done by
putting an x character in it

For example,
- [x] I have done this task
- [ ] I have not done this task
-->
parent de9e8ce0
This diff is collapsed.
......@@ -18,6 +18,7 @@ from .galbot import *
from .humanoid import *
from .humanoid_28 import *
from .kinova import *
from .kuka_allegro import *
from .pick_and_place import *
from .quadcopter import *
from .ridgeback_franka import *
......
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""Configuration for the Kuka-lbr-iiwa arm robots and Allegro Hand.
The following configurations are available:
* :obj:`KUKA_ALLEGRO_CFG`: Kuka Allegro with implicit actuator model.
Reference:
* https://www.kuka.com/en-us/products/robotics-systems/industrial-robots/lbr-iiwa
* https://www.wonikrobotics.com/robot-hand
"""
import isaaclab.sim as sim_utils
from isaaclab.actuators.actuator_cfg import ImplicitActuatorCfg
from isaaclab.assets.articulation import ArticulationCfg
from isaaclab.utils.assets import ISAACLAB_NUCLEUS_DIR
##
# Configuration
##
KUKA_ALLEGRO_CFG = ArticulationCfg(
spawn=sim_utils.UsdFileCfg(
usd_path=f"{ISAACLAB_NUCLEUS_DIR}/Robots/KukaAllegro/kuka.usd",
activate_contact_sensors=True,
rigid_props=sim_utils.RigidBodyPropertiesCfg(
disable_gravity=True,
retain_accelerations=True,
linear_damping=0.0,
angular_damping=0.0,
max_linear_velocity=1000.0,
max_angular_velocity=1000.0,
max_depenetration_velocity=1000.0,
),
articulation_props=sim_utils.ArticulationRootPropertiesCfg(
enabled_self_collisions=True,
solver_position_iteration_count=32,
solver_velocity_iteration_count=1,
sleep_threshold=0.005,
stabilization_threshold=0.0005,
),
joint_drive_props=sim_utils.JointDrivePropertiesCfg(drive_type="force"),
),
init_state=ArticulationCfg.InitialStateCfg(
pos=(0.0, 0.0, 0.0),
rot=(1.0, 0.0, 0.0, 0.0),
joint_pos={
"iiwa7_joint_(1|2|7)": 0.0,
"iiwa7_joint_3": 0.7854,
"iiwa7_joint_4": 1.5708,
"iiwa7_joint_(5|6)": -1.5708,
"(index|middle|ring)_joint_0": 0.0,
"(index|middle|ring)_joint_1": 0.3,
"(index|middle|ring)_joint_2": 0.3,
"(index|middle|ring)_joint_3": 0.3,
"thumb_joint_0": 1.5,
"thumb_joint_1": 0.60147215,
"thumb_joint_2": 0.33795027,
"thumb_joint_3": 0.60845138,
},
),
actuators={
"kuka_allegro_actuators": ImplicitActuatorCfg(
joint_names_expr=[
"iiwa7_joint_(1|2|3|4|5|6|7)",
"index_joint_(0|1|2|3)",
"middle_joint_(0|1|2|3)",
"ring_joint_(0|1|2|3)",
"thumb_joint_(0|1|2|3)",
],
effort_limit_sim={
"iiwa7_joint_(1|2|3|4|5|6|7)": 300.0,
"index_joint_(0|1|2|3)": 0.5,
"middle_joint_(0|1|2|3)": 0.5,
"ring_joint_(0|1|2|3)": 0.5,
"thumb_joint_(0|1|2|3)": 0.5,
},
stiffness={
"iiwa7_joint_(1|2|3|4)": 300.0,
"iiwa7_joint_5": 100.0,
"iiwa7_joint_6": 50.0,
"iiwa7_joint_7": 25.0,
"index_joint_(0|1|2|3)": 3.0,
"middle_joint_(0|1|2|3)": 3.0,
"ring_joint_(0|1|2|3)": 3.0,
"thumb_joint_(0|1|2|3)": 3.0,
},
damping={
"iiwa7_joint_(1|2|3|4)": 45.0,
"iiwa7_joint_5": 20.0,
"iiwa7_joint_6": 15.0,
"iiwa7_joint_7": 15.0,
"index_joint_(0|1|2|3)": 0.1,
"middle_joint_(0|1|2|3)": 0.1,
"ring_joint_(0|1|2|3)": 0.1,
"thumb_joint_(0|1|2|3)": 0.1,
},
friction={
"iiwa7_joint_(1|2|3|4|5|6|7)": 1.0,
"index_joint_(0|1|2|3)": 0.01,
"middle_joint_(0|1|2|3)": 0.01,
"ring_joint_(0|1|2|3)": 0.01,
"thumb_joint_(0|1|2|3)": 0.01,
},
),
},
soft_joint_pos_limit_factor=1.0,
)
[package]
# Note: Semantic Versioning is used: https://semver.org/
version = "0.10.51"
version = "0.11.0"
# Description
title = "Isaac Lab Environments"
......
Changelog
---------
0.11.0 (2025-09-07)
~~~~~~~~~~~~~~~~~~~~
Added
^^^^^
* Added dextrous lifting and dextrous reorientation manipulation rl environments.
0.10.51 (2025-09-08)
~~~~~~~~~~~~~~~~~~~~
......
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""Dexsuite environments.
Implementation Reference:
Reorient:
@article{petrenko2023dexpbt,
title={Dexpbt: Scaling up dexterous manipulation for hand-arm systems with population based training},
author={Petrenko, Aleksei and Allshire, Arthur and State, Gavriel and Handa, Ankur and Makoviychuk, Viktor},
journal={arXiv preprint arXiv:2305.12127},
year={2023}
}
Lift:
@article{singh2024dextrah,
title={Dextrah-rgb: Visuomotor policies to grasp anything with dexterous hands},
author={Singh, Ritvik and Allshire, Arthur and Handa, Ankur and Ratliff, Nathan and Van Wyk, Karl},
journal={arXiv preprint arXiv:2412.01791},
year={2024}
}
"""
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab.managers import CurriculumTermCfg as CurrTerm
from isaaclab.utils import configclass
from . import mdp
@configclass
class CurriculumCfg:
"""Curriculum terms for the MDP."""
# adr stands for automatic/adaptive domain randomization
adr = CurrTerm(
func=mdp.DifficultyScheduler, params={"init_difficulty": 0, "min_difficulty": 0, "max_difficulty": 10}
)
joint_pos_unoise_min_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.proprio.joint_pos.noise.n_min",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": -0.1, "difficulty_term_str": "adr"},
},
)
joint_pos_unoise_max_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.proprio.joint_pos.noise.n_max",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": 0.1, "difficulty_term_str": "adr"},
},
)
joint_vel_unoise_min_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.proprio.joint_vel.noise.n_min",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": -0.2, "difficulty_term_str": "adr"},
},
)
joint_vel_unoise_max_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.proprio.joint_vel.noise.n_max",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": 0.2, "difficulty_term_str": "adr"},
},
)
hand_tips_pos_unoise_min_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.proprio.hand_tips_state_b.noise.n_min",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": -0.01, "difficulty_term_str": "adr"},
},
)
hand_tips_pos_unoise_max_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.proprio.hand_tips_state_b.noise.n_max",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": 0.01, "difficulty_term_str": "adr"},
},
)
object_quat_unoise_min_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.policy.object_quat_b.noise.n_min",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": -0.03, "difficulty_term_str": "adr"},
},
)
object_quat_unoise_max_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.policy.object_quat_b.noise.n_max",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": 0.03, "difficulty_term_str": "adr"},
},
)
object_obs_unoise_min_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.perception.object_point_cloud.noise.n_min",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": -0.01, "difficulty_term_str": "adr"},
},
)
object_obs_unoise_max_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "observations.perception.object_point_cloud.noise.n_max",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {"initial_value": 0.0, "final_value": -0.01, "difficulty_term_str": "adr"},
},
)
gravity_adr = CurrTerm(
func=mdp.modify_term_cfg,
params={
"address": "events.variable_gravity.params.gravity_distribution_params",
"modify_fn": mdp.initial_final_interpolate_fn,
"modify_params": {
"initial_value": ((0.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
"final_value": ((0.0, 0.0, -9.81), (0.0, 0.0, -9.81)),
"difficulty_term_str": "adr",
},
},
)
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""Configurations for the dexsuite environments."""
# We leave this file empty since we don't want to expose any configs in this package directly.
# We still need this file to import the "config" module in the parent package.
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""
Dextra Kuka Allegro environments.
"""
import gymnasium as gym
from . import agents
##
# Register Gym environments.
##
# State Observation
gym.register(
id="Isaac-Dexsuite-Kuka-Allegro-Reorient-v0",
entry_point="isaaclab.envs:ManagerBasedRLEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.dexsuite_kuka_allegro_env_cfg:DexsuiteKukaAllegroReorientEnvCfg",
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
"rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:DexsuiteKukaAllegroPPORunnerCfg",
},
)
gym.register(
id="Isaac-Dexsuite-Kuka-Allegro-Reorient-Play-v0",
entry_point="isaaclab.envs:ManagerBasedRLEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.dexsuite_kuka_allegro_env_cfg:DexsuiteKukaAllegroReorientEnvCfg_PLAY",
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
"rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:DexsuiteKukaAllegroPPORunnerCfg",
},
)
# Dexsuite Lift Environments
gym.register(
id="Isaac-Dexsuite-Kuka-Allegro-Lift-v0",
entry_point="isaaclab.envs:ManagerBasedRLEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.dexsuite_kuka_allegro_env_cfg:DexsuiteKukaAllegroLiftEnvCfg",
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
"rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:DexsuiteKukaAllegroPPORunnerCfg",
},
)
gym.register(
id="Isaac-Dexsuite-Kuka-Allegro-Lift-Play-v0",
entry_point="isaaclab.envs:ManagerBasedRLEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.dexsuite_kuka_allegro_env_cfg:DexsuiteKukaAllegroLiftEnvCfg_PLAY",
"rl_games_cfg_entry_point": f"{agents.__name__}:rl_games_ppo_cfg.yaml",
"rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:DexsuiteKukaAllegroPPORunnerCfg",
},
)
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
params:
seed: 42
# environment wrapper clipping
env:
clip_observations: 100.0
clip_actions: 100.0
obs_groups:
obs: ["policy", "proprio", "perception"]
states: ["policy", "proprio", "perception"]
concate_obs_groups: True
algo:
name: a2c_continuous
model:
name: continuous_a2c_logstd
network:
name: actor_critic
separate: True
space:
continuous:
mu_activation: None
sigma_activation: None
mu_init:
name: default
sigma_init:
name: const_initializer
val: 0
fixed_sigma: True
mlp:
units: [512, 256, 128]
activation: elu
d2rl: False
initializer:
name: default
regularizer:
name: None
load_checkpoint: False # flag which sets whether to load the checkpoint
load_path: '' # path to the checkpoint to load
config:
name: reorient
env_name: rlgpu
device: 'cuda:0'
device_name: 'cuda:0'
multi_gpu: False
ppo: True
mixed_precision: False
normalize_input: True
normalize_value: True
value_bootstrap: False
num_actors: -1
reward_shaper:
scale_value: 0.01
normalize_advantage: True
gamma: 0.99
tau: 0.95
learning_rate: 1e-3
lr_schedule: adaptive
schedule_type: legacy
kl_threshold: 0.01
score_to_win: 100000000
max_epochs: 750000
save_best_after: 100
save_frequency: 50
print_stats: True
grad_norm: 1.0
entropy_coef: 0.001
truncate_grads: True
e_clip: 0.2
horizon_length: 36
minibatch_size: 36864
mini_epochs: 5
critic_coef: 4
clip_value: True
clip_actions: False
seq_len: 4
bounds_loss_coef: 0.0001
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab.utils import configclass
from isaaclab_rl.rsl_rl import RslRlOnPolicyRunnerCfg, RslRlPpoActorCriticCfg, RslRlPpoAlgorithmCfg
@configclass
class DexsuiteKukaAllegroPPORunnerCfg(RslRlOnPolicyRunnerCfg):
num_steps_per_env = 32
obs_groups = {"policy": ["policy", "proprio", "perception"], "critic": ["policy", "proprio", "perception"]}
max_iterations = 15000
save_interval = 250
experiment_name = "dexsuite_kuka_allegro"
policy = RslRlPpoActorCriticCfg(
init_noise_std=1.0,
actor_obs_normalization=True,
critic_obs_normalization=True,
actor_hidden_dims=[512, 256, 128],
critic_hidden_dims=[512, 256, 128],
activation="elu",
)
algorithm = RslRlPpoAlgorithmCfg(
value_loss_coef=1.0,
use_clipped_value_loss=True,
clip_param=0.2,
entropy_coef=0.005,
num_learning_epochs=5,
num_mini_batches=4,
learning_rate=1.0e-3,
schedule="adaptive",
gamma=0.99,
lam=0.95,
desired_kl=0.01,
max_grad_norm=1.0,
)
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab_assets.robots import KUKA_ALLEGRO_CFG
from isaaclab.managers import ObservationTermCfg as ObsTerm
from isaaclab.managers import RewardTermCfg as RewTerm
from isaaclab.managers import SceneEntityCfg
from isaaclab.sensors import ContactSensorCfg
from isaaclab.utils import configclass
from ... import dexsuite_env_cfg as dexsuite
from ... import mdp
@configclass
class KukaAllegroRelJointPosActionCfg:
action = mdp.RelativeJointPositionActionCfg(asset_name="robot", joint_names=[".*"], scale=0.1)
@configclass
class KukaAllegroReorientRewardCfg(dexsuite.RewardsCfg):
# bool awarding term if 2 finger tips are in contact with object, one of the contacting fingers has to be thumb.
good_finger_contact = RewTerm(
func=mdp.contacts,
weight=0.5,
params={"threshold": 1.0},
)
@configclass
class KukaAllegroMixinCfg:
rewards: KukaAllegroReorientRewardCfg = KukaAllegroReorientRewardCfg()
actions: KukaAllegroRelJointPosActionCfg = KukaAllegroRelJointPosActionCfg()
def __post_init__(self: dexsuite.DexsuiteReorientEnvCfg):
super().__post_init__()
self.commands.object_pose.body_name = "palm_link"
self.scene.robot = KUKA_ALLEGRO_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
finger_tip_body_list = ["index_link_3", "middle_link_3", "ring_link_3", "thumb_link_3"]
for link_name in finger_tip_body_list:
setattr(
self.scene,
f"{link_name}_object_s",
ContactSensorCfg(
prim_path="{ENV_REGEX_NS}/Robot/ee_link/" + link_name,
filter_prim_paths_expr=["{ENV_REGEX_NS}/Object"],
),
)
self.observations.proprio.contact = ObsTerm(
func=mdp.fingers_contact_force_b,
params={"contact_sensor_names": [f"{link}_object_s" for link in finger_tip_body_list]},
clip=(-20.0, 20.0), # contact force in finger tips is under 20N normally
)
self.observations.proprio.hand_tips_state_b.params["body_asset_cfg"].body_names = ["palm_link", ".*_tip"]
self.rewards.fingers_to_object.params["asset_cfg"] = SceneEntityCfg("robot", body_names=["palm_link", ".*_tip"])
@configclass
class DexsuiteKukaAllegroReorientEnvCfg(KukaAllegroMixinCfg, dexsuite.DexsuiteReorientEnvCfg):
pass
@configclass
class DexsuiteKukaAllegroReorientEnvCfg_PLAY(KukaAllegroMixinCfg, dexsuite.DexsuiteReorientEnvCfg_PLAY):
pass
@configclass
class DexsuiteKukaAllegroLiftEnvCfg(KukaAllegroMixinCfg, dexsuite.DexsuiteLiftEnvCfg):
pass
@configclass
class DexsuiteKukaAllegroLiftEnvCfg_PLAY(KukaAllegroMixinCfg, dexsuite.DexsuiteLiftEnvCfg_PLAY):
pass
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab.envs.mdp import * # noqa: F401, F403
from .commands import * # noqa: F401, F403
from .curriculums import * # noqa: F401, F403
from .observations import * # noqa: F401, F403
from .rewards import * # noqa: F401, F403
from .terminations import * # noqa: F401, F403
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from .pose_commands_cfg import * # noqa: F401, F403
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""Sub-module containing command generators for pose tracking."""
from __future__ import annotations
import torch
from collections.abc import Sequence
from typing import TYPE_CHECKING
from isaaclab.assets import Articulation, RigidObject
from isaaclab.managers import CommandTerm
from isaaclab.markers import VisualizationMarkers
from isaaclab.utils.math import combine_frame_transforms, compute_pose_error, quat_from_euler_xyz, quat_unique
if TYPE_CHECKING:
from isaaclab.envs import ManagerBasedEnv
from . import pose_commands_cfg as dex_cmd_cfgs
class ObjectUniformPoseCommand(CommandTerm):
"""Uniform pose command generator for an object (in the robot base frame).
This command term samples target object poses by:
• Drawing (x, y, z) uniformly within configured Cartesian bounds, and
• Drawing roll-pitch-yaw uniformly within configured ranges, then converting
to a quaternion (w, x, y, z). Optionally makes quaternions unique by enforcing
a positive real part.
Frames:
Targets are defined in the robot's *base frame*. For metrics/visualization,
targets are transformed into the *world frame* using the robot root pose.
Outputs:
The command buffer has shape (num_envs, 7): `(x, y, z, qw, qx, qy, qz)`.
Metrics:
`position_error` and `orientation_error` are computed between the commanded
world-frame pose and the object's current world-frame pose.
Config:
`cfg` must provide the sampling ranges, whether to enforce quaternion uniqueness,
and optional visualization settings.
"""
cfg: dex_cmd_cfgs.ObjectUniformPoseCommandCfg
"""Configuration for the command generator."""
def __init__(self, cfg: dex_cmd_cfgs.ObjectUniformPoseCommandCfg, env: ManagerBasedEnv):
"""Initialize the command generator class.
Args:
cfg: The configuration parameters for the command generator.
env: The environment object.
"""
# initialize the base class
super().__init__(cfg, env)
# extract the robot and body index for which the command is generated
self.robot: Articulation = env.scene[cfg.asset_name]
self.object: RigidObject = env.scene[cfg.object_name]
self.success_vis_asset: RigidObject = env.scene[cfg.success_vis_asset_name]
# create buffers
# -- commands: (x, y, z, qw, qx, qy, qz) in root frame
self.pose_command_b = torch.zeros(self.num_envs, 7, device=self.device)
self.pose_command_b[:, 3] = 1.0
self.pose_command_w = torch.zeros_like(self.pose_command_b)
# -- metrics
self.metrics["position_error"] = torch.zeros(self.num_envs, device=self.device)
self.metrics["orientation_error"] = torch.zeros(self.num_envs, device=self.device)
self.success_visualizer = VisualizationMarkers(self.cfg.success_visualizer_cfg)
self.success_visualizer.set_visibility(True)
def __str__(self) -> str:
msg = "UniformPoseCommand:\n"
msg += f"\tCommand dimension: {tuple(self.command.shape[1:])}\n"
msg += f"\tResampling time range: {self.cfg.resampling_time_range}\n"
return msg
"""
Properties
"""
@property
def command(self) -> torch.Tensor:
"""The desired pose command. Shape is (num_envs, 7).
The first three elements correspond to the position, followed by the quaternion orientation in (w, x, y, z).
"""
return self.pose_command_b
"""
Implementation specific functions.
"""
def _update_metrics(self):
# transform command from base frame to simulation world frame
self.pose_command_w[:, :3], self.pose_command_w[:, 3:] = combine_frame_transforms(
self.robot.data.root_pos_w,
self.robot.data.root_quat_w,
self.pose_command_b[:, :3],
self.pose_command_b[:, 3:],
)
# compute the error
pos_error, rot_error = compute_pose_error(
self.pose_command_w[:, :3],
self.pose_command_w[:, 3:],
self.object.data.root_state_w[:, :3],
self.object.data.root_state_w[:, 3:7],
)
self.metrics["position_error"] = torch.norm(pos_error, dim=-1)
self.metrics["orientation_error"] = torch.norm(rot_error, dim=-1)
success_id = self.metrics["position_error"] < 0.05
if not self.cfg.position_only:
success_id &= self.metrics["orientation_error"] < 0.5
self.success_visualizer.visualize(self.success_vis_asset.data.root_pos_w, marker_indices=success_id.int())
def _resample_command(self, env_ids: Sequence[int]):
# sample new pose targets
# -- position
r = torch.empty(len(env_ids), device=self.device)
self.pose_command_b[env_ids, 0] = r.uniform_(*self.cfg.ranges.pos_x)
self.pose_command_b[env_ids, 1] = r.uniform_(*self.cfg.ranges.pos_y)
self.pose_command_b[env_ids, 2] = r.uniform_(*self.cfg.ranges.pos_z)
# -- orientation
euler_angles = torch.zeros_like(self.pose_command_b[env_ids, :3])
euler_angles[:, 0].uniform_(*self.cfg.ranges.roll)
euler_angles[:, 1].uniform_(*self.cfg.ranges.pitch)
euler_angles[:, 2].uniform_(*self.cfg.ranges.yaw)
quat = quat_from_euler_xyz(euler_angles[:, 0], euler_angles[:, 1], euler_angles[:, 2])
# make sure the quaternion has real part as positive
self.pose_command_b[env_ids, 3:] = quat_unique(quat) if self.cfg.make_quat_unique else quat
def _update_command(self):
pass
def _set_debug_vis_impl(self, debug_vis: bool):
# create markers if necessary for the first tome
if debug_vis:
if not hasattr(self, "goal_visualizer"):
# -- goal pose
self.goal_visualizer = VisualizationMarkers(self.cfg.goal_pose_visualizer_cfg)
# -- current body pose
self.curr_visualizer = VisualizationMarkers(self.cfg.curr_pose_visualizer_cfg)
# set their visibility to true
self.goal_visualizer.set_visibility(True)
self.curr_visualizer.set_visibility(True)
else:
if hasattr(self, "goal_visualizer"):
self.goal_visualizer.set_visibility(False)
self.curr_visualizer.set_visibility(False)
def _debug_vis_callback(self, event):
# check if robot is initialized
# note: this is needed in-case the robot is de-initialized. we can't access the data
if not self.robot.is_initialized:
return
# update the markers
if not self.cfg.position_only:
# -- goal pose
self.goal_visualizer.visualize(self.pose_command_w[:, :3], self.pose_command_w[:, 3:])
# -- current object pose
self.curr_visualizer.visualize(self.object.data.root_pos_w, self.object.data.root_quat_w)
else:
distance = torch.norm(self.pose_command_w[:, :3] - self.object.data.root_pos_w[:, :3], dim=1)
success_id = (distance < 0.05).int()
# note: since marker indices for position is 1(far) and 2(near), we can simply shift the success_id by 1.
# -- goal position
self.goal_visualizer.visualize(self.pose_command_w[:, :3], marker_indices=success_id + 1)
# -- current object position
self.curr_visualizer.visualize(self.object.data.root_pos_w, marker_indices=success_id + 1)
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from dataclasses import MISSING
import isaaclab.sim as sim_utils
from isaaclab.managers import CommandTermCfg
from isaaclab.markers import VisualizationMarkersCfg
from isaaclab.utils import configclass
from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR
from . import pose_commands as dex_cmd
ALIGN_MARKER_CFG = VisualizationMarkersCfg(
markers={
"frame": sim_utils.UsdFileCfg(
usd_path=f"{ISAAC_NUCLEUS_DIR}/Props/UIElements/frame_prim.usd",
scale=(0.1, 0.1, 0.1),
),
"position_far": sim_utils.SphereCfg(
radius=0.01,
visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(1.0, 0.0, 0.0)),
),
"position_near": sim_utils.SphereCfg(
radius=0.01,
visual_material=sim_utils.PreviewSurfaceCfg(diffuse_color=(0.0, 1.0, 0.0)),
),
}
)
@configclass
class ObjectUniformPoseCommandCfg(CommandTermCfg):
"""Configuration for uniform pose command generator."""
class_type: type = dex_cmd.ObjectUniformPoseCommand
asset_name: str = MISSING
"""Name of the coordinate referencing asset in the environment for which the commands are generated respect to."""
object_name: str = MISSING
"""Name of the object in the environment for which the commands are generated."""
make_quat_unique: bool = False
"""Whether to make the quaternion unique or not. Defaults to False.
If True, the quaternion is made unique by ensuring the real part is positive.
"""
@configclass
class Ranges:
"""Uniform distribution ranges for the pose commands."""
pos_x: tuple[float, float] = MISSING
"""Range for the x position (in m)."""
pos_y: tuple[float, float] = MISSING
"""Range for the y position (in m)."""
pos_z: tuple[float, float] = MISSING
"""Range for the z position (in m)."""
roll: tuple[float, float] = MISSING
"""Range for the roll angle (in rad)."""
pitch: tuple[float, float] = MISSING
"""Range for the pitch angle (in rad)."""
yaw: tuple[float, float] = MISSING
"""Range for the yaw angle (in rad)."""
ranges: Ranges = MISSING
"""Ranges for the commands."""
position_only: bool = True
"""Command goal position only. Command includes goal quat if False"""
# Pose Markers
goal_pose_visualizer_cfg: VisualizationMarkersCfg = ALIGN_MARKER_CFG.replace(prim_path="/Visuals/Command/goal_pose")
"""The configuration for the goal pose visualization marker. Defaults to FRAME_MARKER_CFG."""
curr_pose_visualizer_cfg: VisualizationMarkersCfg = ALIGN_MARKER_CFG.replace(prim_path="/Visuals/Command/body_pose")
"""The configuration for the current pose visualization marker. Defaults to FRAME_MARKER_CFG."""
success_vis_asset_name: str = MISSING
"""Name of the asset in the environment for which the success color are indicated."""
# success markers
success_visualizer_cfg = VisualizationMarkersCfg(prim_path="/Visuals/SuccessMarkers", markers={})
"""The configuration for the success visualization marker. User needs to add the markers"""
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
import torch
from collections.abc import Sequence
from typing import TYPE_CHECKING
from isaaclab.assets import Articulation, RigidObject
from isaaclab.envs import mdp
from isaaclab.managers import ManagerTermBase, SceneEntityCfg
from isaaclab.utils.math import combine_frame_transforms, compute_pose_error
if TYPE_CHECKING:
from isaaclab.envs import ManagerBasedRLEnv
def initial_final_interpolate_fn(env: ManagerBasedRLEnv, env_id, data, initial_value, final_value, difficulty_term_str):
"""
Interpolate between initial value iv and final value fv, for any arbitrarily
nested structure of lists/tuples in 'data'. Scalars (int/float) are handled
at the leaves.
"""
# get the fraction scalar on the device
difficulty_term: DifficultyScheduler = getattr(env.curriculum_manager.cfg, difficulty_term_str).func
frac = difficulty_term.difficulty_frac
if frac < 0.1:
# no-op during start, since the difficulty fraction near 0 is wasting of resource.
return mdp.modify_env_param.NO_CHANGE
# convert iv/fv to tensors, but we'll peel them apart in recursion
initial_value_tensor = torch.tensor(initial_value, device=env.device)
final_value_tensor = torch.tensor(final_value, device=env.device)
return _recurse(initial_value_tensor.tolist(), final_value_tensor.tolist(), data, frac)
def _recurse(iv_elem, fv_elem, data_elem, frac):
# If it's a sequence, rebuild the same type with each element recursed
if isinstance(data_elem, Sequence) and not isinstance(data_elem, (str, bytes)):
# Note: we assume initial value element and final value element have the same structure as data
return type(data_elem)(_recurse(iv_e, fv_e, d_e, frac) for iv_e, fv_e, d_e in zip(iv_elem, fv_elem, data_elem))
# Otherwise it's a leaf scalar: do the interpolation
new_val = frac * (fv_elem - iv_elem) + iv_elem
if isinstance(data_elem, int):
return int(new_val.item())
else:
# cast floats or any numeric
return new_val.item()
class DifficultyScheduler(ManagerTermBase):
"""Adaptive difficulty scheduler for curriculum learning.
Tracks per-environment difficulty levels and adjusts them based on task performance. Difficulty increases when
position/orientation errors fall below given tolerances, and decreases otherwise (unless `promotion_only` is set).
The normalized average difficulty across environments is exposed as `difficulty_frac` for use in curriculum
interpolation.
Args:
cfg: Configuration object specifying scheduler parameters.
env: The manager-based RL environment.
"""
def __init__(self, cfg, env):
super().__init__(cfg, env)
init_difficulty = self.cfg.params.get("init_difficulty", 0)
self.current_adr_difficulties = torch.ones(env.num_envs, device=env.device) * init_difficulty
self.difficulty_frac = 0
def get_state(self):
return self.current_adr_difficulties
def set_state(self, state: torch.Tensor):
self.current_adr_difficulties = state.clone().to(self._env.device)
def __call__(
self,
env: ManagerBasedRLEnv,
env_ids: Sequence[int],
asset_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
pos_tol: float = 0.1,
rot_tol: float | None = None,
init_difficulty: int = 0,
min_difficulty: int = 0,
max_difficulty: int = 50,
promotion_only: bool = False,
):
asset: Articulation = env.scene[asset_cfg.name]
object: RigidObject = env.scene[object_cfg.name]
command = env.command_manager.get_command("object_pose")
des_pos_w, des_quat_w = combine_frame_transforms(
asset.data.root_pos_w[env_ids], asset.data.root_quat_w[env_ids], command[env_ids, :3], command[env_ids, 3:7]
)
pos_err, rot_err = compute_pose_error(
des_pos_w, des_quat_w, object.data.root_pos_w[env_ids], object.data.root_quat_w[env_ids]
)
pos_dist = torch.norm(pos_err, dim=1)
rot_dist = torch.norm(rot_err, dim=1)
move_up = (pos_dist < pos_tol) & (rot_dist < rot_tol) if rot_tol else pos_dist < pos_tol
demot = self.current_adr_difficulties[env_ids] if promotion_only else self.current_adr_difficulties[env_ids] - 1
self.current_adr_difficulties[env_ids] = torch.where(
move_up,
self.current_adr_difficulties[env_ids] + 1,
demot,
).clamp(min=min_difficulty, max=max_difficulty)
self.difficulty_frac = torch.mean(self.current_adr_difficulties) / max(max_difficulty, 1)
return self.difficulty_frac
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
import torch
from typing import TYPE_CHECKING
from isaaclab.assets import Articulation, RigidObject
from isaaclab.managers import ManagerTermBase, SceneEntityCfg
from isaaclab.utils.math import quat_apply, quat_apply_inverse, quat_inv, quat_mul, subtract_frame_transforms
from .utils import sample_object_point_cloud
if TYPE_CHECKING:
from isaaclab.envs import ManagerBasedRLEnv
def object_pos_b(
env: ManagerBasedRLEnv,
robot_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
):
"""Object position in the robot's root frame.
Args:
env: The environment.
robot_cfg: Scene entity for the robot (reference frame). Defaults to ``SceneEntityCfg("robot")``.
object_cfg: Scene entity for the object. Defaults to ``SceneEntityCfg("object")``.
Returns:
Tensor of shape ``(num_envs, 3)``: object position [x, y, z] expressed in the robot root frame.
"""
robot: RigidObject = env.scene[robot_cfg.name]
object: RigidObject = env.scene[object_cfg.name]
return quat_apply_inverse(robot.data.root_quat_w, object.data.root_pos_w - robot.data.root_pos_w)
def object_quat_b(
env: ManagerBasedRLEnv,
robot_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
) -> torch.Tensor:
"""Object orientation in the robot's root frame.
Args:
env: The environment.
robot_cfg: Scene entity for the robot (reference frame). Defaults to ``SceneEntityCfg("robot")``.
object_cfg: Scene entity for the object. Defaults to ``SceneEntityCfg("object")``.
Returns:
Tensor of shape ``(num_envs, 4)``: object quaternion ``(w, x, y, z)`` in the robot root frame.
"""
robot: RigidObject = env.scene[robot_cfg.name]
object: RigidObject = env.scene[object_cfg.name]
return quat_mul(quat_inv(robot.data.root_quat_w), object.data.root_quat_w)
def body_state_b(
env: ManagerBasedRLEnv,
body_asset_cfg: SceneEntityCfg,
base_asset_cfg: SceneEntityCfg,
) -> torch.Tensor:
"""Body state (pos, quat, lin vel, ang vel) in the base asset's root frame.
The state for each body is stacked horizontally as
``[position(3), quaternion(4)(wxyz), linvel(3), angvel(3)]`` and then concatenated over bodies.
Args:
env: The environment.
body_asset_cfg: Scene entity for the articulated body whose links are observed.
base_asset_cfg: Scene entity providing the reference (root) frame.
Returns:
Tensor of shape ``(num_envs, num_bodies * 13)`` with per-body states expressed in the base root frame.
"""
body_asset: Articulation = env.scene[body_asset_cfg.name]
base_asset: Articulation = env.scene[base_asset_cfg.name]
# get world pose of bodies
body_pos_w = body_asset.data.body_pos_w[:, body_asset_cfg.body_ids].view(-1, 3)
body_quat_w = body_asset.data.body_quat_w[:, body_asset_cfg.body_ids].view(-1, 4)
body_lin_vel_w = body_asset.data.body_lin_vel_w[:, body_asset_cfg.body_ids].view(-1, 3)
body_ang_vel_w = body_asset.data.body_ang_vel_w[:, body_asset_cfg.body_ids].view(-1, 3)
num_bodies = int(body_pos_w.shape[0] / env.num_envs)
# get world pose of base frame
root_pos_w = base_asset.data.root_link_pos_w.unsqueeze(1).repeat_interleave(num_bodies, dim=1).view(-1, 3)
root_quat_w = base_asset.data.root_link_quat_w.unsqueeze(1).repeat_interleave(num_bodies, dim=1).view(-1, 4)
# transform from world body pose to local body pose
body_pos_b, body_quat_b = subtract_frame_transforms(root_pos_w, root_quat_w, body_pos_w, body_quat_w)
body_lin_vel_b = quat_apply_inverse(root_quat_w, body_lin_vel_w)
body_ang_vel_b = quat_apply_inverse(root_quat_w, body_ang_vel_w)
# concate and return
out = torch.cat((body_pos_b, body_quat_b, body_lin_vel_b, body_ang_vel_b), dim=1)
return out.view(env.num_envs, -1)
class object_point_cloud_b(ManagerTermBase):
"""Object surface point cloud expressed in a reference asset's root frame.
Points are pre-sampled on the object's surface in its local frame and transformed to world,
then into the reference (e.g., robot) root frame. Optionally visualizes the points.
Args (from ``cfg.params``):
object_cfg: Scene entity for the object to sample. Defaults to ``SceneEntityCfg("object")``.
ref_asset_cfg: Scene entity providing the reference frame. Defaults to ``SceneEntityCfg("robot")``.
num_points: Number of points to sample on the object surface. Defaults to ``10``.
visualize: Whether to draw markers for the points. Defaults to ``True``.
static: If ``True``, cache world-space points on reset and reuse them (no per-step resampling).
Returns (from ``__call__``):
If ``flatten=False``: tensor of shape ``(num_envs, num_points, 3)``.
If ``flatten=True``: tensor of shape ``(num_envs, 3 * num_points)``.
"""
def __init__(self, cfg, env: ManagerBasedRLEnv):
super().__init__(cfg, env)
self.object_cfg: SceneEntityCfg = cfg.params.get("object_cfg", SceneEntityCfg("object"))
self.ref_asset_cfg: SceneEntityCfg = cfg.params.get("ref_asset_cfg", SceneEntityCfg("robot"))
num_points: int = cfg.params.get("num_points", 10)
self.object: RigidObject = env.scene[self.object_cfg.name]
self.ref_asset: Articulation = env.scene[self.ref_asset_cfg.name]
# lazy initialize visualizer and point cloud
if cfg.params.get("visualize", True):
from isaaclab.markers import VisualizationMarkers
from isaaclab.markers.config import RAY_CASTER_MARKER_CFG
ray_cfg = RAY_CASTER_MARKER_CFG.replace(prim_path="/Visuals/ObservationPointCloud")
ray_cfg.markers["hit"].radius = 0.0025
self.visualizer = VisualizationMarkers(ray_cfg)
self.points_local = sample_object_point_cloud(
env.num_envs, num_points, self.object.cfg.prim_path, device=env.device
)
self.points_w = torch.zeros_like(self.points_local)
def __call__(
self,
env: ManagerBasedRLEnv,
ref_asset_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
num_points: int = 10,
flatten: bool = False,
visualize: bool = True,
):
"""Compute the object point cloud in the reference asset's root frame.
Note:
Points are pre-sampled at initialization using ``self.num_points``; the ``num_points`` argument is
kept for API symmetry and does not change the sampled set at runtime.
Args:
env: The environment.
ref_asset_cfg: Reference frame provider (root). Defaults to ``SceneEntityCfg("robot")``.
object_cfg: Object to sample. Defaults to ``SceneEntityCfg("object")``.
num_points: Unused at runtime; see note above.
flatten: If ``True``, return a flattened tensor ``(num_envs, 3 * num_points)``.
visualize: If ``True``, draw markers for the points.
Returns:
Tensor of shape ``(num_envs, num_points, 3)`` or flattened if requested.
"""
ref_pos_w = self.ref_asset.data.root_pos_w.unsqueeze(1).repeat(1, num_points, 1)
ref_quat_w = self.ref_asset.data.root_quat_w.unsqueeze(1).repeat(1, num_points, 1)
object_pos_w = self.object.data.root_pos_w.unsqueeze(1).repeat(1, num_points, 1)
object_quat_w = self.object.data.root_quat_w.unsqueeze(1).repeat(1, num_points, 1)
# apply rotation + translation
self.points_w = quat_apply(object_quat_w, self.points_local) + object_pos_w
if visualize:
self.visualizer.visualize(translations=self.points_w.view(-1, 3))
object_point_cloud_pos_b, _ = subtract_frame_transforms(ref_pos_w, ref_quat_w, self.points_w, None)
return object_point_cloud_pos_b.view(env.num_envs, -1) if flatten else object_point_cloud_pos_b
def fingers_contact_force_b(
env: ManagerBasedRLEnv,
contact_sensor_names: list[str],
asset_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
) -> torch.Tensor:
"""base-frame contact forces from listed sensors, concatenated per env.
Args:
env: The environment.
contact_sensor_names: Names of contact sensors in ``env.scene.sensors`` to read.
Returns:
Tensor of shape ``(num_envs, 3 * num_sensors)`` with forces stacked horizontally as
``[fx, fy, fz]`` per sensor.
"""
force_w = [env.scene.sensors[name].data.force_matrix_w.view(env.num_envs, 3) for name in contact_sensor_names]
force_w = torch.stack(force_w, dim=1)
robot: Articulation = env.scene[asset_cfg.name]
forces_b = quat_apply_inverse(robot.data.root_link_quat_w.unsqueeze(1).repeat(1, force_w.shape[1], 1), force_w)
return forces_b
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
import torch
from typing import TYPE_CHECKING
from isaaclab.assets import RigidObject
from isaaclab.managers import SceneEntityCfg
from isaaclab.sensors import ContactSensor
from isaaclab.utils import math as math_utils
from isaaclab.utils.math import combine_frame_transforms, compute_pose_error
if TYPE_CHECKING:
from isaaclab.envs import ManagerBasedRLEnv
def action_rate_l2_clamped(env: ManagerBasedRLEnv) -> torch.Tensor:
"""Penalize the rate of change of the actions using L2 squared kernel."""
return torch.sum(torch.square(env.action_manager.action - env.action_manager.prev_action), dim=1).clamp(-1000, 1000)
def action_l2_clamped(env: ManagerBasedRLEnv) -> torch.Tensor:
"""Penalize the actions using L2 squared kernel."""
return torch.sum(torch.square(env.action_manager.action), dim=1).clamp(-1000, 1000)
def object_ee_distance(
env: ManagerBasedRLEnv,
std: float,
object_cfg: SceneEntityCfg = SceneEntityCfg("object"),
asset_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
) -> torch.Tensor:
"""Reward reaching the object using a tanh-kernel on end-effector distance.
The reward is close to 1 when the maximum distance between the object and any end-effector body is small.
"""
asset: RigidObject = env.scene[asset_cfg.name]
object: RigidObject = env.scene[object_cfg.name]
asset_pos = asset.data.body_pos_w[:, asset_cfg.body_ids]
object_pos = object.data.root_pos_w
object_ee_distance = torch.norm(asset_pos - object_pos[:, None, :], dim=-1).max(dim=-1).values
return 1 - torch.tanh(object_ee_distance / std)
def contacts(env: ManagerBasedRLEnv, threshold: float) -> torch.Tensor:
"""Penalize undesired contacts as the number of violations that are above a threshold."""
thumb_contact_sensor: ContactSensor = env.scene.sensors["thumb_link_3_object_s"]
index_contact_sensor: ContactSensor = env.scene.sensors["index_link_3_object_s"]
middle_contact_sensor: ContactSensor = env.scene.sensors["middle_link_3_object_s"]
ring_contact_sensor: ContactSensor = env.scene.sensors["ring_link_3_object_s"]
# check if contact force is above threshold
thumb_contact = thumb_contact_sensor.data.force_matrix_w.view(env.num_envs, 3)
index_contact = index_contact_sensor.data.force_matrix_w.view(env.num_envs, 3)
middle_contact = middle_contact_sensor.data.force_matrix_w.view(env.num_envs, 3)
ring_contact = ring_contact_sensor.data.force_matrix_w.view(env.num_envs, 3)
thumb_contact_mag = torch.norm(thumb_contact, dim=-1)
index_contact_mag = torch.norm(index_contact, dim=-1)
middle_contact_mag = torch.norm(middle_contact, dim=-1)
ring_contact_mag = torch.norm(ring_contact, dim=-1)
good_contact_cond1 = (thumb_contact_mag > threshold) & (
(index_contact_mag > threshold) | (middle_contact_mag > threshold) | (ring_contact_mag > threshold)
)
return good_contact_cond1
def success_reward(
env: ManagerBasedRLEnv,
command_name: str,
asset_cfg: SceneEntityCfg,
align_asset_cfg: SceneEntityCfg,
pos_std: float,
rot_std: float | None = None,
) -> torch.Tensor:
"""Reward success by comparing commanded pose to the object pose using tanh kernels on error."""
asset: RigidObject = env.scene[asset_cfg.name]
object: RigidObject = env.scene[align_asset_cfg.name]
command = env.command_manager.get_command(command_name)
des_pos_w, des_quat_w = combine_frame_transforms(
asset.data.root_pos_w, asset.data.root_quat_w, command[:, :3], command[:, 3:7]
)
pos_err, rot_err = compute_pose_error(des_pos_w, des_quat_w, object.data.root_pos_w, object.data.root_quat_w)
pos_dist = torch.norm(pos_err, dim=1)
if not rot_std:
# square is not necessary but this help to keep the final value between having rot_std or not roughly the same
return (1 - torch.tanh(pos_dist / pos_std)) ** 2
rot_dist = torch.norm(rot_err, dim=1)
return (1 - torch.tanh(pos_dist / pos_std)) * (1 - torch.tanh(rot_dist / rot_std))
def position_command_error_tanh(
env: ManagerBasedRLEnv, std: float, command_name: str, asset_cfg: SceneEntityCfg, align_asset_cfg: SceneEntityCfg
) -> torch.Tensor:
"""Reward tracking of commanded position using tanh kernel, gated by contact presence."""
asset: RigidObject = env.scene[asset_cfg.name]
object: RigidObject = env.scene[align_asset_cfg.name]
command = env.command_manager.get_command(command_name)
# obtain the desired and current positions
des_pos_b = command[:, :3]
des_pos_w, _ = combine_frame_transforms(asset.data.root_pos_w, asset.data.root_quat_w, des_pos_b)
distance = torch.norm(object.data.root_pos_w - des_pos_w, dim=1)
return (1 - torch.tanh(distance / std)) * contacts(env, 1.0).float()
def orientation_command_error_tanh(
env: ManagerBasedRLEnv, std: float, command_name: str, asset_cfg: SceneEntityCfg, align_asset_cfg: SceneEntityCfg
) -> torch.Tensor:
"""Reward tracking of commanded orientation using tanh kernel, gated by contact presence."""
asset: RigidObject = env.scene[asset_cfg.name]
object: RigidObject = env.scene[align_asset_cfg.name]
command = env.command_manager.get_command(command_name)
# obtain the desired and current orientations
des_quat_b = command[:, 3:7]
des_quat_w = math_utils.quat_mul(asset.data.root_state_w[:, 3:7], des_quat_b)
quat_distance = math_utils.quat_error_magnitude(object.data.root_quat_w, des_quat_w)
return (1 - torch.tanh(quat_distance / std)) * contacts(env, 1.0).float()
# Copyright (c) 2022-2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""Common functions that can be used to activate certain terminations for the dexsuite task.
The functions can be passed to the :class:`isaaclab.managers.TerminationTermCfg` object to enable
the termination introduced by the function.
"""
from __future__ import annotations
import torch
from typing import TYPE_CHECKING
from isaaclab.assets import Articulation, RigidObject
from isaaclab.managers import SceneEntityCfg
if TYPE_CHECKING:
from isaaclab.envs import ManagerBasedRLEnv
def out_of_bound(
env: ManagerBasedRLEnv,
asset_cfg: SceneEntityCfg = SceneEntityCfg("object"),
in_bound_range: dict[str, tuple[float, float]] = {},
) -> torch.Tensor:
"""Termination condition for the object falls out of bound.
Args:
env: The environment.
asset_cfg: The object configuration. Defaults to SceneEntityCfg("object").
in_bound_range: The range in x, y, z such that the object is considered in range
"""
object: RigidObject = env.scene[asset_cfg.name]
range_list = [in_bound_range.get(key, (0.0, 0.0)) for key in ["x", "y", "z"]]
ranges = torch.tensor(range_list, device=env.device)
object_pos_local = object.data.root_pos_w - env.scene.env_origins
outside_bounds = ((object_pos_local < ranges[:, 0]) | (object_pos_local > ranges[:, 1])).any(dim=1)
return outside_bounds
def abnormal_robot_state(env: ManagerBasedRLEnv, asset_cfg: SceneEntityCfg = SceneEntityCfg("robot")) -> torch.Tensor:
"""Terminating environment when violation of velocity limits detects, this usually indicates unstable physics caused
by very bad, or aggressive action"""
robot: Articulation = env.scene[asset_cfg.name]
return (robot.data.joint_vel.abs() > (robot.data.joint_vel_limits * 2)).any(dim=1)
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment