Unverified Commit 90e5f31a authored by Ashwin Varghese Kuruttukulam's avatar Ashwin Varghese Kuruttukulam Committed by GitHub

Adds task Reach-UR10e, an end-effector tracking environment (#3147)

# Initial Implementation of UR10e Reach Environment for IsaacLab

<!--
Thank you for your interest in sending a pull request. Please make sure
to check the contribution guidelines.

Link:
https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html
-->

This PR introduces a UR10e robot reach environment for IsaacLab,
enabling reinforcement learning-based end-effector pose control using
keypoint-based rewards and domain randomization.

## Summary

Adds a new UR10e reach environment that trains RL agents to control the
robot's end-effector to reach target poses. Uses manager-based RL
framework with 6D keypoint alignment rewards.

### Key Features:
- **UR10e Robot Configuration**: Asset definition for UE10e
- **Keypoint-based Rewards**: 6D pose alignment using multiple keypoints
for precise control
- **Domain Randomization**: Joint position, stiffness, damping, and
friction randomization

## Type of change

- [x] New feature (non-breaking change which adds functionality)
- [x] This change requires a documentation update

## Implementation Details
### Environment Configuration:
- **Observations**: Joint positions, velocities, target pose commands
(19-dim)
- **Actions**: Relative joint position control with 0.0625 scale factor
(6-dim)
- **Rewards**: Keypoint tracking with exponential reward functions
- **Domain Randomization**: Joint offsets (±0.125 rad), stiffness
(0.9-1.1x), damping (0.75-1.5x), friction (0.0-0.1 N⋅m)

### Target Workspace:
- **Position**: Center (0.8875, -0.225, 0.2) ± (0.25, 0.125, 0.1) meters
- **Orientation**: (π, 0, -π/2) ± (π/6, π/6, 2π/3) radians

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

## Usage Example

```python
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Reach-UR10e-v0 --num_envs 1024 --headless
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task Isaac-Reach-UR10e-v0 --num_envs 1 --checkpoint <path_to_checkpoint>
```
Co-authored-by: 's avatarKelly Guo <kellyg@nvidia.com>
parent fb270ab5
...@@ -100,44 +100,48 @@ for the lift-cube environment: ...@@ -100,44 +100,48 @@ for the lift-cube environment:
.. table:: .. table::
:widths: 33 37 30 :widths: 33 37 30
+--------------------+-------------------------+-----------------------------------------------------------------------------+ +----------------------+---------------------------+-----------------------------------------------------------------------------+
| World | Environment ID | Description | | World | Environment ID | Description |
+====================+=========================+=============================================================================+ +======================+===========================+=============================================================================+
| |reach-franka| | |reach-franka-link| | Move the end-effector to a sampled target pose with the Franka robot | | |reach-franka| | |reach-franka-link| | Move the end-effector to a sampled target pose with the Franka robot |
+--------------------+-------------------------+-----------------------------------------------------------------------------+ +----------------------+---------------------------+-----------------------------------------------------------------------------+
| |reach-ur10| | |reach-ur10-link| | Move the end-effector to a sampled target pose with the UR10 robot | | |reach-ur10| | |reach-ur10-link| | Move the end-effector to a sampled target pose with the UR10 robot |
+--------------------+-------------------------+-----------------------------------------------------------------------------+ +----------------------+---------------------------+-----------------------------------------------------------------------------+
| |lift-cube| | |lift-cube-link| | Pick a cube and bring it to a sampled target position with the Franka robot | | |deploy-reach-ur10e| | |deploy-reach-ur10e-link| | Move the end-effector to a sampled target pose with the UR10e robot |
+--------------------+-------------------------+-----------------------------------------------------------------------------+ | | | This policy has been deployed to a real robot |
| |stack-cube| | |stack-cube-link| | Stack three cubes (bottom to top: blue, red, green) with the Franka robot. | +----------------------+---------------------------+-----------------------------------------------------------------------------+
| | | Blueprint env used for the NVIDIA Isaac GR00T blueprint for synthetic | | |lift-cube| | |lift-cube-link| | Pick a cube and bring it to a sampled target position with the Franka robot |
| | |stack-cube-bp-link| | manipulation motion generation | +----------------------+---------------------------+-----------------------------------------------------------------------------+
+--------------------+-------------------------+-----------------------------------------------------------------------------+ | |stack-cube| | |stack-cube-link| | Stack three cubes (bottom to top: blue, red, green) with the Franka robot. |
| |cabi-franka| | |cabi-franka-link| | Grasp the handle of a cabinet's drawer and open it with the Franka robot | | | | Blueprint env used for the NVIDIA Isaac GR00T blueprint for synthetic |
| | | | | | |stack-cube-bp-link| | manipulation motion generation |
| | |franka-direct-link| | | +----------------------+---------------------------+-----------------------------------------------------------------------------+
+--------------------+-------------------------+-----------------------------------------------------------------------------+ | |cabi-franka| | |cabi-franka-link| | Grasp the handle of a cabinet's drawer and open it with the Franka robot |
| |cube-allegro| | |cube-allegro-link| | In-hand reorientation of a cube using Allegro hand | | | | |
| | | | | | |franka-direct-link| | |
| | |allegro-direct-link| | | +----------------------+---------------------------+-----------------------------------------------------------------------------+
+--------------------+-------------------------+-----------------------------------------------------------------------------+ | |cube-allegro| | |cube-allegro-link| | In-hand reorientation of a cube using Allegro hand |
| |cube-shadow| | |cube-shadow-link| | In-hand reorientation of a cube using Shadow hand | | | | |
| | | | | | |allegro-direct-link| | |
| | |cube-shadow-ff-link| | | +----------------------+---------------------------+-----------------------------------------------------------------------------+
| | | | | |cube-shadow| | |cube-shadow-link| | In-hand reorientation of a cube using Shadow hand |
| | |cube-shadow-lstm-link| | | | | | |
+--------------------+-------------------------+-----------------------------------------------------------------------------+ | | |cube-shadow-ff-link| | |
| |cube-shadow| | |cube-shadow-vis-link| | In-hand reorientation of a cube using Shadow hand using perceptive inputs. | | | | |
| | | Requires running with ``--enable_cameras``. | | | |cube-shadow-lstm-link| | |
+--------------------+-------------------------+-----------------------------------------------------------------------------+ +----------------------+---------------------------+-----------------------------------------------------------------------------+
| |gr1_pick_place| | |gr1_pick_place-link| | Pick up and place an object in a basket with a GR-1 humanoid robot | | |cube-shadow| | |cube-shadow-vis-link| | In-hand reorientation of a cube using Shadow hand using perceptive inputs. |
+--------------------+-------------------------+-----------------------------------------------------------------------------+ | | | Requires running with ``--enable_cameras``. |
| |gr1_pp_waist| | |gr1_pp_waist-link| | Pick up and place an object in a basket with a GR-1 humanoid robot | +----------------------+---------------------------+-----------------------------------------------------------------------------+
| | | with waist degrees-of-freedom enables that provides a wider reach space. | | |gr1_pick_place| | |gr1_pick_place-link| | Pick up and place an object in a basket with a GR-1 humanoid robot |
+--------------------+-------------------------+-----------------------------------------------------------------------------+ +----------------------+---------------------------+-----------------------------------------------------------------------------+
| |gr1_pp_waist| | |gr1_pp_waist-link| | Pick up and place an object in a basket with a GR-1 humanoid robot |
| | | with waist degrees-of-freedom enables that provides a wider reach space. |
+----------------------+---------------------------+-----------------------------------------------------------------------------+
.. |reach-franka| image:: ../_static/tasks/manipulation/franka_reach.jpg .. |reach-franka| image:: ../_static/tasks/manipulation/franka_reach.jpg
.. |reach-ur10| image:: ../_static/tasks/manipulation/ur10_reach.jpg .. |reach-ur10| image:: ../_static/tasks/manipulation/ur10_reach.jpg
.. |deploy-reach-ur10e| image:: ../_static/tasks/manipulation/ur10e_reach.jpg
.. |lift-cube| image:: ../_static/tasks/manipulation/franka_lift.jpg .. |lift-cube| image:: ../_static/tasks/manipulation/franka_lift.jpg
.. |cabi-franka| image:: ../_static/tasks/manipulation/franka_open_drawer.jpg .. |cabi-franka| image:: ../_static/tasks/manipulation/franka_open_drawer.jpg
.. |cube-allegro| image:: ../_static/tasks/manipulation/allegro_cube.jpg .. |cube-allegro| image:: ../_static/tasks/manipulation/allegro_cube.jpg
...@@ -148,6 +152,7 @@ for the lift-cube environment: ...@@ -148,6 +152,7 @@ for the lift-cube environment:
.. |reach-franka-link| replace:: `Isaac-Reach-Franka-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/reach/config/franka/joint_pos_env_cfg.py>`__ .. |reach-franka-link| replace:: `Isaac-Reach-Franka-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/reach/config/franka/joint_pos_env_cfg.py>`__
.. |reach-ur10-link| replace:: `Isaac-Reach-UR10-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/reach/config/ur_10/joint_pos_env_cfg.py>`__ .. |reach-ur10-link| replace:: `Isaac-Reach-UR10-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/reach/config/ur_10/joint_pos_env_cfg.py>`__
.. |deploy-reach-ur10e-link| replace:: `Isaac-Deploy-Reach-UR10e-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/deploy/reach/config/ur_10e/joint_pos_env_cfg.py>`__
.. |lift-cube-link| replace:: `Isaac-Lift-Cube-Franka-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/lift/config/franka/joint_pos_env_cfg.py>`__ .. |lift-cube-link| replace:: `Isaac-Lift-Cube-Franka-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/lift/config/franka/joint_pos_env_cfg.py>`__
.. |lift-cube-ik-abs-link| replace:: `Isaac-Lift-Cube-Franka-IK-Abs-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/lift/config/franka/ik_abs_env_cfg.py>`__ .. |lift-cube-ik-abs-link| replace:: `Isaac-Lift-Cube-Franka-IK-Abs-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/lift/config/franka/ik_abs_env_cfg.py>`__
.. |lift-cube-ik-rel-link| replace:: `Isaac-Lift-Cube-Franka-IK-Rel-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/lift/config/franka/ik_rel_env_cfg.py>`__ .. |lift-cube-ik-rel-link| replace:: `Isaac-Lift-Cube-Franka-IK-Rel-v0 <https://github.com/isaac-sim/IsaacLab/blob/main/source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/lift/config/franka/ik_rel_env_cfg.py>`__
...@@ -786,7 +791,7 @@ inferencing, including reading from an already trained checkpoint and disabling ...@@ -786,7 +791,7 @@ inferencing, including reading from an already trained checkpoint and disabling
- -
- Direct - Direct
- -
* - Isaac-Forge-GearMesh-Direct-v0 * - Isaac-Forge-GearMesh-Direct-v0
- -
- Direct - Direct
- **rl_games** (PPO) - **rl_games** (PPO)
...@@ -882,6 +887,10 @@ inferencing, including reading from an already trained checkpoint and disabling ...@@ -882,6 +887,10 @@ inferencing, including reading from an already trained checkpoint and disabling
- Isaac-Reach-UR10-Play-v0 - Isaac-Reach-UR10-Play-v0
- Manager Based - Manager Based
- **rl_games** (PPO), **rsl_rl** (PPO), **skrl** (PPO) - **rl_games** (PPO), **rsl_rl** (PPO), **skrl** (PPO)
* - Isaac-Deploy-Reach-UR10e-v0
- Isaac-Deploy-Reach-UR10e-Play-v0
- Manager Based
- **rsl_rl** (PPO)
* - Isaac-Repose-Cube-Allegro-Direct-v0 * - Isaac-Repose-Cube-Allegro-Direct-v0
- -
- Direct - Direct
......
...@@ -15,7 +15,7 @@ Reference: https://github.com/ros-industrial/universal_robot ...@@ -15,7 +15,7 @@ Reference: https://github.com/ros-industrial/universal_robot
import isaaclab.sim as sim_utils import isaaclab.sim as sim_utils
from isaaclab.actuators import ImplicitActuatorCfg from isaaclab.actuators import ImplicitActuatorCfg
from isaaclab.assets.articulation import ArticulationCfg from isaaclab.assets.articulation import ArticulationCfg
from isaaclab.utils.assets import ISAACLAB_NUCLEUS_DIR from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR, ISAACLAB_NUCLEUS_DIR
## ##
# Configuration # Configuration
...@@ -50,4 +50,55 @@ UR10_CFG = ArticulationCfg( ...@@ -50,4 +50,55 @@ UR10_CFG = ArticulationCfg(
), ),
}, },
) )
UR10e_CFG = ArticulationCfg(
spawn=sim_utils.UsdFileCfg(
usd_path=f"{ISAAC_NUCLEUS_DIR}/Robots/UniversalRobots/ur10e/ur10e.usd",
rigid_props=sim_utils.RigidBodyPropertiesCfg(
disable_gravity=True,
max_depenetration_velocity=5.0,
),
articulation_props=sim_utils.ArticulationRootPropertiesCfg(
enabled_self_collisions=False, solver_position_iteration_count=16, solver_velocity_iteration_count=1
),
activate_contact_sensors=False,
),
init_state=ArticulationCfg.InitialStateCfg(
joint_pos={
"shoulder_pan_joint": 3.141592653589793,
"shoulder_lift_joint": -1.5707963267948966,
"elbow_joint": 1.5707963267948966,
"wrist_1_joint": -1.5707963267948966,
"wrist_2_joint": -1.5707963267948966,
"wrist_3_joint": 0.0,
},
pos=(0.0, 0.0, 0.0),
rot=(1.0, 0.0, 0.0, 0.0),
),
actuators={
# 'shoulder_pan_joint', 'shoulder_lift_joint', 'elbow_joint', 'wrist_1_joint', 'wrist_2_joint', 'wrist_3_joint'
"shoulder": ImplicitActuatorCfg(
joint_names_expr=["shoulder_.*"],
stiffness=1320.0,
damping=72.6636085,
friction=0.0,
armature=0.0,
),
"elbow": ImplicitActuatorCfg(
joint_names_expr=["elbow_joint"],
stiffness=600.0,
damping=34.64101615,
friction=0.0,
armature=0.0,
),
"wrist": ImplicitActuatorCfg(
joint_names_expr=["wrist_.*"],
stiffness=216.0,
damping=29.39387691,
friction=0.0,
armature=0.0,
),
},
)
"""Configuration of UR-10 arm using implicit actuator models.""" """Configuration of UR-10 arm using implicit actuator models."""
[package] [package]
# Note: Semantic Versioning is used: https://semver.org/ # Note: Semantic Versioning is used: https://semver.org/
version = "0.10.47" version = "0.10.48"
# Description # Description
title = "Isaac Lab Environments" title = "Isaac Lab Environments"
......
Changelog Changelog
--------- ---------
0.10.48 (2025-09-03)
~~~~~~~~~~~~~~~~~~~~
Added
^^^^^
* Added ``Isaac-Deploy-Reach-UR10e-v0`` environment.
0.10.47 (2025-07-25) 0.10.47 (2025-07-25)
~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~
......
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""Deployment environments for manipulation tasks.
These environments are designed for real-world deployment of manipulation tasks.
They containconfigurations and implementations that have been tested
and deployed on physical robots.
The deploy module includes:
- Reach environments for end-effector pose tracking
"""
from .reach import * # noqa: F401, F403
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""This sub-module contains the functions that are specific to the locomotion environments."""
from isaaclab.envs.mdp import * # noqa: F401, F403
from .rewards import * # noqa: F401, F403
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from __future__ import annotations
import torch
from typing import TYPE_CHECKING
from isaacsim.core.utils.torch.transformations import tf_combine
from isaaclab.managers import SceneEntityCfg
from isaaclab.sensors.frame_transformer.frame_transformer import FrameTransformer
if TYPE_CHECKING:
from isaaclab.envs import ManagerBasedRLEnv
def get_keypoint_offsets_full_6d(add_cube_center_kp: bool = False, device: torch.device | None = None) -> torch.Tensor:
"""Get keypoints for pose alignment comparison. Pose is aligned if all axis are aligned.
Args:
add_cube_center_kp: Whether to include the center keypoint (0, 0, 0)
device: Device to create the tensor on
Returns:
Keypoint offsets tensor of shape (num_keypoints, 3)
"""
if add_cube_center_kp:
keypoint_corners = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
else:
keypoint_corners = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
keypoint_corners = torch.tensor(keypoint_corners, device=device, dtype=torch.float32)
keypoint_corners = torch.cat((keypoint_corners, -keypoint_corners[-3:]), dim=0) # use both negative and positive
return keypoint_corners
def compute_keypoint_distance(
current_pos: torch.Tensor,
current_quat: torch.Tensor,
target_pos: torch.Tensor,
target_quat: torch.Tensor,
keypoint_scale: float = 1.0,
add_cube_center_kp: bool = True,
device: torch.device | None = None,
) -> torch.Tensor:
"""Compute keypoint distance between current and target poses.
This function creates keypoints from the current and target poses and calculates
the L2 norm distance between corresponding keypoints. The keypoints are created
by applying offsets to the poses and transforming them to world coordinates.
Args:
current_pos: Current position tensor of shape (num_envs, 3)
current_quat: Current quaternion tensor of shape (num_envs, 4)
target_pos: Target position tensor of shape (num_envs, 3)
target_quat: Target quaternion tensor of shape (num_envs, 4)
keypoint_scale: Scale factor for keypoint offsets
add_cube_center_kp: Whether to include the center keypoint (0, 0, 0)
device: Device to create tensors on
Returns:
Keypoint distance tensor of shape (num_envs, num_keypoints) where each element
is the L2 norm distance between corresponding keypoints
"""
if device is None:
device = current_pos.device
num_envs = current_pos.shape[0]
# Get keypoint offsets
keypoint_offsets = get_keypoint_offsets_full_6d(add_cube_center_kp, device)
keypoint_offsets = keypoint_offsets * keypoint_scale
num_keypoints = keypoint_offsets.shape[0]
# Create identity quaternion for transformations
identity_quat = torch.tensor([1.0, 0.0, 0.0, 0.0], device=device).unsqueeze(0).repeat(num_envs, 1)
# Initialize keypoint tensors
keypoints_current = torch.zeros((num_envs, num_keypoints, 3), device=device)
keypoints_target = torch.zeros((num_envs, num_keypoints, 3), device=device)
# Compute keypoints for current and target poses
for idx, keypoint_offset in enumerate(keypoint_offsets):
# Transform keypoint offset to world coordinates for current pose
keypoints_current[:, idx] = tf_combine(
current_quat, current_pos, identity_quat, keypoint_offset.repeat(num_envs, 1)
)[1]
# Transform keypoint offset to world coordinates for target pose
keypoints_target[:, idx] = tf_combine(
target_quat, target_pos, identity_quat, keypoint_offset.repeat(num_envs, 1)
)[1]
# Calculate L2 norm distance between corresponding keypoints
keypoint_dist_sep = torch.norm(keypoints_target - keypoints_current, p=2, dim=-1)
return keypoint_dist_sep
def keypoint_command_error(
env: ManagerBasedRLEnv,
command_name: str,
asset_cfg: SceneEntityCfg,
keypoint_scale: float = 1.0,
add_cube_center_kp: bool = True,
) -> torch.Tensor:
"""Compute keypoint distance between current and desired poses from command.
The function computes the keypoint distance between the current pose of the end effector from
the frame transformer sensor and the desired pose from the command. Keypoints are created by
applying offsets to both poses and the distance is computed as the L2-norm between corresponding keypoints.
Args:
env: The environment containing the asset
command_name: Name of the command containing desired pose
asset_cfg: Configuration of the asset to track (not used, kept for compatibility)
keypoint_scale: Scale factor for keypoint offsets
add_cube_center_kp: Whether to include the center keypoint (0, 0, 0)
Returns:
Keypoint distance tensor of shape (num_envs, num_keypoints) where each element
is the L2 norm distance between corresponding keypoints
"""
# extract the frame transformer sensor
asset: FrameTransformer = env.scene[asset_cfg.name]
command = env.command_manager.get_command(command_name)
# obtain the desired pose from command (position and orientation)
des_pos_b = command[:, :3]
des_quat_b = command[:, 3:7]
# transform desired pose to world frame using source frame from frame transformer
des_pos_w = des_pos_b
des_quat_w = des_quat_b
# get current pose in world frame from frame transformer (end effector pose)
curr_pos_w = asset.data.target_pos_source[:, 0] # First target frame is end_effector
curr_quat_w = asset.data.target_quat_source[:, 0] # First target frame is end_effector
# compute keypoint distance
keypoint_dist_sep = compute_keypoint_distance(
current_pos=curr_pos_w,
current_quat=curr_quat_w,
target_pos=des_pos_w,
target_quat=des_quat_w,
keypoint_scale=keypoint_scale,
add_cube_center_kp=add_cube_center_kp,
device=curr_pos_w.device,
)
# Return mean distance across keypoints to match expected reward shape (num_envs,)
return keypoint_dist_sep.mean(-1)
def keypoint_command_error_exp(
env: ManagerBasedRLEnv,
command_name: str,
asset_cfg: SceneEntityCfg,
kp_exp_coeffs: list[tuple[float, float]] = [(1.0, 0.1)],
kp_use_sum_of_exps: bool = True,
keypoint_scale: float = 1.0,
add_cube_center_kp: bool = True,
) -> torch.Tensor:
"""Compute exponential keypoint reward between current and desired poses from command.
The function computes the keypoint distance between the current pose of the end effector from
the frame transformer sensor and the desired pose from the command, then applies an exponential
reward function. The reward is computed using the formula: 1 / (exp(a * distance) + b + exp(-a * distance))
where a and b are coefficients.
Args:
env: The environment containing the asset
command_name: Name of the command containing desired pose
asset_cfg: Configuration of the asset to track (not used, kept for compatibility)
kp_exp_coeffs: List of (a, b) coefficient pairs for exponential reward
kp_use_sum_of_exps: Whether to use sum of exponentials (True) or single exponential (False)
keypoint_scale: Scale factor for keypoint offsets
add_cube_center_kp: Whether to include the center keypoint (0, 0, 0)
Returns:
Exponential keypoint reward tensor of shape (num_envs,) where each element
is the exponential reward value
"""
# extract the frame transformer sensor
asset: FrameTransformer = env.scene[asset_cfg.name]
command = env.command_manager.get_command(command_name)
# obtain the desired pose from command (position and orientation)
des_pos_b = command[:, :3]
des_quat_b = command[:, 3:7]
# transform desired pose to world frame using source frame from frame transformer
des_pos_w = des_pos_b
des_quat_w = des_quat_b
# get current pose in world frame from frame transformer (end effector pose)
curr_pos_w = asset.data.target_pos_source[:, 0] # First target frame is end_effector
curr_quat_w = asset.data.target_quat_source[:, 0] # First target frame is end_effector
# compute keypoint distance
keypoint_dist_sep = compute_keypoint_distance(
current_pos=curr_pos_w,
current_quat=curr_quat_w,
target_pos=des_pos_w,
target_quat=des_quat_w,
keypoint_scale=keypoint_scale,
add_cube_center_kp=add_cube_center_kp,
device=curr_pos_w.device,
)
# compute exponential reward
keypoint_reward_exp = torch.zeros_like(keypoint_dist_sep[:, 0]) # shape: (num_envs,)
if kp_use_sum_of_exps:
# Use sum of exponentials: average across keypoints for each coefficient
for coeff in kp_exp_coeffs:
a, b = coeff
keypoint_reward_exp += (
1.0 / (torch.exp(a * keypoint_dist_sep) + b + torch.exp(-a * keypoint_dist_sep))
).mean(-1)
else:
# Use single exponential: average keypoint distance first, then apply exponential
keypoint_dist = keypoint_dist_sep.mean(-1) # shape: (num_envs,)
for coeff in kp_exp_coeffs:
a, b = coeff
keypoint_reward_exp += 1.0 / (torch.exp(a * keypoint_dist) + b + torch.exp(-a * keypoint_dist))
return keypoint_reward_exp
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""end-effector pose tracking tasks that have been deployed on a real robot."""
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
"""Configuration package for manipulation tasks that have been deployed on a real robot."""
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
import gymnasium as gym
from . import agents
##
# Register Gym environments.
##
gym.register(
id="Isaac-Deploy-Reach-UR10e-v0",
entry_point="isaaclab.envs:ManagerBasedRLEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.joint_pos_env_cfg:UR10eReachEnvCfg",
"rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:URReachPPORunnerCfg",
},
)
gym.register(
id="Isaac-Deploy-Reach-UR10e-Play-v0",
entry_point="isaaclab.envs:ManagerBasedRLEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.joint_pos_env_cfg:UR10eReachEnvCfg_PLAY",
"rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:URReachPPORunnerCfg",
},
)
gym.register(
id="Isaac-Deploy-Reach-UR10e-ROS-Inference-v0",
entry_point="isaaclab.envs:ManagerBasedRLEnv",
disable_env_checker=True,
kwargs={
"env_cfg_entry_point": f"{__name__}.ros_inference_env_cfg:UR10eReachROSInferenceEnvCfg",
"rsl_rl_cfg_entry_point": f"{agents.__name__}.rsl_rl_ppo_cfg:URReachPPORunnerCfg",
},
)
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab.utils import configclass
from isaaclab_rl.rsl_rl import RslRlOnPolicyRunnerCfg, RslRlPpoActorCriticCfg, RslRlPpoAlgorithmCfg
@configclass
class URReachPPORunnerCfg(RslRlOnPolicyRunnerCfg):
num_steps_per_env = 512
max_iterations = 1500
save_interval = 50
experiment_name = "reach_ur10e"
empirical_normalization = True
policy = RslRlPpoActorCriticCfg(
init_noise_std=1.0,
actor_hidden_dims=[256, 128, 64],
critic_hidden_dims=[256, 128, 64],
activation="elu",
)
algorithm = RslRlPpoAlgorithmCfg(
value_loss_coef=1.0,
use_clipped_value_loss=True,
clip_param=0.2,
entropy_coef=0.0,
num_learning_epochs=8,
num_mini_batches=8,
learning_rate=5.0e-4,
schedule="adaptive",
gamma=0.99,
lam=0.95,
desired_kl=0.008,
max_grad_norm=1.0,
)
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
import math
from isaaclab.managers import SceneEntityCfg
from isaaclab.markers.config import FRAME_MARKER_CFG
from isaaclab.sensors.frame_transformer.frame_transformer_cfg import FrameTransformerCfg, OffsetCfg
from isaaclab.utils import configclass
import isaaclab_tasks.manager_based.manipulation.deploy.mdp as mdp
from isaaclab_tasks.manager_based.manipulation.deploy.reach.reach_env_cfg import ReachEnvCfg
##
# Pre-defined configs
##
from isaaclab_assets import UR10e_CFG # isort: skip
##
# Environment configuration
##
@configclass
class UR10eReachEnvCfg(ReachEnvCfg):
def __post_init__(self):
# post init of parent
super().__post_init__()
self.events.robot_joint_stiffness_and_damping.params["asset_cfg"].joint_names = [
"shoulder_.*",
"elbow_.*",
"wrist_.*",
]
self.events.joint_friction.params["asset_cfg"].joint_names = ["shoulder_.*", "elbow_.*", "wrist_.*"]
# switch robot to ur10e
self.scene.robot = UR10e_CFG.replace(prim_path="{ENV_REGEX_NS}/Robot")
# The real UR10e robots polyscore software uses the "base" frame for reference
# But the USD model and UR10e ROS interface uses the "base_link" frame
# We are training this policy to track the end-effector pose in the "base" frame
# The base frame is 180 offset from the base_link frame
# And hence the source_frame_offset is set to 180 degrees around the z-axis
self.rewards.end_effector_keypoint_tracking.params["asset_cfg"] = SceneEntityCfg("ee_frame_wrt_base_frame")
self.rewards.end_effector_keypoint_tracking_exp.params["asset_cfg"] = SceneEntityCfg("ee_frame_wrt_base_frame")
self.scene.ee_frame_wrt_base_frame = FrameTransformerCfg(
prim_path="{ENV_REGEX_NS}/Robot/base_link",
visualizer_cfg=FRAME_MARKER_CFG.replace(prim_path="/Visuals/FrameTransformer"),
source_frame_offset=OffsetCfg(pos=(0.0, 0.0, 0.0), rot=(0.0, 0.0, 0.0, 1.0)),
target_frames=[
FrameTransformerCfg.FrameCfg(
prim_path="{ENV_REGEX_NS}/Robot/wrist_3_link",
name="end_effector",
),
],
)
# Disable visualization for the goal pose because the commands are generated wrt to the base frame
# But the visualization will visualizing it wrt to the base_link frame
self.commands.ee_pose.debug_vis = False
# Incremental joint position action configuration
self.actions.arm_action = mdp.RelativeJointPositionActionCfg(
asset_name="robot", joint_names=[".*"], scale=0.0625, use_zero_offset=True
)
# override command generator body
# end-effector is along x-direction
self.target_pos_centre = (0.8875, -0.225, 0.2)
self.target_pos_range = (0.25, 0.125, 0.1)
self.commands.ee_pose.body_name = "wrist_3_link"
self.commands.ee_pose.ranges.pos_x = (
self.target_pos_centre[0] - self.target_pos_range[0],
self.target_pos_centre[0] + self.target_pos_range[0],
)
self.commands.ee_pose.ranges.pos_y = (
self.target_pos_centre[1] - self.target_pos_range[1],
self.target_pos_centre[1] + self.target_pos_range[1],
)
self.commands.ee_pose.ranges.pos_z = (
self.target_pos_centre[2] - self.target_pos_range[2],
self.target_pos_centre[2] + self.target_pos_range[2],
)
self.target_rot_centre = (math.pi, 0.0, -math.pi / 2) # end-effector facing down
self.target_rot_range = (math.pi / 6, math.pi / 6, math.pi * 2 / 3)
self.commands.ee_pose.ranges.roll = (
self.target_rot_centre[0] - self.target_rot_range[0],
self.target_rot_centre[0] + self.target_rot_range[0],
)
self.commands.ee_pose.ranges.pitch = (
self.target_rot_centre[1] - self.target_rot_range[1],
self.target_rot_centre[1] + self.target_rot_range[1],
)
self.commands.ee_pose.ranges.yaw = (
self.target_rot_centre[2] - self.target_rot_range[2],
self.target_rot_centre[2] + self.target_rot_range[2],
)
@configclass
class UR10eReachEnvCfg_PLAY(UR10eReachEnvCfg):
def __post_init__(self):
# post init of parent
super().__post_init__()
# make a smaller scene for play
self.scene.num_envs = 50
self.scene.env_spacing = 2.5
# disable randomization for play
self.observations.policy.enable_corruption = False
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from isaaclab.utils import configclass
from .joint_pos_env_cfg import UR10eReachEnvCfg
@configclass
class UR10eReachROSInferenceEnvCfg(UR10eReachEnvCfg):
"""Exposing variables for ROS inferences"""
def __post_init__(self):
# post init of parent
super().__post_init__()
# Variables used by Isaac Manipuulator for on robot inference
# TODO: @ashwinvk: Remove these from env cfg once the generic inference node has been implemented
self.obs_order = ["arm_dof_pos", "arm_dof_vel", "target_pos", "target_quat"]
self.policy_action_space = "joint"
self.arm_joint_names = [
"shoulder_pan_joint",
"shoulder_lift_joint",
"elbow_joint",
"wrist_1_joint",
"wrist_2_joint",
"wrist_3_joint",
]
self.policy_action_space = "joint"
self.action_space = 6
self.state_space = 0
self.observation_space = 19
# Set joint_action_scale from the existing arm_action.scale
self.joint_action_scale = self.actions.arm_action.scale
self.action_scale_joint_space = [
self.joint_action_scale,
self.joint_action_scale,
self.joint_action_scale,
self.joint_action_scale,
self.joint_action_scale,
self.joint_action_scale,
]
# Copyright (c) 2025, The Isaac Lab Project Developers (https://github.com/isaac-sim/IsaacLab/blob/main/CONTRIBUTORS.md).
# All rights reserved.
#
# SPDX-License-Identifier: BSD-3-Clause
from dataclasses import MISSING
import isaaclab.sim as sim_utils
from isaaclab.assets import ArticulationCfg, AssetBaseCfg
from isaaclab.envs import ManagerBasedRLEnvCfg
from isaaclab.managers import ActionTermCfg as ActionTerm
from isaaclab.managers import EventTermCfg as EventTerm
from isaaclab.managers import ObservationGroupCfg as ObsGroup
from isaaclab.managers import ObservationTermCfg as ObsTerm
from isaaclab.managers import RewardTermCfg as RewTerm
from isaaclab.managers import SceneEntityCfg
from isaaclab.managers import TerminationTermCfg as DoneTerm
from isaaclab.scene import InteractiveSceneCfg
from isaaclab.utils import configclass
from isaaclab.utils.assets import ISAAC_NUCLEUS_DIR
from isaaclab.utils.noise import AdditiveUniformNoiseCfg as Unoise
import isaaclab_tasks.manager_based.manipulation.deploy.mdp as mdp
##
# Scene definition
##
@configclass
class SceneCfg(InteractiveSceneCfg):
"""Configuration for the scene with a robotic arm."""
# world
ground = AssetBaseCfg(
prim_path="/World/ground",
spawn=sim_utils.GroundPlaneCfg(),
init_state=AssetBaseCfg.InitialStateCfg(pos=(0.0, 0.0, -1.05)),
)
# robots
robot: ArticulationCfg = MISSING
# lights
light = AssetBaseCfg(
prim_path="/World/light",
spawn=sim_utils.DomeLightCfg(color=(0.75, 0.75, 0.75), intensity=2500.0),
)
table = AssetBaseCfg(
prim_path="{ENV_REGEX_NS}/Table",
spawn=sim_utils.UsdFileCfg(
usd_path=f"{ISAAC_NUCLEUS_DIR}/Props/Mounts/Stand/stand_instanceable.usd", scale=(2.0, 2.0, 2.0)
),
)
##
# MDP settings
##
@configclass
class CommandsCfg:
"""Command terms for the MDP."""
ee_pose = mdp.UniformPoseCommandCfg(
asset_name="robot",
body_name=MISSING,
resampling_time_range=(4.0, 4.0),
debug_vis=True,
ranges=mdp.UniformPoseCommandCfg.Ranges(
pos_x=(0.35, 0.65),
pos_y=(-0.2, 0.2),
pos_z=(0.15, 0.5),
roll=(0.0, 0.0),
pitch=MISSING, # depends on end-effector axis
yaw=(-3.14, 3.14),
),
)
@configclass
class ActionsCfg:
"""Action specifications for the MDP."""
arm_action: ActionTerm = MISSING
gripper_action: ActionTerm | None = None
@configclass
class ObservationsCfg:
"""Observation specifications for the MDP."""
@configclass
class PolicyCfg(ObsGroup):
"""Observations for policy group."""
# observation terms (order preserved)
joint_pos = ObsTerm(func=mdp.joint_pos, noise=Unoise(n_min=-0.0, n_max=0.0))
joint_vel = ObsTerm(func=mdp.joint_vel, noise=Unoise(n_min=-0.0, n_max=0.0))
pose_command = ObsTerm(func=mdp.generated_commands, params={"command_name": "ee_pose"})
def __post_init__(self):
self.enable_corruption = True
self.concatenate_terms = True
# observation groups
policy: PolicyCfg = PolicyCfg()
@configclass
class EventCfg:
"""Configuration for events."""
reset_robot_joints = EventTerm(
func=mdp.reset_joints_by_offset,
mode="reset",
params={
"position_range": (-0.125, 0.125),
"velocity_range": (0.0, 0.0),
},
)
robot_joint_stiffness_and_damping = EventTerm(
func=mdp.randomize_actuator_gains,
min_step_count_between_reset=200,
mode="reset",
params={
"asset_cfg": SceneEntityCfg("robot"),
"stiffness_distribution_params": (0.9, 1.1),
"damping_distribution_params": (0.75, 1.5),
"operation": "scale",
"distribution": "uniform",
},
)
joint_friction = EventTerm(
func=mdp.randomize_joint_parameters,
min_step_count_between_reset=200,
mode="reset",
params={
"asset_cfg": SceneEntityCfg("robot"),
"friction_distribution_params": (0.0, 0.1),
"operation": "add",
"distribution": "uniform",
},
)
@configclass
class RewardsCfg:
"""Reward terms for the MDP."""
end_effector_keypoint_tracking = RewTerm(
func=mdp.keypoint_command_error,
weight=-1.5,
params={
"asset_cfg": SceneEntityCfg("ee_frame"),
"command_name": "ee_pose",
"keypoint_scale": 0.45,
},
)
end_effector_keypoint_tracking_exp = RewTerm(
func=mdp.keypoint_command_error_exp,
weight=1.5,
params={
"asset_cfg": SceneEntityCfg("ee_frame"),
"command_name": "ee_pose",
"kp_exp_coeffs": [(50, 0.0001), (300, 0.0001), (5000, 0.0001)],
"kp_use_sum_of_exps": False,
"keypoint_scale": 0.45,
},
)
action_rate = RewTerm(func=mdp.action_rate_l2, weight=-0.005)
action = RewTerm(func=mdp.action_l2, weight=-0.005)
@configclass
class TerminationsCfg:
"""Termination terms for the MDP."""
time_out = DoneTerm(func=mdp.time_out, time_out=True)
##
# Environment configuration
##
@configclass
class ReachEnvCfg(ManagerBasedRLEnvCfg):
"""Configuration for the end-effector pose tracking environment that has been deployed on a real robot."""
# Scene settings
scene: SceneCfg = SceneCfg(num_envs=4096, env_spacing=2.5)
# Basic settings
observations: ObservationsCfg = ObservationsCfg()
actions: ActionsCfg = ActionsCfg()
commands: CommandsCfg = CommandsCfg()
# MDP settings
rewards: RewardsCfg = RewardsCfg()
terminations: TerminationsCfg = TerminationsCfg()
events: EventCfg = EventCfg()
def __post_init__(self):
"""Post initialization."""
# general settings
self.decimation = 2
self.sim.render_interval = self.decimation
self.episode_length_s = 12.0
self.viewer.eye = (3.5, 3.5, 3.5)
# simulation settings
self.sim.dt = 1.0 / 120.0
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment