Unverified Commit 64a97f20 authored by michaellin6's avatar michaellin6 Committed by GitHub

Changes Pink IK solver and null space computation to reduce runtime speed (#3904)

# Description
This PR optimizes the Pink IK controller solver by changing the qpsolver
to use **daqp**, and also optimizing the matrix pseudo inverse in Null
Space Posture Task. This achieves a **2x performance improvement** by
reducing runtime from 1.23 ms to 0.52 ms.

Perf experiments documented in third page here:

https://docs.google.com/document/d/1E9UiYUU-oCOIetUkqAIva8oK0NvdNkMeqP2gxmeqxNA/edit?tab=t.0#heading=h.snu74q2v857w

## Key Changes

1. **Optimized Pseudoinverse Computation**: Replaced `np.linalg.pinv()`
with a custom implementation using direct LAPACK/BLAS calls in the null
space projector calculation. The new approach uses Cholesky
factorization (`dpotrf`) and triangular solvers (`dpotrs`) for faster
computation.

2. **QP Solver Update**: Changed the Pink IK solver from `osqp` to
`daqp` for improved performance.

3. **New Dependency**: Added `daqp==0.7.2` to `setup.py` for Linux
platforms.

## Type of change

<!-- As you go through the list, delete the ones that are not
applicable. -->

- New feature (non-breaking change which adds functionality)

## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

<!--
As you go through the checklist above, you can mark something as done by
putting an x character in it

For example,
- [x] I have done this task
- [ ] I have not done this task
-->

---------
Co-authored-by: 's avatarKelly Guo <kellyg@nvidia.com>
parent 76f5b296
...@@ -292,6 +292,15 @@ Using the Mimic generated data we can now train a state-based BC agent for ``Isa ...@@ -292,6 +292,15 @@ Using the Mimic generated data we can now train a state-based BC agent for ``Isa
Visualizing results Visualizing results
^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
.. tip::
**Important: Testing Multiple Checkpoint Epochs**
When evaluating policy performance, it is common for different training epochs to yield significantly different results.
If you don't see the expected performance, **always test policies from various epochs** (not just the final checkpoint)
to find the best-performing model. Model performance can vary substantially across training, and the final epoch
is not always optimal.
By inferencing using the generated model, we can visualize the results of the policy: By inferencing using the generated model, we can visualize the results of the policy:
.. tab-set:: .. tab-set::
...@@ -315,6 +324,11 @@ By inferencing using the generated model, we can visualize the results of the po ...@@ -315,6 +324,11 @@ By inferencing using the generated model, we can visualize the results of the po
--device cpu --enable_cameras --task Isaac-Stack-Cube-Franka-IK-Rel-Visuomotor-v0 --num_rollouts 50 \ --device cpu --enable_cameras --task Isaac-Stack-Cube-Franka-IK-Rel-Visuomotor-v0 --num_rollouts 50 \
--checkpoint /PATH/TO/desired_model_checkpoint.pth --checkpoint /PATH/TO/desired_model_checkpoint.pth
.. tip::
**If you don't see expected performance results:** Test policies from multiple checkpoint epochs, not just the final one.
Policy performance can vary significantly across training epochs, and intermediate checkpoints often outperform the final model.
.. note:: .. note::
**Expected Success Rates and Timings for Franka Cube Stack Task** **Expected Success Rates and Timings for Franka Cube Stack Task**
...@@ -323,6 +337,7 @@ By inferencing using the generated model, we can visualize the results of the po ...@@ -323,6 +337,7 @@ By inferencing using the generated model, we can visualize the results of the po
* Data generation time: ~30 mins for state, ~4 hours for visuomotor (varies based on num envs the user runs) * Data generation time: ~30 mins for state, ~4 hours for visuomotor (varies based on num envs the user runs)
* BC RNN training time: 1000 epochs + ~30 mins (for state), 600 epochs + ~6 hours (for visuomotor) * BC RNN training time: 1000 epochs + ~30 mins (for state), 600 epochs + ~6 hours (for visuomotor)
* BC RNN policy success rate: ~40-60% (for both state + visuomotor) * BC RNN policy success rate: ~40-60% (for both state + visuomotor)
* **Recommendation:** Evaluate checkpoints from various epochs throughout training to identify the best-performing model
Demo 1: Data Generation and Policy Training for a Humanoid Robot Demo 1: Data Generation and Policy Training for a Humanoid Robot
...@@ -513,6 +528,11 @@ Visualize the results of the trained policy by running the following command, us ...@@ -513,6 +528,11 @@ Visualize the results of the trained policy by running the following command, us
.. note:: .. note::
Change the ``NORM_FACTOR`` in the above command with the values generated in the training step. Change the ``NORM_FACTOR`` in the above command with the values generated in the training step.
.. tip::
**If you don't see expected performance results:** It is critical to test policies from various checkpoint epochs.
Performance can vary significantly between epochs, and the best-performing checkpoint is often not the final one.
.. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/gr-1_steering_wheel_pick_place_policy.gif .. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/gr-1_steering_wheel_pick_place_policy.gif
:width: 100% :width: 100%
:align: center :align: center
...@@ -528,7 +548,7 @@ Visualize the results of the trained policy by running the following command, us ...@@ -528,7 +548,7 @@ Visualize the results of the trained policy by running the following command, us
* Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset. * Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset.
* Data generation success for this task is typically 65-80% over 1000 demonstrations, taking 18-40 minutes depending on GPU hardware and success rate (19 minutes on a RTX ADA 6000 @ 80% success rate). * Data generation success for this task is typically 65-80% over 1000 demonstrations, taking 18-40 minutes depending on GPU hardware and success rate (19 minutes on a RTX ADA 6000 @ 80% success rate).
* Behavior Cloning (BC) policy success is typically 75-86% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 2000 epochs (default), depending on demonstration quality. Training takes approximately 29 minutes on a RTX ADA 6000. * Behavior Cloning (BC) policy success is typically 75-86% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 2000 epochs (default), depending on demonstration quality. Training takes approximately 29 minutes on a RTX ADA 6000.
* Recommendation: Train for 2000 epochs with 1000 generated demonstrations, and evaluate multiple checkpoints saved between the 1500th and 2000th epochs to select the best-performing policy. * **Recommendation:** Train for 2000 epochs with 1000 generated demonstrations, and **evaluate multiple checkpoints saved between the 1000th and 2000th epochs** to select the best-performing policy. Testing various epochs is essential for finding optimal performance.
Demo 2: Data Generation and Policy Training for Humanoid Robot Locomanipulation with Unitree G1 Demo 2: Data Generation and Policy Training for Humanoid Robot Locomanipulation with Unitree G1
...@@ -642,6 +662,11 @@ Visualize the trained policy performance: ...@@ -642,6 +662,11 @@ Visualize the trained policy performance:
.. note:: .. note::
Change the ``NORM_FACTOR`` in the above command with the values generated in the training step. Change the ``NORM_FACTOR`` in the above command with the values generated in the training step.
.. tip::
**If you don't see expected performance results:** Always test policies from various checkpoint epochs.
Different epochs can produce significantly different results, so evaluate multiple checkpoints to find the optimal model.
.. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/locomanipulation-g-1_steering_wheel_pick_place.gif .. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/locomanipulation-g-1_steering_wheel_pick_place.gif
:width: 100% :width: 100%
:align: center :align: center
...@@ -657,7 +682,7 @@ Visualize the trained policy performance: ...@@ -657,7 +682,7 @@ Visualize the trained policy performance:
* Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset. * Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset.
* Data generation success for this task is typically 65-82% over 1000 demonstrations, taking 18-40 minutes depending on GPU hardware and success rate (18 minutes on a RTX ADA 6000 @ 82% success rate). * Data generation success for this task is typically 65-82% over 1000 demonstrations, taking 18-40 minutes depending on GPU hardware and success rate (18 minutes on a RTX ADA 6000 @ 82% success rate).
* Behavior Cloning (BC) policy success is typically 75-85% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 2000 epochs (default), depending on demonstration quality. Training takes approximately 40 minutes on a RTX ADA 6000. * Behavior Cloning (BC) policy success is typically 75-85% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 2000 epochs (default), depending on demonstration quality. Training takes approximately 40 minutes on a RTX ADA 6000.
* Recommendation: Train for 2000 epochs with 1000 generated demonstrations, and evaluate multiple checkpoints saved between the 1500th and 2000th epochs to select the best-performing policy. * **Recommendation:** Train for 2000 epochs with 1000 generated demonstrations, and **evaluate multiple checkpoints saved between the 1000th and 2000th epochs** to select the best-performing policy. Testing various epochs is essential for finding optimal performance.
Generate the dataset with manipulation and point-to-point navigation Generate the dataset with manipulation and point-to-point navigation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...@@ -851,6 +876,11 @@ Visualize the results of the trained policy by running the following command, us ...@@ -851,6 +876,11 @@ Visualize the results of the trained policy by running the following command, us
.. note:: .. note::
Change the ``NORM_FACTOR`` in the above command with the values generated in the training step. Change the ``NORM_FACTOR`` in the above command with the values generated in the training step.
.. tip::
**If you don't see expected performance results:** Test policies from various checkpoint epochs, not just the final one.
Policy performance can vary substantially across training, and intermediate checkpoints often yield better results.
.. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/gr-1_nut_pouring_policy.gif .. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/gr-1_nut_pouring_policy.gif
:width: 100% :width: 100%
:align: center :align: center
...@@ -866,7 +896,7 @@ Visualize the results of the trained policy by running the following command, us ...@@ -866,7 +896,7 @@ Visualize the results of the trained policy by running the following command, us
* Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset. * Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset.
* Data generation for 1000 demonstrations takes approximately 10 hours on a RTX ADA 6000. * Data generation for 1000 demonstrations takes approximately 10 hours on a RTX ADA 6000.
* Behavior Cloning (BC) policy success is typically 50-60% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 600 epochs (default). Training takes approximately 15 hours on a RTX ADA 6000. * Behavior Cloning (BC) policy success is typically 50-60% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 600 epochs (default). Training takes approximately 15 hours on a RTX ADA 6000.
* Recommendation: Train for 600 epochs with 1000 generated demonstrations, and evaluate multiple checkpoints saved between the 300th and 600th epochs to select the best-performing policy. * **Recommendation:** Train for 600 epochs with 1000 generated demonstrations, and **evaluate multiple checkpoints saved between the 300th and 600th epochs** to select the best-performing policy. Testing various epochs is critical for achieving optimal performance.
.. _common-pitfalls-generating-data: .. _common-pitfalls-generating-data:
......
[package] [package]
# Note: Semantic Versioning is used: https://semver.org/ # Note: Semantic Versioning is used: https://semver.org/
version = "0.47.6" version = "0.47.7"
# Description # Description
title = "Isaac Lab framework for Robot Learning" title = "Isaac Lab framework for Robot Learning"
......
Changelog Changelog
--------- ---------
0.47.7 (2025-10-31)
~~~~~~~~~~~~~~~~~~~
Changed
^^^^^^^
* Changed Pink IK controller qpsolver from osqp to daqp.
* Changed Null Space matrix computation in Pink IK's Null Space Posture Task to a faster matrix pseudo inverse computation.
0.47.6 (2025-11-01) 0.47.6 (2025-11-01)
~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~
......
...@@ -4,6 +4,8 @@ ...@@ -4,6 +4,8 @@
# SPDX-License-Identifier: BSD-3-Clause # SPDX-License-Identifier: BSD-3-Clause
import numpy as np import numpy as np
import scipy.linalg.blas as blas
import scipy.linalg.lapack as lapack
import pinocchio as pin import pinocchio as pin
from pink.configuration import Configuration from pink.configuration import Configuration
...@@ -75,6 +77,9 @@ class NullSpacePostureTask(Task): ...@@ -75,6 +77,9 @@ class NullSpacePostureTask(Task):
""" """
# Regularization factor for pseudoinverse computation to ensure numerical stability
PSEUDOINVERSE_DAMPING_FACTOR: float = 1e-9
def __init__( def __init__(
self, self,
cost: float, cost: float,
...@@ -237,6 +242,30 @@ class NullSpacePostureTask(Task): ...@@ -237,6 +242,30 @@ class NullSpacePostureTask(Task):
J_combined = np.concatenate(J_frame_tasks, axis=0) J_combined = np.concatenate(J_frame_tasks, axis=0)
# Compute null space projector: N = I - J^+ * J # Compute null space projector: N = I - J^+ * J
N_combined = np.eye(J_combined.shape[1]) - np.linalg.pinv(J_combined) @ J_combined # Use fast pseudoinverse computation with direct LAPACK/BLAS calls
m, n = J_combined.shape
# Wide matrix (typical for robotics): use left pseudoinverse
# J^+ = J^T @ inv(J @ J^T + λ²I)
# This is faster because we invert an m×m matrix instead of n×n
# Compute J @ J^T using BLAS (faster than numpy)
JJT = blas.dgemm(1.0, J_combined, J_combined.T)
np.fill_diagonal(JJT, JJT.diagonal() + self.PSEUDOINVERSE_DAMPING_FACTOR**2)
# Use LAPACK's Cholesky factorization (dpotrf = Positive definite TRiangular Factorization)
L, info = lapack.dpotrf(JJT, lower=1, clean=False, overwrite_a=True)
if info != 0:
# Fallback if not positive definite: use numpy's pseudoinverse
J_pinv = np.linalg.pinv(J_combined)
return np.eye(n) - J_pinv @ J_combined
# Solve (J @ J^T + λ²I) @ X = J using LAPACK's triangular solver (dpotrs)
# This directly solves the system without computing the full inverse
X, _ = lapack.dpotrs(L, J_combined, lower=1)
# Compute null space projector: N = I - J^T @ X
N_combined = np.eye(n) - J_combined.T @ X
return N_combined return N_combined
...@@ -221,7 +221,7 @@ class PinkIKController: ...@@ -221,7 +221,7 @@ class PinkIKController:
self.pink_configuration, self.pink_configuration,
self.cfg.variable_input_tasks + self.cfg.fixed_input_tasks, self.cfg.variable_input_tasks + self.cfg.fixed_input_tasks,
dt, dt,
solver="osqp", solver="daqp",
safety_break=self.cfg.fail_on_joint_limit_violation, safety_break=self.cfg.fail_on_joint_limit_violation,
) )
joint_angle_changes = velocity * dt joint_angle_changes = velocity * dt
......
...@@ -52,6 +52,7 @@ SUPPORTED_ARCHS = "platform_machine in 'x86_64,AMD64'" ...@@ -52,6 +52,7 @@ SUPPORTED_ARCHS = "platform_machine in 'x86_64,AMD64'"
INSTALL_REQUIRES += [ INSTALL_REQUIRES += [
# required by isaaclab.isaaclab.controllers.pink_ik # required by isaaclab.isaaclab.controllers.pink_ik
f"pin-pink==3.1.0 ; platform_system == 'Linux' and ({SUPPORTED_ARCHS_ARM})", f"pin-pink==3.1.0 ; platform_system == 'Linux' and ({SUPPORTED_ARCHS_ARM})",
f"daqp==0.7.2 ; platform_system == 'Linux' and ({SUPPORTED_ARCHS_ARM})",
# required by isaaclab.devices.openxr.retargeters.humanoid.fourier.gr1_t2_dex_retargeting_utils # required by isaaclab.devices.openxr.retargeters.humanoid.fourier.gr1_t2_dex_retargeting_utils
f"dex-retargeting==0.4.6 ; platform_system == 'Linux' and ({SUPPORTED_ARCHS})", f"dex-retargeting==0.4.6 ; platform_system == 'Linux' and ({SUPPORTED_ARCHS})",
] ]
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment