ooctipus authored
# Description

This PR provides a remake and extension of the original kuka-allegro-reorientation environment implemented in the paper:

DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training (https://arxiv.org/abs/2305.12127)
[Aleksei Petrenko](https://arxiv.org/search/cs?searchtype=author&query=Petrenko,+A), [Arthur Allshire](https://arxiv.org/search/cs?searchtype=author&query=Allshire,+A), [Gavriel State](https://arxiv.org/search/cs?searchtype=author&query=State,+G), [Ankur Handa](https://arxiv.org/search/cs?searchtype=author&query=Handa,+A), [Viktor Makoviychuk](https://arxiv.org/search/cs?searchtype=author&query=Makoviychuk,+V)

and another environment, kuka-allegro-lift, implemented in the paper:

Visuomotor Policies to Grasp Anything with Dexterous Hands (https://dextrah-rgb.github.io/)
[Ritvik Singh](https://www.ritvik-singh.com/), [Arthur Allshire](https://allshire.org/), [Ankur Handa](https://ankurhanda.github.io/), [Nathan Ratliff](https://www.nathanratliff.com/), [Karl Van Wyk](https://scholar.google.com/citations?user=TCYAoF8AAAAJ&hl=en)

Though this is a remake, it ends up differing quite a lot from the originals in environment details, for reasons such as:

1. Simplifying the reward structure
2. Unifying the environment implementation
3. Standardizing the MDP
4. Utilizing the manager-based API

In my opinion, these changes make the environments more accessible and easier to study, analyze, and extend. For example, you can train a lift policy first and then continue training from that checkpoint in the reorientation environment, since the two environments share the same observation space.
: ))

It is best to consider this a very careful re-interpretation, rather than an exact port, of these environments to IsaacLab.

Here are the training curves if you just train with:

`./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Dexsuite-Kuka-Allegro-Lift-v0 --num_envs 8192 --headless`

`./isaaclab.sh -p -m torch.distributed.run --nnodes=1 --nproc_per_node=4 scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Dexsuite-Kuka-Allegro-Reorient-v0 --num_envs 40960 --headless --distributed`

- lift training: ~4 hours
- reorientation training: ~2 days

Note that reorientation requires an order of magnitude more data and time to converge than lift under an almost identical setup.

Training curve (screen capture from Wandb), reward. Cyan: Reorient, Purple: Lift

<img width="1487" height="780" alt="Screenshot from 2025-09-07 22-58-13" src="https://github.com/user-attachments/assets/bfa911de-4fee-4c0d-b39c-e9c33fae28f4" />

Video results:

- lift
- reorient

Memo: I really enjoyed working on this remake, and I hope that whoever plans to play with and extend it finds it as helpful and joyful as I did. I will be very excited to see what you come up with : ))

Octi

CAUTION: Do Not Merge until the asset is uploaded to S3 bucket!

Fixes # (issue)

<!-- As you go through the list, delete the ones that are not applicable. -->

- New feature (non-breaking change which adds functionality)

## Screenshots

Please attach before and after screenshots of the change if applicable.

<!--
Example:

| Before | After |
| ------ | ----- |
| _gif/png before_ | _gif/png after_ |

To upload images to a PR -- simply drag and drop an image while in edit mode and it should upload the image directly. You can then paste that source into the above before/after sections.
-->

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

<!--
As you go through the checklist above, you can mark something as done by putting an x character in it

For example,
- [x] I have done this task
- [ ] I have not done this task
-->
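
As a rough sanity check on the "order of magnitude more data and time" note in the description, the quoted numbers can be combined in a small back-of-the-envelope sketch. It uses only the `--num_envs` values, the `--nproc_per_node=4` setting, and the approximate wall-clock times from the description. Treating "environment-hours" as a proxy for collected experience is an assumption, not something the PR states, and because it is unclear here whether `--num_envs` in the distributed run is a total or a per-process count, both readings are computed.

```python
# Back-of-the-envelope comparison using only figures quoted in the description.
# ASSUMPTION (not from the PR): num_envs x wall-clock hours ("env-hours") is a
# usable proxy for the amount of experience collected by each run.

LIFT_ENVS = 8192          # --num_envs for the lift run
LIFT_HOURS = 4            # "~4 hours"
REORIENT_ENVS = 40960     # --num_envs for the reorient run
REORIENT_HOURS = 2 * 24   # "~2 days"
REORIENT_PROCS = 4        # --nproc_per_node=4


def env_hours(num_envs: int, hours: float) -> float:
    """Return environment-hours, a crude proxy for experience collected."""
    return num_envs * hours


time_ratio = REORIENT_HOURS / LIFT_HOURS
env_ratio = REORIENT_ENVS / LIFT_ENVS

# Reading 1 (assumption): --num_envs is the total environment count.
data_ratio_total = env_hours(REORIENT_ENVS, REORIENT_HOURS) / env_hours(LIFT_ENVS, LIFT_HOURS)

# Reading 2 (assumption): --num_envs is per process, so multiply by the process count.
data_ratio_per_proc = data_ratio_total * REORIENT_PROCS

print(f"wall-clock ratio:           {time_ratio:.0f}x")
print(f"num_envs ratio:             {env_ratio:.0f}x")
print(f"env-hours ratio (total):    {data_ratio_total:.0f}x")
print(f"env-hours ratio (per-proc): {data_ratio_per_proc:.0f}x")
```

Under either reading, the estimate is consistent with the description's claim that reorientation needs at least an order of magnitude more time and data than lift. The exact sample counts would depend on per-environment step rates, which are not given here.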
| Name | Last commit | Last update |
|---|---|---|
| .. | ||
| _static | ||
| api | ||
| deployment | ||
| experimental-features | ||
| features | ||
| how-to | ||
| migration | ||
| overview | ||
| policy_deployment | ||
| refs | ||
| setup | ||
| tutorials |