Commit 8a58a23c authored by peterd-NV, committed by Kelly Guo

Updates Mimic APIs/configs/docs for future dexmimic compatibility (#216)

# Description

Doc and config changes from @karsten-nvidia:
- Add details on custom environments to the mimic docs.
- Update comments in the mimic configuration to make it easier to tell
what's important.
- Some minor cleanups in existing docs.
- Add a "common pitfalls" section to the docs to guide users toward
successful data generation/training.

Mimic API and config changes to support forward dexmimic compatibility:
- Use dictionaries of subtasks in mimic env config; keys are eef_names
- Mimic Env APIs now use a dictionary keyed by eef_names to enable multi-eef
support in the future
- Data generation code updated accordingly to use new Mimic env APIs
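
To illustrate the dictionary layout described above, here is a minimal sketch of subtask configs keyed by end-effector name. The `SubTaskSketch` dataclass, the `"franka"` key, and the object and signal names are illustrative assumptions, not the actual Isaac Lab Mimic classes or values; only the field names `object_ref` and `subtask_term_signal` come from the docs touched by this commit.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class SubTaskSketch:
    """Hypothetical stand-in for one Mimic subtask config entry."""

    object_ref: Optional[str]  # object the subtask is relative to, or None
    subtask_term_signal: Optional[str]  # ID of the signal marking completion


# Subtasks are grouped per end effector; the dictionary keys are eef names.
# All names below are made up for illustration.
subtask_configs: dict[str, list[SubTaskSketch]] = {
    "franka": [
        SubTaskSketch(object_ref="cube_2", subtask_term_signal="grasp_1"),
        SubTaskSketch(object_ref="cube_1", subtask_term_signal="stack_1"),
    ],
}

# Code that used to assume a single end effector can now iterate over the keys.
for eef_name, subtasks in subtask_configs.items():
    print(eef_name, [s.subtask_term_signal for s in subtasks])
```

With a single arm the dictionary has one entry, so existing single-eef tasks keep working while the layout leaves room for more end effectors later.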

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------
Signed-off-by: peterd-NV <peterd@nvidia.com>
Signed-off-by: Kelly Guo <kellyg@nvidia.com>
Signed-off-by: Kelly Guo <kellyguo123@hotmail.com>
Signed-off-by: Ashwin Varghese Kuruttukulam <123109010+ashwinvkNV@users.noreply.github.com>
Co-authored-by: CY Chen <cyc@nvidia.com>
Co-authored-by: oahmednv <oahmed@Nvidia.com>
Co-authored-by: Toni-SM <aserranomuno@nvidia.com>
Co-authored-by: Kelly Guo <kellyg@nvidia.com>
Co-authored-by: Kelly Guo <kellyguo123@hotmail.com>
Co-authored-by: rwiltz <165190220+rwiltz@users.noreply.github.com>
Co-authored-by: nv-cupright <92540563+nv-cupright@users.noreply.github.com>
Co-authored-by: Alexander Poddubny <143108850+nv-apoddubny@users.noreply.github.com>
Co-authored-by: chengronglai <chengrongl@nvidia.com>
Co-authored-by: David Hoeller <dhoeller@nvidia.com>
Co-authored-by: matthewtrepte <mtrepte@nvidia.com>
Co-authored-by: Ashwin Varghese Kuruttukulam <123109010+ashwinvkNV@users.noreply.github.com>
Co-authored-by: Karsten Patzwaldt <kpatzwaldt@nvidia.com>
parent 31f4e9cd
@@ -39,13 +39,6 @@ Datagen Info Pool
:members: :members:
:inherited-members: :inherited-members:
Selection Strategy
------------------
.. autoclass:: SelectionStrategy
:members:
:inherited-members:
Random Strategy Random Strategy
--------------- ---------------
......
@@ -68,120 +68,213 @@ Imitation Learning
Using the teleoperation devices, it is also possible to collect data for Using the teleoperation devices, it is also possible to collect data for
learning from demonstrations (LfD). For this, we provide scripts to collect data into the open HDF5 format. learning from demonstrations (LfD). For this, we provide scripts to collect data into the open HDF5 format.
.. note:: Collecting demonstrations
^^^^^^^^^^^^^^^^^^^^^^^^^
This tutorial assumes you have a ``datasets`` directory under the ``IsaacLab`` repo. Create this directory by running ``cd IsaacLab`` and ``mkdir datasets``.
1. Collect demonstrations with teleoperation for the environment To collect demonstrations with teleoperation for the environment ``Isaac-Stack-Cube-Franka-IK-Rel-v0``, use the following commands:
``Isaac-Stack-Cube-Franka-IK-Rel-v0``:
.. code:: bash .. code:: bash
# step a: collect data with a selected teleoperation device. Replace <teleop_device> with your preferred input device. # step a: create folder for datasets
mkdir -p datasets
# step b: collect data with a selected teleoperation device. Replace <teleop_device> with your preferred input device.
# Available options: spacemouse, keyboard # Available options: spacemouse, keyboard
./isaaclab.sh -p scripts/tools/record_demos.py --task Isaac-Stack-Cube-Franka-IK-Rel-v0 --teleop_device <teleop_device> --dataset_file ./datasets/dataset.hdf5 --num_demos 10 ./isaaclab.sh -p scripts/tools/record_demos.py --task Isaac-Stack-Cube-Franka-IK-Rel-v0 --teleop_device <teleop_device> --dataset_file ./datasets/dataset.hdf5 --num_demos 10
# step b: replay the collected dataset # step a: replay the collected dataset
./isaaclab.sh -p scripts/tools/replay_demos.py --task Isaac-Stack-Cube-Franka-IK-Rel-v0 --dataset_file ./datasets/dataset.hdf5 ./isaaclab.sh -p scripts/tools/replay_demos.py --task Isaac-Stack-Cube-Franka-IK-Rel-v0 --dataset_file ./datasets/dataset.hdf5
.. note:: .. note::
The order of the stacked cubes should be blue (bottom), red (middle), green (top). The order of the stacked cubes should be blue (bottom), red (middle), green (top).
About 10 successful demonstrations are required in order for the following steps to succeed. About 10 successful demonstrations are required in order for the following steps to succeed.
Here are some tips to perform demonstrations that lead to successful policy training:
* Keep demonstrations short. Shorter demonstrations mean fewer decisions for the policy, making training easier.
* Take a direct path. Do not follow along arbitrary axis, but move straight toward the goal.
* Do not pause. Perform smooth, continuous motions instead. It is not obvious for a policy why and when to pause, hence continuous motions are easier to learn.
If, while performing a demonstration, a mistake is made, or the current demonstration should not be recorded for some other reason, press the ``R`` key to discard the current demonstration, and reset to a new starting position.
Here are some tips to perform demonstrations that lead to successful policy training: .. note::
Non-determinism may be observed during replay as physics in IsaacLab are not deterministically reproducible when using ``env.reset``.
* Keep demonstrations short. Shorter demonstrations mean fewer decisions for the policy, making training easier.
* Take a direct path. Do not follow along arbitrary axis, but move straight toward the goal.
* Do not pause. Perform smooth, continuous motions instead. It is not obvious for a policy why and when to pause, hence continuous motions are easier to learn.
If, while performing a demonstration, a mistake is made, or the current demonstration should not be recorded for some other reason, press the ``R`` key to discard the current demonstration, and reset to a new starting position. Generating additional demonstrations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2. Generate additional demonstrations using Isaac Lab Mimic Additional demonstrations can be generated using Isaac Lab Mimic.
Isaac Lab Mimic is a feature in Isaac Lab that allows to generate additional demonstrations automatically, allowing a policy to learn successfully even from just a handful of manual demonstrations. Isaac Lab Mimic is a feature in Isaac Lab that allows generation of additional demonstrations automatically, allowing a policy to learn successfully even from just a handful of manual demonstrations.
In order to use Isaac Lab Mimic with the recorded dataset, first annotate the subtasks in the recording: In order to use Isaac Lab Mimic with the recorded dataset, first annotate the subtasks in the recording:
.. code:: bash .. code:: bash
./isaaclab.sh -p scripts/imitation_learning/isaaclab_mimic/annotate_demos.py --input_file ./datasets/dataset.hdf5 --output_file ./datasets/annotated_dataset.hdf5 --task Isaac-Stack-Cube-Franka-IK-Rel-Mimic-v0 --auto ./isaaclab.sh -p scripts/imitation_learning/isaaclab_mimic/annotate_demos.py --input_file ./datasets/dataset.hdf5 --output_file ./datasets/annotated_dataset.hdf5 --task Isaac-Stack-Cube-Franka-IK-Rel-Mimic-v0 --auto
Then, use Isaac Lab Mimic to generate some additional demonstrations: Then, use Isaac Lab Mimic to generate some additional demonstrations:
.. code:: bash .. code:: bash
./isaaclab.sh -p scripts/imitation_learning/isaaclab_mimic/generate_dataset.py --input_file ./datasets/annotated_dataset.hdf5 --output_file ./datasets/generated_dataset_small.hdf5 --num_envs 10 --generation_num_trials 10 ./isaaclab.sh -p scripts/imitation_learning/isaaclab_mimic/generate_dataset.py --input_file ./datasets/annotated_dataset.hdf5 --output_file ./datasets/generated_dataset_small.hdf5 --num_envs 10 --generation_num_trials 10
.. note:: .. note::
The output_file of the ``annotate_demos.py`` script is the input_file to the ``generate_dataset.py`` script The output_file of the ``annotate_demos.py`` script is the input_file to the ``generate_dataset.py`` script
.. note:: .. note::
Isaac Lab is designed to work with manipulators with grippers. The gripper commands in the demonstrations are extracted separately and temporally replayed during the generation of additional demonstrations. Isaac Lab is designed to work with manipulators with grippers. The gripper commands in the demonstrations are extracted separately and temporally replayed during the generation of additional demonstrations.
Inspect the output of generated data (filename: ``generated_dataset_small.hdf5``), and if satisfactory, generate the full dataset: Inspect the output of generated data (filename: ``generated_dataset_small.hdf5``), and if satisfactory, generate the full dataset:
.. code:: bash .. code:: bash
./isaaclab.sh -p scripts/imitation_learning/isaaclab_mimic/generate_dataset.py --input_file ./datasets/annotated_dataset.hdf5 --output_file ./datasets/generated_dataset.hdf5 --num_envs 10 --generation_num_trials 1000 --headless ./isaaclab.sh -p scripts/imitation_learning/isaaclab_mimic/generate_dataset.py --input_file ./datasets/annotated_dataset.hdf5 --output_file ./datasets/generated_dataset.hdf5 --num_envs 10 --generation_num_trials 1000 --headless
The number of demonstrations can be increased or decreased, 1000 demonstrations have been shown to provide good training results for this task. The number of demonstrations can be increased or decreased, 1000 demonstrations have been shown to provide good training results for this task.
Additionally, the number of environments in the ``--num_envs`` parameter can be adjusted to speed up data generation. The suggested number of 10 can be executed even on a laptop GPU. On a more powerful desktop machine, set it to 100 or higher for significant speedup of this step.
Additionally, the number of environments in the ``--num_envs`` parameter can be adjusted to speed up data generation. The suggested number of 10 can be executed even on a laptop GPU. On a more powerful desktop machine, set it to 100 or higher for significant speedup of this step. Robomimic setup
^^^^^^^^^^^^^^^
3. Setup robomimic for training a policy As an example, we will train a BC agent implemented in `Robomimic <https://robomimic.github.io/>`__ to train a policy. Any other framework or training method could be used.
As an example, we will train a BC agent implemented in `Robomimic <https://robomimic.github.io/>`__ to train a policy. Any other framework or training method could be used. To install the robomimic framework, use the following commands:
.. code:: bash .. code:: bash
# install the dependencies # install the dependencies
sudo apt install cmake build-essential sudo apt install cmake build-essential
# install python module (for robomimic) # install python module (for robomimic)
./isaaclab.sh -i robomimic ./isaaclab.sh -i robomimic
4. Train a BC agent for ``Isaac-Stack-Cube-Franka-IK-Rel-v0`` using the Mimic generated data: Training an agent
^^^^^^^^^^^^^^^^^
We can now train a BC agent for ``Isaac-Stack-Cube-Franka-IK-Rel-v0`` using the Mimic generated data:
.. code:: bash .. code:: bash
./isaaclab.sh -p scripts/imitation_learning/robomimic/train.py --task Isaac-Stack-Cube-Franka-IK-Rel-v0 --algo bc --dataset ./datasets/generated_dataset.hdf5 ./isaaclab.sh -p scripts/imitation_learning/robomimic/train.py --task Isaac-Stack-Cube-Franka-IK-Rel-v0 --algo bc --dataset ./datasets/generated_dataset.hdf5
By default, the training script will save a model checkpoint every 100 epochs. The trained models and logs will be saved to logs/robomimic/Isaac-Stack-Cube-Franka-IK-Rel-v0/bc By default, the training script will save a model checkpoint every 100 epochs. The trained models and logs will be saved to logs/robomimic/Isaac-Stack-Cube-Franka-IK-Rel-v0/bc
Visualizing results
^^^^^^^^^^^^^^^^^^^
By inferencing using the generated model, we can visualize the results of the policy in the same environment:
.. code:: bash
./isaaclab.sh -p scripts/imitation_learning/robomimic/play.py --task Isaac-Stack-Cube-Franka-IK-Rel-v0 --num_rollouts 50 --checkpoint /PATH/TO/desired_model_checkpoint.pth
Common Pitfalls when Generating Data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Demonstrations are too long:**
* Longer time horizon is harder to learn for a policy
* Start close to the first object and minimize motions
**Demonstrations are not smooth:**
* Irregular motion is hard for policy to decipher
* Better teleop devices result in better data (e.g. SpaceMouse is better than Keyboard)
5. Play the learned model to visualize results: **Pauses in demonstrations:**
.. code:: bash * Pauses are difficult to learn
* Keep the human motions smooth and fluid
**Excessive number of subtasks:**
* Minimize the number of defined subtasks for completing a given task
* Fewer subtasks result in less stitching of trajectories, yielding a higher data generation success rate
**Lack of action noise:**
* Action noise makes policies more robust
**Recording cropped too tight:**
* If recording stops on the frame the success term triggers, it may not re-trigger during replay
* Allow for some buffer at the end of recording
**Non-deterministic replay:**
* Physics in IsaacLab are not deterministically reproducible when using ``env.reset`` so demonstrations may fail on replay
* Collect more human demos than needed and use the ones that succeed during annotation
* Every demo in an Isaac Lab Mimic generated HDF5 file is a successful demo and can be used for training (even if non-determinism causes failure when replayed)
./isaaclab.sh -p scripts/imitation_learning/robomimic/play.py --task Isaac-Stack-Cube-Franka-IK-Rel-v0 --checkpoint /PATH/TO/desired_model_checkpoint.pth
Creating Your Own Isaac Lab Mimic Compatible Environments Creating Your Own Isaac Lab Mimic Compatible Environments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to use Isaac Lab Mimic to generate additional demonstrations automatically with an existing Isaac Lab environment, the environment How it works
needs to be made "Mimic compatible" by implementing additional functions which are used during data generation. ^^^^^^^^^^^^
Isaac Lab Mimic works by splitting the input demonstrations into subtasks. Subtasks are user-defined segments in the demonstrations that are common to all demonstrations. Examples of subtasks are "grasp an object", "move end effector to some pre-defined position", "release object", etc. Note that most subtasks are defined with respect to some object that the robot interacts with.
Subtasks need to be defined, and then annotated for each input demonstration. Annotation can either happen algorithmically by defining heuristics for subtask detection, as was done in the example above, or it can be done manually.
With subtasks defined and annotated, Isaac Lab Mimic utilizes a small number of helper methods to then transform the subtask segments, and generate new demonstrations by stitching them together to match the new task at hand.
For each candidate demonstration generated this way, Isaac Lab Mimic uses a boolean success criterion to determine whether the demonstration succeeded in performing the task, and if so, adds it to the output dataset. The success rate of candidate demonstrations can be as high as 70% in simple cases, and as low as <1%, depending on the difficulty of the task and the complexity of the robot itself.
Configuration and subtask definition
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Subtasks, among other configuration settings for Isaac Lab Mimic, are defined in a Mimic compatible environment configuration class that is created by extending the existing environment config with additional Mimic required parameters.
Mimic compatible environments are derived from the :class:`~isaaclab.envs.ManagerBasedRLMimicEnv` base class and must implement the following functions: All Mimic required config parameters are specified in the :class:`~isaaclab.envs.MimicEnvCfg` class.
The config class :class:`~isaaclab_mimic.envs.FrankaCubeStackIKRelMimicEnvCfg` serves as an example of creating a Mimic compatible environment config class for the Franka stacking task that was used in the examples above.
The ``DataGenConfig`` member contains various parameters that influence how data is generated. It is initially sufficient to just set the ``name`` parameter, and revise the rest later.
Subtasks are a list of ``SubTaskConfig`` objects, of which the most important members are:
* ``object_ref`` is the object that is being interacted with. This will be used to adjust motions relative to this object during data generation. Can be ``None`` if the current subtask does not involve any object.
* ``subtask_term_signal`` is the ID of the signal indicating whether the subtask is active or not.
Subtask annotation
^^^^^^^^^^^^^^^^^^
Once the subtasks are defined, they need to be annotated in the source data. There are two methods to annotate source demonstrations for subtask boundaries: Manual annotation or using heuristics.
It is often easiest to perform manual annotations, since the number of input demonstrations is usually very small. To perform manual annotations, use the ``annotate_demos.py`` script without the ``--auto`` flag. Then press ``B`` to pause, ``N`` to continue, and ``S`` to annotate a subtask boundary.
For more accurate boundaries, or to speed up repeated processing of a given task for experiments, heuristics can be implemented to perform the same task. Heuristics are observations in the environment. An example of how to add subtask terms can be found in ``source/isaaclab_tasks/isaaclab_tasks/manager_based/manipulation/stack/stack_env_cfg.py``, where they are added as an observation group called ``SubtaskCfg``. This example uses prebuilt heuristics, but custom heuristics are easily implemented.
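
As a rough illustration of what a manually annotated boundary becomes, the sketch below turns a boundary step index into a per-step termination signal that is false until the subtask is complete and true afterwards. The helper name ``boundary_to_signal`` is hypothetical and plain Python lists are used to keep the example dependency-free; the ``annotate_demos.py`` script in this commit builds the equivalent signal as a boolean ``torch`` tensor.

```python
from __future__ import annotations


def boundary_to_signal(num_steps: int, boundary_index: int) -> list[bool]:
    """Return a per-step flag: False before the boundary, True from it onward."""
    if not 0 <= boundary_index <= num_steps:
        raise ValueError("boundary_index must lie within the episode")
    return [step >= boundary_index for step in range(num_steps)]


# An episode of 6 steps where the annotator marked the boundary at step 4.
signal = boundary_to_signal(num_steps=6, boundary_index=4)
print(signal)  # [False, False, False, False, True, True]
```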
Helpers for demonstration generation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Helpers needed for Isaac Lab Mimic are defined in the environment. All tasks that are to be used with Isaac Lab Mimic are derived from the :class:`~isaaclab.envs.ManagerBasedRLMimicEnv` base class, and must implement the following functions:
* ``get_robot_eef_pose``: Returns the current robot end effector pose in the same frame as used by the robot end effector controller. * ``get_robot_eef_pose``: Returns the current robot end effector pose in the same frame as used by the robot end effector controller.
* ``target_eef_pose_to_action``: Takes a target pose for the end effector controller and returns an action which achieves the target pose. * ``target_eef_pose_to_action``: Takes a target pose and a gripper action for the end effector controller and returns an action which achieves the target pose.
* ``action_to_target_eef_pos``: Takes an action and returns a target pose for the end effector controller. * ``action_to_target_eef_pose``: Takes an action and returns a target pose for the end effector controller.
* ``action_to_gripper_action``: Takes an action and returns the gripper actuation part of the action. * ``actions_to_gripper_actions``: Takes a sequence of actions and returns the gripper actuation part of the actions.
* ``get_object_poses``: Returns the pose of each object in the scene that is used for data generation. * ``get_object_poses``: Returns the pose of each object in the scene that is used for data generation.
* ``get_subtask_term_signals``: Returns a dictionary of binary flags for each subtask in a task. The flag of 1 is set when the subtask has been completed and 0 otherwise. * ``get_subtask_term_signals``: Returns a dictionary of binary flags for each subtask in a task. The flag of true is set when the subtask has been completed and false otherwise.
The class :class:`~isaaclab_mimic.envs.FrankaCubeStackIKRelMimicEnv` shows an example of creating a Mimic compatible environment from an existing Isaac Lab environment. The class :class:`~isaaclab_mimic.envs.FrankaCubeStackIKRelMimicEnv` shows an example of creating a Mimic compatible environment from an existing Isaac Lab environment.
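
The following is a minimal, self-contained sketch of the pattern, not the real Isaac Lab classes. ``MimicEnvSketch`` and ``StackEnvSketch`` are hypothetical stand-ins, the ``"franka"`` name is an assumption, and poses are plain nested lists instead of tensors. It shows a base class whose helpers raise ``NotImplementedError`` until overridden, a subclass that implements one helper keyed by end-effector name, and a check for whether a helper was overridden, similar in spirit to the check the scripts in this commit perform before running.

```python
class MimicEnvSketch:
    """Hypothetical stand-in for the Mimic base class; helpers raise until overridden."""

    def get_robot_eef_pose(self, eef_name, env_ids=None):
        raise NotImplementedError

    def get_subtask_term_signals(self):
        raise NotImplementedError


class StackEnvSketch(MimicEnvSketch):
    """Toy environment with a single end effector."""

    def get_robot_eef_pose(self, eef_name, env_ids=None):
        if eef_name != "franka":
            raise KeyError(eef_name)
        # 4x4 identity matrix as a placeholder for a real end effector pose.
        return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def is_overridden(env, method_name, base_cls):
    """True if env's class provides its own implementation of method_name."""
    return getattr(env, method_name).__func__ is not getattr(base_cls, method_name)


env = StackEnvSketch()
print(is_overridden(env, "get_robot_eef_pose", MimicEnvSketch))        # True
print(is_overridden(env, "get_subtask_term_signals", MimicEnvSketch))  # False
```

Checking for an override up front lets a script fail with a clear message before any data is processed, rather than hitting ``NotImplementedError`` partway through a run.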
A Mimic compatible environment config class must also be created by extending the existing environment config with additional Mimic required parameters. Registering the environment
All Mimic required config parameters are specified in the :class:`~isaaclab.envs.MimicEnvCfg` class. ^^^^^^^^^^^^^^^^^^^^^^^^^^^
The config class :class:`~isaaclab_mimic.envs.FrankaCubeStackIKRelMimicEnvCfg` shows an example of creating a Mimic compatible environment config class for the Franka stacking task.
Once both Mimic compatible environment and environment config classes have been created, a new Mimic compatible environment can be registered using ``gym.register``. For the Franka stacking task in the examples above, the Mimic environment is registered as ``Isaac-Stack-Cube-Franka-IK-Rel-Mimic-v0``.
Once both Mimic compatible environment and environment config classes have been created, a new Mimic compatible environment can be registered using ``gym.register`` and used The registered environment is now ready to be used with Isaac Lab Mimic.
with Isaac Lab Mimic data generation. For the Franka stacking task in the examples above, the Mimic environment is registered as ``Isaac-Stack-Cube-Franka-IK-Rel-Mimic-v0``.
@@ -55,6 +55,7 @@ import isaaclab_mimic.envs # noqa: F401
# Only enables inputs if this script is NOT headless mode # Only enables inputs if this script is NOT headless mode
if not args_cli.headless and not os.environ.get("HEADLESS", 0): if not args_cli.headless and not os.environ.get("HEADLESS", 0):
from isaaclab.devices import Se3Keyboard from isaaclab.devices import Se3Keyboard
from isaaclab.envs import ManagerBasedRLMimicEnv
from isaaclab.envs.mdp.recorders.recorders_cfg import ActionStateRecorderManagerCfg from isaaclab.envs.mdp.recorders.recorders_cfg import ActionStateRecorderManagerCfg
from isaaclab.managers import RecorderTerm, RecorderTermCfg from isaaclab.managers import RecorderTerm, RecorderTermCfg
from isaaclab.utils import configclass from isaaclab.utils import configclass
@@ -88,11 +89,16 @@ class PreStepDatagenInfoRecorder(RecorderTerm):
"""Recorder term that records the datagen info data in each step.""" """Recorder term that records the datagen info data in each step."""
def record_pre_step(self): def record_pre_step(self):
eef_pose_dict = {}
for eef_name in self._env.cfg.subtask_configs.keys():
eef_pose_dict[eef_name] = self._env.get_robot_eef_pose(eef_name)
datagen_info = { datagen_info = {
"object_pose": self._env.scene.get_state(is_relative=True)["rigid_object"], "object_pose": self._env.get_object_poses(),
"target_eef_pose": self._env.action_to_target_eef_pos(self._env.action_manager.action), "eef_pose": eef_pose_dict,
"target_eef_pose": self._env.action_to_target_eef_pose(self._env.action_manager.action),
} }
return "obs", datagen_info return "obs/datagen_info", datagen_info
@configclass @configclass
@@ -106,7 +112,7 @@ class PreStepSubtaskTermsObservationsRecorder(RecorderTerm):
"""Recorder term that records the subtask completion observations in each step.""" """Recorder term that records the subtask completion observations in each step."""
def record_pre_step(self): def record_pre_step(self):
return "obs/subtask_term_signals", self._env.obs_buf["subtask_terms"] return "obs/datagen_info/subtask_term_signals", self._env.get_subtask_term_signals()
@configclass @configclass
@@ -164,12 +170,26 @@ def main():
# Set up recorder terms for mimic annotations # Set up recorder terms for mimic annotations
env_cfg.env_name = args_cli.task env_cfg.env_name = args_cli.task
env_cfg.recorders: MimicRecorderManagerCfg = MimicRecorderManagerCfg() env_cfg.recorders: MimicRecorderManagerCfg = MimicRecorderManagerCfg()
if not args_cli.auto:
# disable subtask term signals recorder term if in manual mode
env_cfg.recorders.record_pre_step_subtask_term_signals = None
env_cfg.recorders.dataset_export_dir_path = output_dir env_cfg.recorders.dataset_export_dir_path = output_dir
env_cfg.recorders.dataset_filename = output_file_name env_cfg.recorders.dataset_filename = output_file_name
# create environment from loaded config # create environment from loaded config
env = gym.make(args_cli.task, cfg=env_cfg) env = gym.make(args_cli.task, cfg=env_cfg)
if not isinstance(env.unwrapped, ManagerBasedRLMimicEnv):
raise ValueError("The environment should be derived from ManagerBasedRLMimicEnv")
if args_cli.auto:
# check if the mimic API env.unwrapped.get_subtask_term_signals() is implemented
if env.unwrapped.get_subtask_term_signals.__func__ is ManagerBasedRLMimicEnv.get_subtask_term_signals:
raise NotImplementedError(
"The environment does not implement the get_subtask_term_signals method required "
"to run automatic annotations."
)
# reset environment # reset environment
env.reset() env.reset()
@@ -219,13 +239,12 @@ def main():
f" to number of subtasks {len(args_cli.signals)}" f" to number of subtasks {len(args_cli.signals)}"
) )
annotated_episode = env.unwrapped.recorder_manager.get_episode(0) annotated_episode = env.unwrapped.recorder_manager.get_episode(0)
del annotated_episode.data["obs"]["subtask_term_signals"]
for subtask_index in range(len(args_cli.signals)): for subtask_index in range(len(args_cli.signals)):
# subtask termination signal is false until subtask is complete, and true afterwards # subtask termination signal is false until subtask is complete, and true afterwards
subtask_signals = torch.ones(len(actions), dtype=torch.bool) subtask_signals = torch.ones(len(actions), dtype=torch.bool)
subtask_signals[: subtask_indices[subtask_index]] = False subtask_signals[: subtask_indices[subtask_index]] = False
annotated_episode.add( annotated_episode.add(
f"obs/subtask_term_signals/{args_cli.signals[subtask_index]}", subtask_signals f"obs/datagen_info/subtask_term_signals/{args_cli.signals[subtask_index]}", subtask_signals
) )
# set success to the recorded episode data and export to file # set success to the recorded episode data and export to file
......
@@ -85,6 +85,7 @@ from isaaclab_mimic.datagen.data_generator import DataGenerator
from isaaclab_mimic.datagen.datagen_info_pool import DataGenInfoPool from isaaclab_mimic.datagen.datagen_info_pool import DataGenInfoPool
from isaaclab.devices import Se3Keyboard, Se3SpaceMouse from isaaclab.devices import Se3Keyboard, Se3SpaceMouse
from isaaclab.envs import ManagerBasedRLMimicEnv
from isaaclab.envs.mdp.recorders.recorders_cfg import ActionStateRecorderManagerCfg from isaaclab.envs.mdp.recorders.recorders_cfg import ActionStateRecorderManagerCfg
from isaaclab.managers import DatasetExportMode, RecorderTerm, RecorderTermCfg from isaaclab.managers import DatasetExportMode, RecorderTerm, RecorderTermCfg
from isaaclab.utils import configclass from isaaclab.utils import configclass
@@ -104,11 +105,16 @@ class PreStepDatagenInfoRecorder(RecorderTerm):
"""Recorder term that records the datagen info data in each step.""" """Recorder term that records the datagen info data in each step."""
def record_pre_step(self): def record_pre_step(self):
eef_pose_dict = {}
for eef_name in self._env.cfg.subtask_configs.keys():
eef_pose_dict[eef_name] = self._env.get_robot_eef_pose(eef_name)
datagen_info = { datagen_info = {
"object_pose": self._env.scene.get_state(is_relative=True)["rigid_object"], "object_pose": self._env.get_object_poses(),
"target_eef_pose": self._env.action_to_target_eef_pos(self._env.action_manager.action), "eef_pose": eef_pose_dict,
"target_eef_pose": self._env.action_to_target_eef_pose(self._env.action_manager.action),
} }
return "obs", datagen_info return "obs/datagen_info", datagen_info
@configclass @configclass
@@ -122,7 +128,7 @@ class PreStepSubtaskTermsObservationsRecorder(RecorderTerm):
"""Recorder term that records the subtask completion observations in each step.""" """Recorder term that records the subtask completion observations in each step."""
def record_pre_step(self): def record_pre_step(self):
return "obs/subtask_term_signals", self._env.obs_buf["subtask_terms"] return "obs/datagen_info/subtask_term_signals", self._env.get_subtask_term_signals()
@configclass @configclass
@@ -397,6 +403,15 @@ def main():
# create environment # create environment
env = gym.make(env_name, cfg=env_cfg) env = gym.make(env_name, cfg=env_cfg)
if not isinstance(env.unwrapped, ManagerBasedRLMimicEnv):
raise ValueError("The environment should be derived from ManagerBasedRLMimicEnv")
# check if the mimic API env.unwrapped.get_subtask_term_signals() is implemented
if env.unwrapped.get_subtask_term_signals.__func__ is ManagerBasedRLMimicEnv.get_subtask_term_signals:
raise NotImplementedError(
"The environment does not implement the get_subtask_term_signals method required to run this script."
)
# set seed for generation # set seed for generation
random.seed(env.unwrapped.cfg.datagen_config.seed) random.seed(env.unwrapped.cfg.datagen_config.seed)
np.random.seed(env.unwrapped.cfg.datagen_config.seed) np.random.seed(env.unwrapped.cfg.datagen_config.seed)
......
@@ -3,6 +3,9 @@
# #
# SPDX-License-Identifier: BSD-3-Clause # SPDX-License-Identifier: BSD-3-Clause
import torch
from collections.abc import Sequence
import isaaclab.utils.math as PoseUtils import isaaclab.utils.math as PoseUtils
from isaaclab.envs import ManagerBasedRLEnv from isaaclab.envs import ManagerBasedRLEnv
@@ -30,79 +33,98 @@ class ManagerBasedRLMimicEnv(ManagerBasedRLEnv):
- Dataset Versatility: The synthetic data retains a quality that compares favorably with additional human demos. - Dataset Versatility: The synthetic data retains a quality that compares favorably with additional human demos.
""" """
def get_robot_eef_pose(self, env_ind=0): def get_robot_eef_pose(self, eef_name: str, env_ids: Sequence[int] | None = None) -> torch.Tensor:
""" """
Get current robot end effector pose. Should be the same frame as used by the robot end-effector controller. Get current robot end effector pose. Should be the same frame as used by the robot end-effector controller.
Args:
eef_name: Name of the end effector.
env_ids: Environment indices to get the pose for. If None, all envs are considered.
Returns: Returns:
pose (torch.Tensor): 4x4 eef pose matrix A torch.Tensor eef pose matrix. Shape is (len(env_ids), 4, 4)
""" """
raise NotImplementedError raise NotImplementedError
def target_eef_pose_to_action(self, target_eef_pose, relative=True, env_ind=0): def target_eef_pose_to_action(
self, target_eef_pose_dict: dict, gripper_action_dict: dict, noise: float | None = None, env_id: int = 0
) -> torch.Tensor:
""" """
Takes a target pose for the end effector controller and returns an action Takes a target pose and gripper action for the end effector controller and returns an action
to try and achieve that target pose. (usually a normalized delta pose action) to try and achieve that target pose.
Noise is added to the target pose action if specified.
Args: Args:
target_eef_pose (torch.Tensor): 4x4 target eef pose target_eef_pose_dict: Dictionary of 4x4 target eef pose for each end-effector.
relative (bool): if True, use relative pose actions, else absolute pose actions gripper_action_dict: Dictionary of gripper actions for each end-effector.
noise: Noise to add to the action. If None, no noise is added.
env_id: Environment index to compute the action for.
Returns: Returns:
action (torch.Tensor): action compatible with env.step (minus gripper actuation) An action torch.Tensor that's compatible with env.step().
""" """
raise NotImplementedError raise NotImplementedError
def action_to_target_eef_pos(self, action, relative=True, env_ind=0): def action_to_target_eef_pose(self, action: torch.Tensor) -> dict[str, torch.Tensor]:
""" """
Converts action (compatible with env.step) to a target pose for the end effector controller. Converts action (compatible with env.step) to a target pose for the end effector controller.
Inverse of @target_eef_pose_to_action. Usually used to infer a sequence of target controller poses Inverse of @target_eef_pose_to_action. Usually used to infer a sequence of target controller poses
from a demonstration trajectory using the recorded actions. from a demonstration trajectory using the recorded actions.
Args: Args:
action (torch.Tensor): environment action action: Environment action. Shape is (num_envs, action_dim).
relative (bool): if True, use relative pose actions, else absolute pose actions
Returns: Returns:
target_eef_pose (torch.Tensor): 4x4 target eef pose that @action corresponds to A dictionary of eef pose torch.Tensor that @action corresponds to.
""" """
raise NotImplementedError raise NotImplementedError
def action_to_gripper_action(self, action): def actions_to_gripper_actions(self, actions: torch.Tensor) -> dict[str, torch.Tensor]:
""" """
Extracts the gripper actuation part of an action (compatible with env.step). Extracts the gripper actuation part from a sequence of env actions (compatible with env.step).
Args: Args:
action (torch.Tensor): environment action actions: environment actions. The shape is (num_envs, num steps in a demo, action_dim).
Returns: Returns:
gripper_action (torch.Tensor): subset of environment action for gripper actuation A dictionary of torch.Tensor gripper actions. Key to each dict is an eef_name.
""" """
raise NotImplementedError raise NotImplementedError
def get_object_poses(self, env_ind=0): def get_object_poses(self, env_ids: Sequence[int] | None = None):
""" """
Gets the pose of each object relevant to Isaac Lab Mimic data generation in the current scene. Gets the pose of each object relevant to Isaac Lab Mimic data generation in the current scene.
Args:
env_ids: Environment indices to get the pose for. If None, all envs are considered.
Returns: Returns:
object_poses (dict): dictionary that maps object name (str) to object pose matrix (4x4 torch.Tensor) A dictionary that maps object names to object pose matrix (4x4 torch.Tensor)
""" """
if env_ids is None:
env_ids = slice(None)
rigid_object_states = self.scene.get_state(is_relative=True)["rigid_object"] rigid_object_states = self.scene.get_state(is_relative=True)["rigid_object"]
object_pose_matrix = dict() object_pose_matrix = dict()
for obj_name, obj_state in rigid_object_states.items(): for obj_name, obj_state in rigid_object_states.items():
object_pose_matrix[obj_name] = PoseUtils.make_pose( object_pose_matrix[obj_name] = PoseUtils.make_pose(
obj_state["root_pose"][env_ind, :3], PoseUtils.matrix_from_quat(obj_state["root_pose"][env_ind, 3:7]) obj_state["root_pose"][env_ids, :3], PoseUtils.matrix_from_quat(obj_state["root_pose"][env_ids, 3:7])
) )
return object_pose_matrix return object_pose_matrix
def get_subtask_term_signals(self, env_ind=0): def get_subtask_term_signals(self, env_ids: Sequence[int] | None = None) -> dict[str, torch.Tensor]:
""" """
Gets a dictionary of binary flags for each subtask in a task. The flag is 1 Gets a dictionary of termination signal flags for each subtask in a task. The flag is 1
when the subtask has been completed and 0 otherwise. Isaac Lab Mimic only uses this when the subtask has been completed and 0 otherwise. The implementation of this method is
when parsing source demonstrations at the start of data generation, and it only required if intending to enable automatic subtask term signal annotation when running the
uses the first 0 -> 1 transition in this signal to detect the end of a subtask. dataset annotation tool. This method can be kept unimplemented if intending to use manual
subtask term signal annotation.
Args:
env_ids: Environment indices to get the termination signals for. If None, all envs are considered.
Returns: Returns:
subtask_term_signals (dict): dictionary that maps subtask name to termination flag (0 or 1) A dictionary of termination signal flags (False or True) for each subtask.
""" """
raise NotImplementedError raise NotImplementedError
......
...@@ -17,43 +17,135 @@ from isaaclab.utils import configclass ...@@ -17,43 +17,135 @@ from isaaclab.utils import configclass
class DataGenConfig: class DataGenConfig:
"""Configuration settings for data generation processes within the Isaac Lab Mimic environment.""" """Configuration settings for data generation processes within the Isaac Lab Mimic environment."""
name: str = "demo" # The name of the data generation, default is "demo" # The name of the data generation, default is "demo"
source_dataset_path: str = None # Path to the source dataset for mimic generation name: str = "demo"
generation_path: str = None # Path where the generated data will be saved
generation_guarantee: bool = False # Whether to guarantee generation of data (e.g., retry until successful) # If set to True, generation will be retried until
generation_keep_failed: bool = True # Whether to keep failed generation trials # generation_num_trials successful demos have been generated.
generation_num_trials: int = 10 # Number of trials to be generated # If set to False, generation will stop after generation_num_trials,
generation_select_src_per_subtask: bool = False # Whether to select source data per subtask # independent of whether they were all successful or not.
generation_transform_first_robot_pose: bool = False # Whether to transform the first robot pose during generation generation_guarantee: bool = True
generation_interpolate_from_last_target_pose: bool = True # Whether to interpolate from last target pose
task_name: str = None # Name of the task being configured ##############################################################
max_num_failures: int = 50 # Maximum number of failures allowed before stopping generation # Debugging parameters, which can help diagnose low success
num_demo_to_render: int = 50 # Number of demonstrations to render # rates.
num_fail_demo_to_render: int = 50 # Number of failed demonstrations to render
seed: int = 1 # Seed for randomization to ensure reproducibility # Whether to keep failed generation trials. Keeping failed
# demonstrations is useful for visualizing and debugging low
# success rates.
generation_keep_failed: bool = False
# Maximum number of failures allowed before stopping generation
max_num_failures: int = 50
# Seed for randomization to ensure reproducibility
seed: int = 1
##############################################################
# The following values can be changed on the command line, and
# only serve as defaults.
# Path to the source dataset for mimic generation
source_dataset_path: str = None
# Path where the generated data will be saved
generation_path: str = None
# Number of trials to be generated
generation_num_trials: int = 10
# Name of the task being configured
task_name: str = None
##############################################################
# Advanced configuration, does not usually need to be changed
# Whether to select source data per subtask
# Note: this requires subtasks to be properly temporally
# constrained, and may require additional subtasks to allow
# for time synchronization.
generation_select_src_per_subtask: bool = False
# Whether to transform the first robot pose during generation
generation_transform_first_robot_pose: bool = False
# Whether to interpolate from last target pose
generation_interpolate_from_last_target_pose: bool = True
@configclass @configclass
class SubTaskConfig: class SubTaskConfig:
"""Configuration settings specific to the management of individual subtasks.""" """
Configuration settings specific to the management of individual
subtasks.
"""
##############################################################
# Mandatory options that should be defined for every subtask
# Reference to the object involved in this subtask, None if no
# object is involved (this is rarely the case).
object_ref: str = None
# Signal for subtask termination
subtask_term_signal: str = None
##############################################################
# Advanced options for tuning the generation results
# Strategy on how to select a subtask segment. Can be either
# 'random', 'nearest_neighbor_object' or
# 'nearest_neighbor_robot_distance'. Details can be found in
# source/isaaclab_mimic/isaaclab_mimic/datagen/selection_strategy.py
#
# Note: for 'nearest_neighbor_object' and
# 'nearest_neighbor_robot_distance', the subtask needs to have
# 'object_ref' set to a value other than 'None' above. At the
# same time, if 'object_ref' is not 'None', then either of
# those strategies will usually yield higher success rates
# than the default 'random' strategy.
selection_strategy: str = "random"
# Additional arguments to the selected strategy. See details on
# each strategy in
# source/isaaclab_mimic/isaaclab_mimic/datagen/selection_strategy.py
# Arguments will be passed through to the `select_source_demo`
# method.
selection_strategy_kwargs: dict = {}
object_ref: str = None # Reference to the object involved in this subtask # Range for start offset of the first subtask
subtask_term_signal: str = None # Signal for subtask termination first_subtask_start_offset_range: tuple = (0, 0)
subtask_term_offset_range: tuple = (0, 0) # Range for offsetting subtask termination
selection_strategy: str = None # Strategy for selecting subtask # Range for offsetting subtask termination
selection_strategy_kwargs: dict = {} # Keyword arguments for the selection strategy subtask_term_offset_range: tuple = (0, 0)
action_noise: float = 0.03 # Amplitude of action noise applied
num_interpolation_steps: int = 5 # Number of steps for interpolation between waypoints # Amplitude of action noise applied
num_fixed_steps: int = 0 # Number of fixed steps for the subtask action_noise: float = 0.03
apply_noise_during_interpolation: bool = False # Whether to apply noise during interpolation
# Number of steps for interpolation between waypoints
num_interpolation_steps: int = 5
# Number of fixed steps for the subtask
num_fixed_steps: int = 0
# Whether to apply noise during interpolation
apply_noise_during_interpolation: bool = False
@configclass @configclass
class MimicEnvCfg: class MimicEnvCfg:
"""Configuration class for the Mimic environment integration. """
Configuration class for the Mimic environment integration.
This class consolidates various configuration aspects for the Isaac Lab Mimic data generation pipeline. This class consolidates various configuration aspects for the
Isaac Lab Mimic data generation pipeline.
""" """
datagen_config: DataGenConfig = DataGenConfig() # Configuration for the data generation # Overall configuration for the data generation
subtask_configs: list[SubTaskConfig] = [] # List of configurations for each subtask datagen_config: DataGenConfig = DataGenConfig()
# Dictionary of lists of subtask configurations for each end-effector.
# Keys are end-effector names.
# Currently, only a single end-effector is supported by Isaac Lab Mimic
# so `subtask_configs` must always be of size 1.
subtask_configs: dict[str, list[SubTaskConfig]] = {}
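A rough sketch of the dictionary layout and the single end-effector restriction described in the comments above. Plain dataclasses stand in for `@configclass`, and only a few `SubTaskConfig` fields are reproduced. The end-effector name `"franka"`, the object names, the signal name and the helper `single_eef_subtasks` are all made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SubTaskConfig:
    """Reduced stand-in: only a few of the fields shown in the diff."""

    object_ref: str | None = None
    subtask_term_signal: str | None = None
    selection_strategy: str = "random"
    subtask_term_offset_range: tuple = (0, 0)


@dataclass
class MimicEnvCfg:
    # Keys are end-effector names, values are ordered subtask lists.
    subtask_configs: dict[str, list[SubTaskConfig]] = field(default_factory=dict)


def single_eef_subtasks(cfg: MimicEnvCfg) -> tuple[str, list[SubTaskConfig]]:
    """Return the only (eef_name, subtasks) pair, or fail loudly."""
    if len(cfg.subtask_configs) != 1:
        raise ValueError("Data generation currently supports only one end-effector.")
    # Single-element tuple unpacking doubles as a size check.
    (eef_name,) = cfg.subtask_configs.keys()
    (subtasks,) = cfg.subtask_configs.values()
    # The final subtask must not randomize its termination offset.
    if tuple(subtasks[-1].subtask_term_offset_range) != (0, 0):
        raise ValueError("Final subtask must use subtask_term_offset_range=(0, 0).")
    return eef_name, subtasks


cfg = MimicEnvCfg(
    subtask_configs={
        "franka": [
            SubTaskConfig(
                object_ref="cube_2",
                subtask_term_signal="grasp_1",
                selection_strategy="nearest_neighbor_object",
                subtask_term_offset_range=(10, 20),
            ),
            SubTaskConfig(object_ref="cube_1"),
        ]
    }
)
```

Keying by end-effector name keeps the single-arm case short while leaving room for further keys once multiple end-effectors are supported.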
...@@ -47,9 +47,14 @@ class DataGenerator: ...@@ -47,9 +47,14 @@ class DataGenerator:
assert isinstance(self.env_cfg, MimicEnvCfg) assert isinstance(self.env_cfg, MimicEnvCfg)
self.dataset_path = dataset_path self.dataset_path = dataset_path
if len(self.env_cfg.subtask_configs) != 1:
raise ValueError("Data generation currently supports only one end-effector.")
(self.eef_name,) = self.env_cfg.subtask_configs.keys()
(self.subtask_configs,) = self.env_cfg.subtask_configs.values()
# sanity check on task spec offset ranges - final subtask should not have any offset randomization # sanity check on task spec offset ranges - final subtask should not have any offset randomization
assert self.env_cfg.subtask_configs[-1].subtask_term_offset_range[0] == 0 assert self.subtask_configs[-1].subtask_term_offset_range[0] == 0
assert self.env_cfg.subtask_configs[-1].subtask_term_offset_range[1] == 0 assert self.subtask_configs[-1].subtask_term_offset_range[1] == 0
self.demo_keys = demo_keys self.demo_keys = demo_keys
...@@ -88,8 +93,8 @@ class DataGenerator: ...@@ -88,8 +93,8 @@ class DataGenerator:
# add them to subtask end indices, and then set them as the start indices of next subtask too # add them to subtask end indices, and then set them as the start indices of next subtask too
for i in range(src_subtask_indices.shape[1] - 1): for i in range(src_subtask_indices.shape[1] - 1):
end_offsets = np.random.randint( end_offsets = np.random.randint(
low=self.env_cfg.subtask_configs[i].subtask_term_offset_range[0], low=self.subtask_configs[i].subtask_term_offset_range[0],
high=self.env_cfg.subtask_configs[i].subtask_term_offset_range[1] + 1, high=self.subtask_configs[i].subtask_term_offset_range[1] + 1,
size=src_subtask_indices.shape[0], size=src_subtask_indices.shape[0],
) )
src_subtask_indices[:, i, 1] = src_subtask_indices[:, i, 1] + end_offsets src_subtask_indices[:, i, 1] = src_subtask_indices[:, i, 1] + end_offsets
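The offset step above can be sketched without NumPy. This is an illustrative approximation for a single demo: `random.Random.randrange` stands in for `np.random.randint` (both exclude the upper bound, hence the `+ 1`), and `apply_offsets` is a hypothetical helper, not part of the codebase.

```python
import random


def apply_offsets(subtask_bounds, offset_ranges, rng):
    """Shift subtask end indices by a random offset and keep neighbours aligned.

    subtask_bounds: list of [start, end] index pairs, one per subtask.
    offset_ranges: list of (low, high) inclusive offset ranges, one per subtask.
    """
    out = [list(bounds) for bounds in subtask_bounds]
    # The final subtask is skipped: its end index is never randomized.
    for i in range(len(out) - 1):
        low, high = offset_ranges[i]
        offset = rng.randrange(low, high + 1)  # upper bound is exclusive
        out[i][1] += offset
        # The next subtask starts where this one now ends.
        out[i + 1][0] = out[i][1]
    return out


shifted = apply_offsets(
    subtask_bounds=[[0, 10], [10, 25], [25, 40]],
    offset_ranges=[(2, 5), (0, 0), (0, 0)],
    rng=random.Random(0),
)
```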
...@@ -235,6 +240,8 @@ class DataGenerator: ...@@ -235,6 +240,8 @@ class DataGenerator:
src_demo_inds (list): list of selected source demonstration indices for each subtask src_demo_inds (list): list of selected source demonstration indices for each subtask
src_demo_labels (np.array): same as @src_demo_inds, but repeated to have a label for each timestep of the trajectory src_demo_labels (np.array): same as @src_demo_inds, but repeated to have a label for each timestep of the trajectory
""" """
eef_names = list(self.env_cfg.subtask_configs.keys())
eef_name = eef_names[0]
# reset the env to create a new task demo instance # reset the env to create a new task demo instance
env_id_tensor = torch.tensor([env_id], dtype=torch.int64, device=self.env.device) env_id_tensor = torch.tensor([env_id], dtype=torch.int64, device=self.env.device)
...@@ -257,17 +264,17 @@ class DataGenerator: ...@@ -257,17 +264,17 @@ class DataGenerator:
) # like @generated_src_demo_inds, but padded to align with size of @generated_actions ) # like @generated_src_demo_inds, but padded to align with size of @generated_actions
prev_src_demo_datagen_info_pool_size = 0 prev_src_demo_datagen_info_pool_size = 0
for subtask_ind in range(len(self.env_cfg.subtask_configs)): for subtask_ind in range(len(self.subtask_configs)):
# some things only happen on first subtask # some things only happen on first subtask
is_first_subtask = subtask_ind == 0 is_first_subtask = subtask_ind == 0
# name of object for this subtask # name of object for this subtask
subtask_object_name = self.env_cfg.subtask_configs[subtask_ind].object_ref subtask_object_name = self.subtask_configs[subtask_ind].object_ref
# corresponding current object pose # corresponding current object pose
cur_object_pose = ( cur_object_pose = (
self.env.get_object_poses(env_ind=env_id)[subtask_object_name] self.env.get_object_poses(env_ids=[env_id])[subtask_object_name][0]
if (subtask_object_name is not None) if (subtask_object_name is not None)
else None else None
) )
...@@ -288,13 +295,13 @@ class DataGenerator: ...@@ -288,13 +295,13 @@ class DataGenerator:
# Run source demo selection or use selected demo from previous iteration # Run source demo selection or use selected demo from previous iteration
if need_source_demo_selection: if need_source_demo_selection:
selected_src_demo_ind = self.select_source_demo( selected_src_demo_ind = self.select_source_demo(
eef_pose=self.env.get_robot_eef_pose(env_ind=env_id), eef_pose=self.env.get_robot_eef_pose(eef_name, env_ids=[env_id])[0],
object_pose=cur_object_pose, object_pose=cur_object_pose,
subtask_ind=subtask_ind, subtask_ind=subtask_ind,
src_subtask_inds=all_subtask_inds[:, subtask_ind], src_subtask_inds=all_subtask_inds[:, subtask_ind],
subtask_object_name=subtask_object_name, subtask_object_name=subtask_object_name,
selection_strategy_name=self.env_cfg.subtask_configs[subtask_ind].selection_strategy, selection_strategy_name=self.subtask_configs[subtask_ind].selection_strategy,
selection_strategy_kwargs=self.env_cfg.subtask_configs[subtask_ind].selection_strategy_kwargs, selection_strategy_kwargs=self.subtask_configs[subtask_ind].selection_strategy_kwargs,
) )
assert selected_src_demo_ind is not None assert selected_src_demo_ind is not None
...@@ -356,17 +363,19 @@ class DataGenerator: ...@@ -356,17 +363,19 @@ class DataGenerator:
else: else:
# Interpolation segment will start from current robot eef pose. # Interpolation segment will start from current robot eef pose.
init_sequence = WaypointSequence.from_poses( init_sequence = WaypointSequence.from_poses(
poses=self.env.get_robot_eef_pose(env_ind=env_id)[None], eef_names=eef_names,
poses=self.env.get_robot_eef_pose(eef_name, env_ids=[env_id])[0][None],
gripper_actions=src_subtask_gripper_actions[0:1], gripper_actions=src_subtask_gripper_actions[0:1],
action_noise=self.env_cfg.subtask_configs[subtask_ind].action_noise, action_noise=self.subtask_configs[subtask_ind].action_noise,
) )
traj_to_execute.add_waypoint_sequence(init_sequence) traj_to_execute.add_waypoint_sequence(init_sequence)
# Construct trajectory for the transformed segment. # Construct trajectory for the transformed segment.
transformed_seq = WaypointSequence.from_poses( transformed_seq = WaypointSequence.from_poses(
eef_names=eef_names,
poses=transformed_eef_poses, poses=transformed_eef_poses,
gripper_actions=src_subtask_gripper_actions, gripper_actions=src_subtask_gripper_actions,
action_noise=self.env_cfg.subtask_configs[subtask_ind].action_noise, action_noise=self.subtask_configs[subtask_ind].action_noise,
) )
transformed_traj = WaypointTrajectory() transformed_traj = WaypointTrajectory()
transformed_traj.add_waypoint_sequence(transformed_seq) transformed_traj.add_waypoint_sequence(transformed_seq)
...@@ -375,11 +384,12 @@ class DataGenerator: ...@@ -375,11 +384,12 @@ class DataGenerator:
# Interpolation will happen from the initial pose (@init_sequence) to the first element of @transformed_seq. # Interpolation will happen from the initial pose (@init_sequence) to the first element of @transformed_seq.
traj_to_execute.merge( traj_to_execute.merge(
transformed_traj, transformed_traj,
num_steps_interp=self.env_cfg.subtask_configs[subtask_ind].num_interpolation_steps, eef_names=eef_names,
num_steps_fixed=self.env_cfg.subtask_configs[subtask_ind].num_fixed_steps, num_steps_interp=self.subtask_configs[subtask_ind].num_interpolation_steps,
num_steps_fixed=self.subtask_configs[subtask_ind].num_fixed_steps,
action_noise=( action_noise=(
float(self.env_cfg.subtask_configs[subtask_ind].apply_noise_during_interpolation) float(self.subtask_configs[subtask_ind].apply_noise_during_interpolation)
* self.env_cfg.subtask_configs[subtask_ind].action_noise * self.subtask_configs[subtask_ind].action_noise
), ),
) )
......
...@@ -36,9 +36,13 @@ class DataGenInfoPool: ...@@ -36,9 +36,13 @@ class DataGenInfoPool:
self._asyncio_lock = asyncio_lock self._asyncio_lock = asyncio_lock
self.subtask_term_signals = [subtask_config.subtask_term_signal for subtask_config in env_cfg.subtask_configs] if len(env_cfg.subtask_configs) != 1:
raise ValueError("Data generation currently supports only one end-effector.")
(subtask_configs,) = env_cfg.subtask_configs.values()
self.subtask_term_signals = [subtask_config.subtask_term_signal for subtask_config in subtask_configs]
self.subtask_term_offset_ranges = [ self.subtask_term_offset_ranges = [
subtask_config.subtask_term_offset_range for subtask_config in env_cfg.subtask_configs subtask_config.subtask_term_offset_range for subtask_config in subtask_configs
] ]
@property @property
...@@ -82,20 +86,24 @@ class DataGenInfoPool: ...@@ -82,20 +86,24 @@ class DataGenInfoPool:
episode (EpisodeData): episode to add episode (EpisodeData): episode to add
""" """
ep_grp = episode.data ep_grp = episode.data
eef_name = list(self.env.cfg.subtask_configs.keys())[0]
# extract datagen info # extract datagen info
if "datagen_info" in ep_grp["obs"]:
eef_pose = ep_grp["obs"]["datagen_info"]["eef_pose"][eef_name]
object_poses_dict = ep_grp["obs"]["datagen_info"]["object_pose"]
target_eef_pose = ep_grp["obs"]["datagen_info"]["target_eef_pose"][eef_name]
subtask_term_signals_dict = ep_grp["obs"]["datagen_info"]["subtask_term_signals"]
else:
# Extract eef poses # Extract eef poses
eef_pos = ep_grp["obs"]["eef_pos"] eef_pos = ep_grp["obs"]["eef_pos"]
# format (w, x, y, z) eef_quat = ep_grp["obs"]["eef_quat"] # format (w, x, y, z)
eef_quat = ep_grp["obs"]["eef_quat"]
eef_rot_matrices = PoseUtils.matrix_from_quat(eef_quat) # shape (N, 3, 3) eef_rot_matrices = PoseUtils.matrix_from_quat(eef_quat) # shape (N, 3, 3)
# Create pose matrices for all environments # Create pose matrices for all environments
eef_pose = PoseUtils.make_pose(eef_pos, eef_rot_matrices) # shape (N, 4, 4) eef_pose = PoseUtils.make_pose(eef_pos, eef_rot_matrices) # shape (N, 4, 4)
# Object poses
object_poses_dict = dict() object_poses_dict = dict()
# TODO: change object_pose key in the dataset to object_state since it is not just the pose
for object_name, value in ep_grp["obs"]["object_pose"].items(): for object_name, value in ep_grp["obs"]["object_pose"].items():
# object_pose # object_pose
value = value["root_pose"] value = value["root_pose"]
...@@ -104,19 +112,23 @@ class DataGenInfoPool: ...@@ -104,19 +112,23 @@ class DataGenInfoPool:
# Convert to rotation matrices # Convert to rotation matrices
object_rot_matrices = PoseUtils.matrix_from_quat(value[:, 3:7]) # shape (N, 3, 3) object_rot_matrices = PoseUtils.matrix_from_quat(value[:, 3:7]) # shape (N, 3, 3)
object_rot_positions = value[:, 0:3] # shape (N, 3) object_rot_positions = value[:, 0:3] # shape (N, 3)
object_poses_dict[object_name] = PoseUtils.make_pose(object_rot_positions, object_rot_matrices) object_poses_dict[object_name] = PoseUtils.make_pose(object_rot_positions, object_rot_matrices)
# Target eef pose
target_eef_pose = ep_grp["obs"]["target_eef_pose"]
# Subtask termination signals
subtask_term_signals_dict = ep_grp["obs"]["subtask_term_signals"]
# Extract gripper actions # Extract gripper actions
gripper_actions = self.env.action_to_gripper_action(ep_grp["actions"]) gripper_actions = self.env.actions_to_gripper_actions(ep_grp["actions"])[eef_name]
ep_datagen_info_obj = DatagenInfo( ep_datagen_info_obj = DatagenInfo(
eef_pose=eef_pose, eef_pose=eef_pose,
object_poses=object_poses_dict, object_poses=object_poses_dict,
subtask_term_signals=ep_grp["obs"]["subtask_term_signals"], subtask_term_signals=subtask_term_signals_dict,
target_eef_pose=ep_grp["obs"]["target_eef_pose"], target_eef_pose=target_eef_pose,
gripper_action=gripper_actions, gripper_action=gripper_actions,
) )
self._datagen_infos.append(ep_datagen_info_obj) self._datagen_infos.append(ep_datagen_info_obj)
......
...@@ -18,7 +18,7 @@ class Waypoint: ...@@ -18,7 +18,7 @@ class Waypoint:
Represents a single desired 6-DoF waypoint, along with corresponding gripper actuation for this point. Represents a single desired 6-DoF waypoint, along with corresponding gripper actuation for this point.
""" """
def __init__(self, pose, gripper_action, noise=None): def __init__(self, eef_names, pose, gripper_action, noise=None):
""" """
Args: Args:
pose (torch.Tensor): 4x4 pose target for robot controller pose (torch.Tensor): 4x4 pose target for robot controller
...@@ -26,10 +26,10 @@ class Waypoint: ...@@ -26,10 +26,10 @@ class Waypoint:
noise (float or None): action noise amplitude to apply during execution at this timestep noise (float or None): action noise amplitude to apply during execution at this timestep
(for arm actions, not gripper actions) (for arm actions, not gripper actions)
""" """
self.eef_names = eef_names
self.pose = pose self.pose = pose
self.gripper_action = gripper_action self.gripper_action = gripper_action
self.noise = noise self.noise = noise
assert len(self.gripper_action.shape) == 1
def __str__(self): def __str__(self):
"""String representation of the waypoint.""" """String representation of the waypoint."""
...@@ -54,7 +54,7 @@ class WaypointSequence: ...@@ -54,7 +54,7 @@ class WaypointSequence:
self.sequence = deepcopy(sequence) self.sequence = deepcopy(sequence)
@classmethod @classmethod
def from_poses(cls, poses, gripper_actions, action_noise): def from_poses(cls, eef_names, poses, gripper_actions, action_noise):
""" """
Instantiate a WaypointSequence object given a sequence of poses, Instantiate a WaypointSequence object given a sequence of poses,
gripper actions, and action noise. gripper actions, and action noise.
...@@ -79,6 +79,7 @@ class WaypointSequence: ...@@ -79,6 +79,7 @@ class WaypointSequence:
# make WaypointSequence instance # make WaypointSequence instance
sequence = [ sequence = [
Waypoint( Waypoint(
eef_names=eef_names,
pose=poses[t], pose=poses[t],
gripper_action=gripper_actions[t], gripper_action=gripper_actions[t],
noise=action_noise[t, 0], noise=action_noise[t, 0],
...@@ -201,6 +202,7 @@ class WaypointTrajectory: ...@@ -201,6 +202,7 @@ class WaypointTrajectory:
def add_waypoint_sequence_for_target_pose( def add_waypoint_sequence_for_target_pose(
self, self,
eef_names,
pose, pose,
gripper_action, gripper_action,
num_steps, num_steps,
...@@ -252,6 +254,7 @@ class WaypointTrajectory: ...@@ -252,6 +254,7 @@ class WaypointTrajectory:
# add waypoint sequence for this set of poses # add waypoint sequence for this set of poses
sequence = WaypointSequence.from_poses( sequence = WaypointSequence.from_poses(
eef_names=eef_names,
poses=poses, poses=poses,
gripper_actions=gripper_actions, gripper_actions=gripper_actions,
action_noise=action_noise, action_noise=action_noise,
...@@ -278,6 +281,7 @@ class WaypointTrajectory: ...@@ -278,6 +281,7 @@ class WaypointTrajectory:
def merge( def merge(
self, self,
other, other,
eef_names,
num_steps_interp=None, num_steps_interp=None,
num_steps_fixed=None, num_steps_fixed=None,
action_noise=0.0, action_noise=0.0,
...@@ -311,6 +315,7 @@ class WaypointTrajectory: ...@@ -311,6 +315,7 @@ class WaypointTrajectory:
if need_interp: if need_interp:
# interpolation segment # interpolation segment
self.add_waypoint_sequence_for_target_pose( self.add_waypoint_sequence_for_target_pose(
eef_names=eef_names,
pose=target_for_interpolation.pose, pose=target_for_interpolation.pose,
gripper_action=target_for_interpolation.gripper_action, gripper_action=target_for_interpolation.gripper_action,
num_steps=num_steps_interp, num_steps=num_steps_interp,
...@@ -324,6 +329,7 @@ class WaypointTrajectory: ...@@ -324,6 +329,7 @@ class WaypointTrajectory:
# account for the fact that we pop'd the first element of @other in anticipation of an interpolation segment # account for the fact that we pop'd the first element of @other in anticipation of an interpolation segment
num_steps_fixed_to_use = num_steps_fixed if need_interp else (num_steps_fixed + 1) num_steps_fixed_to_use = num_steps_fixed if need_interp else (num_steps_fixed + 1)
self.add_waypoint_sequence_for_target_pose( self.add_waypoint_sequence_for_target_pose(
eef_names=eef_names,
pose=target_for_interpolation.pose, pose=target_for_interpolation.pose,
gripper_action=target_for_interpolation.gripper_action, gripper_action=target_for_interpolation.gripper_action,
num_steps=num_steps_fixed_to_use, num_steps=num_steps_fixed_to_use,
...@@ -382,17 +388,15 @@ class WaypointTrajectory: ...@@ -382,17 +388,15 @@ class WaypointTrajectory:
obs = env.obs_buf obs = env.obs_buf
state = env.scene.get_state(is_relative=True) state = env.scene.get_state(is_relative=True)
# convert target pose to arm action # convert target pose and gripper action to env action
action_pose = env.target_eef_pose_to_action(target_eef_pose=waypoint.pose, env_ind=env_id) target_eef_pose_dict = {waypoint.eef_names[0]: waypoint.pose}
gripper_action_dict = {waypoint.eef_names[0]: waypoint.gripper_action}
# maybe add noise to action using torch.randn play_action = env.target_eef_pose_to_action(
if waypoint.noise is not None: target_eef_pose_dict=target_eef_pose_dict,
noise = waypoint.noise * torch.randn_like(action_pose) gripper_action_dict=gripper_action_dict,
action_pose += noise noise=waypoint.noise,
action_pose = torch.clamp(action_pose, -1.0, 1.0) env_id=env_id,
)
# add in gripper action
play_action = torch.cat([action_pose, waypoint.gripper_action], dim=0)
# step environment # step environment
if not isinstance(play_action, torch.Tensor): if not isinstance(play_action, torch.Tensor):
......
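With this change, adding noise and appending the gripper action become the job of the environment's `target_eef_pose_to_action`. Below is one way an implementation might fold in the steps removed from this hunk. It is a sketch under assumptions: plain lists and `random.gauss` stand in for torch tensors and `torch.randn_like`, the helper name `assemble_action` is hypothetical, and clamping is assumed to apply only when noise is added.

```python
import random


def assemble_action(pose_action, gripper_action, noise=None, rng=None):
    """Combine an arm pose action and a gripper action into one env action."""
    if rng is None:
        rng = random.Random()
    arm = list(pose_action)
    if noise is not None:
        # Gaussian noise scaled by the configured amplitude (arm part only).
        arm = [a + noise * rng.gauss(0.0, 1.0) for a in arm]
        # Keep the normalized action inside [-1, 1] after perturbation.
        arm = [max(-1.0, min(1.0, a)) for a in arm]
    # The gripper part is appended unchanged.
    return arm + list(gripper_action)


clean = assemble_action([0.5, -0.2], [1.0])
noisy = assemble_action([0.99] * 6, [-1.0], noise=0.5, rng=random.Random(0))
```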
...@@ -4,6 +4,7 @@ ...@@ -4,6 +4,7 @@
# SPDX-License-Identifier: Apache-2.0 # SPDX-License-Identifier: Apache-2.0
import torch import torch
from collections.abc import Sequence
import isaaclab.utils.math as PoseUtils import isaaclab.utils.math as PoseUtils
from isaaclab.envs import ManagerBasedRLMimicEnv from isaaclab.envs import ManagerBasedRLMimicEnv
...@@ -14,79 +15,92 @@ class FrankaCubeStackIKRelMimicEnv(ManagerBasedRLMimicEnv): ...@@ -14,79 +15,92 @@ class FrankaCubeStackIKRelMimicEnv(ManagerBasedRLMimicEnv):
Isaac Lab Mimic environment wrapper class for Franka Cube Stack IK Rel env. Isaac Lab Mimic environment wrapper class for Franka Cube Stack IK Rel env.
""" """
def get_robot_eef_pose(self, env_ind=0): def get_robot_eef_pose(self, eef_name: str, env_ids: Sequence[int] | None = None) -> torch.Tensor:
""" """
Get current robot end effector pose. Should be the same frame as used by the robot end-effector controller. Get current robot end effector pose. Should be the same frame as used by the robot end-effector controller.
Args:
eef_name: Name of the end effector.
env_ids: Environment indices to get the pose for. If None, all envs are considered.
Returns: Returns:
pose (torch.Tensor): 4x4 eef pose matrix A torch.Tensor eef pose matrix. Shape is (len(env_ids), 4, 4)
""" """
if env_ids is None:
env_ids = slice(None)
# Retrieve end effector pose from the observation buffer # Retrieve end effector pose from the observation buffer
eef_pos = self.obs_buf["policy"]["eef_pos"][env_ind] eef_pos = self.obs_buf["policy"]["eef_pos"][env_ids]
eef_quat = self.obs_buf["policy"]["eef_quat"][env_ind] eef_quat = self.obs_buf["policy"]["eef_quat"][env_ids]
# Quaternion format is w,x,y,z # Quaternion format is w,x,y,z
return PoseUtils.make_pose(eef_pos, PoseUtils.matrix_from_quat(eef_quat)) return PoseUtils.make_pose(eef_pos, PoseUtils.matrix_from_quat(eef_quat))
def target_eef_pose_to_action(self, target_eef_pose, relative=True, env_ind=0): def target_eef_pose_to_action(
self, target_eef_pose_dict: dict, gripper_action_dict: dict, noise: float | None = None, env_id: int = 0
) -> torch.Tensor:
""" """
Takes a target pose for the end effector controller and returns an action Takes a target pose and gripper action for the end effector controller and returns an action
(usually a normalized delta pose action) to try and achieve that target pose. (usually a normalized delta pose action) to try and achieve that target pose.
Noise is added to the target pose action if specified.
Args: Args:
target_eef_pose (torch.Tensor): 4x4 target eef pose target_eef_pose_dict: Dictionary of 4x4 target eef pose for each end-effector.
relative (bool): if True, use relative pose actions, else absolute pose actions gripper_action_dict: Dictionary of gripper actions for each end-effector.
noise: Noise to add to the action. If None, no noise is added.
env_id: Environment index to get the action for.
Returns: Returns:
action (torch.Tensor): action compatible with env.step (minus gripper actuation) An action torch.Tensor that's compatible with env.step().
""" """
eef_name = list(self.cfg.subtask_configs.keys())[0]
# target position and rotation # target position and rotation
(target_eef_pose,) = target_eef_pose_dict.values()
target_pos, target_rot = PoseUtils.unmake_pose(target_eef_pose) target_pos, target_rot = PoseUtils.unmake_pose(target_eef_pose)
# current position and rotation # current position and rotation
curr_pose = self.get_robot_eef_pose(env_ind=env_ind) curr_pose = self.get_robot_eef_pose(eef_name, env_ids=[env_id])[0]
curr_pos, curr_rot = PoseUtils.unmake_pose(curr_pose) curr_pos, curr_rot = PoseUtils.unmake_pose(curr_pose)
if relative:
# normalized delta position action # normalized delta position action
delta_position = target_pos - curr_pos delta_position = target_pos - curr_pos
# delta_position = np.clip(delta_position / max_dpos, -1., 1.)
# normalized delta rotation action # normalized delta rotation action
delta_rot_mat = target_rot.matmul(curr_rot.transpose(-1, -2)) delta_rot_mat = target_rot.matmul(curr_rot.transpose(-1, -2))
delta_quat = PoseUtils.quat_from_matrix(delta_rot_mat) delta_quat = PoseUtils.quat_from_matrix(delta_rot_mat)
delta_rotation = PoseUtils.axis_angle_from_quat(delta_quat) delta_rotation = PoseUtils.axis_angle_from_quat(delta_quat)
# delta_rotation = np.clip(delta_rotation / max_drot, -1., 1.) # get gripper action for single eef
return torch.cat([delta_position, delta_rotation], dim=0) (gripper_action,) = gripper_action_dict.values()
else:
raise NotImplementedError("Absolute pose actions are not implemented.") # add noise to action
return pose_action = torch.cat([delta_position, delta_rotation], dim=0)
if noise is not None:
noise = noise * torch.randn_like(pose_action)
pose_action += noise
pose_action = torch.clamp(pose_action, -1.0, 1.0)
return torch.cat([pose_action, gripper_action], dim=0)
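The noise handling in the new `target_eef_pose_to_action` can be pictured with a small standalone sketch. It uses plain Python lists rather than tensors, and the helper name `add_noise_and_clamp` is hypothetical. The flattened diff does not show whether the clamp sits inside the `if noise is not None` branch; this sketch assumes it does, so an action without noise is returned unchanged.

```python
import random


def add_noise_and_clamp(pose_action, noise=None, rng=None):
    """Add scaled Gaussian noise to a 6-D delta pose action and clamp it to [-1, 1]."""
    rng = rng or random.Random()
    if noise is None:
        # Assumption: without noise the action is passed through untouched.
        return list(pose_action)
    noisy = [a + noise * rng.gauss(0.0, 1.0) for a in pose_action]
    return [max(-1.0, min(1.0, a)) for a in noisy]


clean = add_noise_and_clamp([0.1, -0.2, 0.0, 0.0, 0.0, 0.3])
noisy = add_noise_and_clamp(
    [0.9, -0.9, 0.0, 0.0, 0.0, 0.3], noise=0.5, rng=random.Random(0)
)
```

Because the gripper action is concatenated after this step in the diff, the noise affects only the six pose components.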
def action_to_target_eef_pos(self, action, relative=True, env_ind=0): def action_to_target_eef_pose(self, action: torch.Tensor) -> dict[str, torch.Tensor]:
""" """
Converts action (compatible with env.step) to a target pose for the end effector controller. Converts action (compatible with env.step) to a target pose for the end effector controller.
Inverse of @target_eef_pose_to_action. Usually used to infer a sequence of target controller poses Inverse of @target_eef_pose_to_action. Usually used to infer a sequence of target controller poses
from a demonstration trajectory using the recorded actions. from a demonstration trajectory using the recorded actions.
Args: Args:
action (torch.Tensor): environment action action: Environment action. Shape is (num_envs, action_dim)
relative (bool): if True, use relative pose actions, else absolute pose actions
Returns: Returns:
target_eef_pose (torch.Tensor): 4x4 target eef pose that @action corresponds to A dictionary of eef pose torch.Tensor that @action corresponds to
""" """
eef_name = list(self.cfg.subtask_configs.keys())[0]
target_poses = [] delta_position = action[:, :3]
delta_rotation = action[:, 3:6]
for env_ind in range(self.scene.num_envs):
delta_position = action[env_ind][:3]
delta_rotation = action[env_ind][3:6]
# current position and rotation # current position and rotation
curr_pose = self.get_robot_eef_pose(env_ind=env_ind) curr_pose = self.get_robot_eef_pose(eef_name, env_ids=None)
curr_pos, curr_rot = PoseUtils.unmake_pose(curr_pose) curr_pos, curr_rot = PoseUtils.unmake_pose(curr_pose)
# get pose target # get pose target
...@@ -94,51 +108,54 @@ class FrankaCubeStackIKRelMimicEnv(ManagerBasedRLMimicEnv): ...@@ -94,51 +108,54 @@ class FrankaCubeStackIKRelMimicEnv(ManagerBasedRLMimicEnv):
# Convert delta_rotation to axis angle form # Convert delta_rotation to axis angle form
delta_rotation_angle = torch.linalg.norm(delta_rotation, dim=-1, keepdim=True) delta_rotation_angle = torch.linalg.norm(delta_rotation, dim=-1, keepdim=True)
# make sure that axis is a unit vector
# Check for invalid division
if torch.isclose(delta_rotation_angle, torch.tensor([0.0], device=delta_rotation_angle.device)):
# Quaternion format is wxyz
delta_quat = torch.tensor([1.0, 0.0, 0.0, 0.0], device=delta_rotation_angle.device)
else:
delta_rotation_axis = delta_rotation / delta_rotation_angle delta_rotation_axis = delta_rotation / delta_rotation_angle
delta_quat = PoseUtils.quat_from_angle_axis(delta_rotation_angle, delta_rotation_axis).squeeze(0)
delta_rot_mat = PoseUtils.matrix_from_quat(delta_quat)
# Handle invalid division for the case when delta_rotation_angle is close to zero
is_close_to_zero_angle = torch.isclose(delta_rotation_angle, torch.zeros_like(delta_rotation_angle)).squeeze(1)
delta_rotation_axis[is_close_to_zero_angle] = torch.zeros_like(delta_rotation_axis)[is_close_to_zero_angle]
delta_quat = PoseUtils.quat_from_angle_axis(delta_rotation_angle.squeeze(1), delta_rotation_axis).squeeze(0)
delta_rot_mat = PoseUtils.matrix_from_quat(delta_quat)
target_rot = torch.matmul(delta_rot_mat, curr_rot) target_rot = torch.matmul(delta_rot_mat, curr_rot)
target_pose = PoseUtils.make_pose(target_pos, target_rot).clone() target_poses = PoseUtils.make_pose(target_pos, target_rot).clone()
target_poses.append(target_pose) return {eef_name: target_poses}
return target_poses
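The vectorized version above has to avoid dividing by a zero rotation angle for every row at once. A scalar reference can be useful for checking the batched result. The sketch below is pure Python, not the torch code from the diff. The helper names and the `eps` threshold are assumptions chosen for illustration.

```python
import math


def quat_from_axis_angle_vec(delta_rotation, eps=1e-8):
    """Convert a rotation vector (axis * angle) to a unit quaternion (w, x, y, z)."""
    angle = math.sqrt(sum(c * c for c in delta_rotation))
    if angle < eps:
        # Near-zero rotation: the axis is undefined, so return the identity quaternion.
        return [1.0, 0.0, 0.0, 0.0]
    axis = [c / angle for c in delta_rotation]
    s = math.sin(angle / 2.0)
    return [math.cos(angle / 2.0)] + [s * a for a in axis]


def quats_from_actions(actions):
    """Apply the conversion to the rotation slice (columns 3:6) of each action row."""
    return [quat_from_axis_angle_vec(row[3:6]) for row in actions]


batch = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],           # no rotation
    [0.0, 0.0, 0.0, 0.0, 0.0, math.pi / 2, -1.0],  # 90 degrees about z
]
quats = quats_from_actions(batch)
```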
def action_to_gripper_action(self, action): def actions_to_gripper_actions(self, actions: torch.Tensor) -> dict[str, torch.Tensor]:
""" """
Extracts the gripper actuation part of an action (compatible with env.step). Extracts the gripper actuation part from a sequence of env actions (compatible with env.step).
Args: Args:
action (torch.Tensor): environment action of shape N x action_dim. Where N is number of steps in a demo actions: environment actions. The shape is (num_envs, num steps in a demo, action_dim).
Returns: Returns:
gripper_action (torch.Tensor): subset of environment action for gripper actuation of shape N x gripper_action_dim A dictionary of torch.Tensor gripper actions. Key to each dict is an eef_name.
""" """
# last dimension is gripper action # last dimension is gripper action
return action[:, -1:] return {list(self.cfg.subtask_configs.keys())[0]: actions[:, -1:]}
def get_subtask_term_signals(self, env_ind=0): def get_subtask_term_signals(self, env_ids: Sequence[int] | None = None) -> dict[str, torch.Tensor]:
""" """
Gets a dictionary of binary flags for each subtask in a task. The flag is 1 Gets a dictionary of termination signal flags for each subtask in a task. The flag is 1
when the subtask has been completed and 0 otherwise. Isaac Lab Mimic only uses this when the subtask has been completed and 0 otherwise. The implementation of this method is
when parsing source demonstrations at the start of data generation, and it only required if intending to enable automatic subtask term signal annotation when running the
uses the first 0 -> 1 transition in this signal to detect the end of a subtask. dataset annotation tool. This method can be kept unimplemented if intending to use manual
subtask term signal annotation.
Args:
env_ids: Environment indices to get the termination signals for. If None, all envs are considered.
Returns: Returns:
subtask_term_signals (dict): dictionary that maps subtask name to termination flag (0 or 1) A dictionary of termination signal flags (False or True) for each subtask.
""" """
signals = dict() if env_ids is None:
env_ids = slice(None)
signals = dict()
subtask_terms = self.obs_buf["subtask_terms"] subtask_terms = self.obs_buf["subtask_terms"]
signals["grasp_1"] = subtask_terms["grasp_1"][env_ind] signals["grasp_1"] = subtask_terms["grasp_1"][env_ids]
signals["grasp_2"] = subtask_terms["grasp_2"][env_ind] signals["grasp_2"] = subtask_terms["grasp_2"][env_ids]
signals["stack_1"] = subtask_terms["stack_1"][env_ind] signals["stack_1"] = subtask_terms["stack_1"][env_ids]
# final subtask is placing cubeC on cubeA (motion relative to cubeA) - but final subtask signal is not needed # final subtask is placing cubeC on cubeA (motion relative to cubeA) - but final subtask signal is not needed
return signals return signals
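The pre-change docstring notes that only the first 0 -> 1 transition of a termination signal is used to detect the end of a subtask. A minimal sketch of that detection over one recorded signal might look like the following. The helper name `first_rising_edge` is hypothetical and is not taken from Isaac Lab Mimic.

```python
def first_rising_edge(signal):
    """Return the index of the first step where the flag goes from 0 to 1, or None."""
    for t in range(1, len(signal)):
        if not signal[t - 1] and signal[t]:
            return t
    return None


# The flag flips at step 2; the later 0 -> 1 transition at step 5 is ignored.
boundary = first_rising_edge([0, 0, 1, 1, 0, 1])
```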
...@@ -31,12 +31,11 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg): ...@@ -31,12 +31,11 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg):
self.datagen_config.generation_transform_first_robot_pose = False self.datagen_config.generation_transform_first_robot_pose = False
self.datagen_config.generation_interpolate_from_last_target_pose = True self.datagen_config.generation_interpolate_from_last_target_pose = True
self.datagen_config.max_num_failures = 25 self.datagen_config.max_num_failures = 25
self.datagen_config.num_demo_to_render = 10
self.datagen_config.num_fail_demo_to_render = 25
self.datagen_config.seed = 1 self.datagen_config.seed = 1
# The following are the subtask configurations for the stack task. # The following are the subtask configurations for the stack task.
self.subtask_configs.append( subtask_configs = []
subtask_configs.append(
SubTaskConfig( SubTaskConfig(
# Each subtask involves manipulation with respect to a single object frame. # Each subtask involves manipulation with respect to a single object frame.
object_ref="cube_2", object_ref="cube_2",
...@@ -60,7 +59,7 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg): ...@@ -60,7 +59,7 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg):
apply_noise_during_interpolation=False, apply_noise_during_interpolation=False,
) )
) )
self.subtask_configs.append( subtask_configs.append(
SubTaskConfig( SubTaskConfig(
# Each subtask involves manipulation with respect to a single object frame. # Each subtask involves manipulation with respect to a single object frame.
object_ref="cube_1", object_ref="cube_1",
...@@ -82,7 +81,7 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg): ...@@ -82,7 +81,7 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg):
apply_noise_during_interpolation=False, apply_noise_during_interpolation=False,
) )
) )
self.subtask_configs.append( subtask_configs.append(
SubTaskConfig( SubTaskConfig(
# Each subtask involves manipulation with respect to a single object frame. # Each subtask involves manipulation with respect to a single object frame.
object_ref="cube_3", object_ref="cube_3",
...@@ -104,7 +103,7 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg): ...@@ -104,7 +103,7 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg):
apply_noise_during_interpolation=False, apply_noise_during_interpolation=False,
) )
) )
self.subtask_configs.append( subtask_configs.append(
SubTaskConfig( SubTaskConfig(
# Each subtask involves manipulation with respect to a single object frame. # Each subtask involves manipulation with respect to a single object frame.
object_ref="cube_2", object_ref="cube_2",
...@@ -126,3 +125,4 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg): ...@@ -126,3 +125,4 @@ class FrankaCubeStackIKRelMimicEnvCfg(FrankaCubeStackEnvCfg, MimicEnvCfg):
apply_noise_during_interpolation=False, apply_noise_during_interpolation=False,
) )
) )
self.subtask_configs["franka"] = subtask_configs
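With this change the subtask list is stored under an end-effector name, and the single-arm environment code reads it back by taking the first key or by unpacking a one-element view. The sketch below illustrates that pattern with a hypothetical `SubTaskSketch` stand-in, not the real `SubTaskConfig`, and shows that the one-element unpack fails as soon as a second end effector is added.

```python
from dataclasses import dataclass


@dataclass
class SubTaskSketch:
    """Hypothetical stand-in for SubTaskConfig, keeping only the object reference."""

    object_ref: str


# Subtasks keyed by end-effector name, mirroring the object_ref order in the diff.
subtask_configs = {
    "franka": [
        SubTaskSketch("cube_2"),
        SubTaskSketch("cube_1"),
        SubTaskSketch("cube_3"),
        SubTaskSketch("cube_2"),
    ]
}

# Single-eef access patterns used by the environment wrapper in the diff.
eef_name = list(subtask_configs.keys())[0]
(only_list,) = subtask_configs.values()


def unpack_single(d):
    """Unpack the only value of a dict; raises ValueError if there is not exactly one."""
    (value,) = d.values()
    return value
```

The tuple unpack doubles as a cheap guard: code written for one end effector raises immediately instead of silently ignoring a second one.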