Adds new curriculum MDP terms that allow modification of any environment parameter (#2777)
# Description
This PR adds two curriculum MDP terms, `modify_term_cfg` and
`modify_env_param`, that can change any parameter of an env instance.
`modify_env_param` is the more general version: it can override any value
that belongs to the env, but it requires the user to know the full path to
that value.
`modify_term_cfg` only works with manager terms, but it is the more
user-friendly version because it simplifies path specification. For example,
instead of writing "observation_manager.cfg.policy.joint_pos.noise", you
write "observations.policy.joint_pos.noise", which is consistent with the
Hydra override style.
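The shorthand can be pictured as a small rewrite of the first path segment. The sketch below is only an illustration, assuming the first segment maps to a manager attribute and that `cfg` is inserted right after it. `MANAGER_ALIASES` and `expand_term_address` are hypothetical names, not part of the Isaac Lab API, and the actual resolution logic in the PR may differ.

```python
# Hypothetical mapping from config-style group names to manager attributes.
MANAGER_ALIASES = {
    "observations": "observation_manager",
    "commands": "command_manager",
    "events": "event_manager",
    "rewards": "reward_manager",
}


def expand_term_address(address: str) -> str:
    """Expand a short term address into a full env attribute path (illustrative only)."""
    head, _, rest = address.partition(".")
    if head not in MANAGER_ALIASES:
        raise KeyError(f"Unknown manager group: {head!r}")
    return f"{MANAGER_ALIASES[head]}.cfg.{rest}"


full_path = expand_term_address("observations.policy.joint_pos.noise")
print(full_path)
```

Under these assumptions the short address expands to the long form quoted above, which is the path a user would otherwise have to spell out by hand.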
Besides the path to the value, `modify_fn` and `modify_params` are also
required; they tell the term how to modify the value.
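As a rough mental model of how a term might combine these three inputs, here is a plain-Python sketch. `apply_modification`, the local `NO_CHANGE` sentinel, `scale_after`, and the `SimpleNamespace` stand-in for the env are all hypothetical and exist only for illustration; the calling convention `modify_fn(env, env_ids, data, **modify_params)` is an assumption inferred from the function signatures in the demos.

```python
from functools import reduce
from types import SimpleNamespace

NO_CHANGE = object()  # hypothetical stand-in for the "leave the value alone" sentinel


def apply_modification(env, env_ids, address, modify_fn, modify_params):
    """Read the value at a dotted address, pass it to modify_fn, and write back the result."""
    *parents, leaf = address.split(".")
    owner = reduce(getattr, parents, env)  # walk attributes down to the owning object
    current = getattr(owner, leaf)
    new_val = modify_fn(env, env_ids, current, **modify_params)
    if new_val is not NO_CHANGE:
        setattr(owner, leaf, new_val)
    return getattr(owner, leaf)


def scale_after(env, env_ids, data, factor, num_steps):
    # Only modify the value once the step counter passes the threshold.
    return data * factor if env.common_step_counter > num_steps else NO_CHANGE


# Toy env with a single nested parameter.
env = SimpleNamespace(common_step_counter=0, noise=SimpleNamespace(n_min=-0.1))
params = {"factor": 2.0, "num_steps": 100}
before = apply_modification(env, None, "noise.n_min", scale_after, params)
env.common_step_counter = 101
after = apply_modification(env, None, "noise.n_min", scale_after, params)
print(before, after)
```

The sentinel lets a `modify_fn` decline to change anything on a given call, which keeps "no update yet" distinct from legitimately writing a value such as `None` or `0`.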
Demo 1: difficulty-adaptive modification for all Python native data types
```
import torch
from isaaclab.envs import ManagerBasedRLEnv


# iv -> initial value, fv -> final value
def initial_final_interpolate_fn(env: ManagerBasedRLEnv, env_id, data, iv, fv, get_fraction):
    iv_, fv_ = torch.tensor(iv, device=env.device), torch.tensor(fv, device=env.device)
    # get_fraction is a Python expression string, evaluated here with `env` in scope
    fraction = eval(get_fraction)
    new_val = fraction * (fv_ - iv_) + iv_
    # return the same Python type as the value currently stored at the address
    if isinstance(data, float):
        return new_val.item()
    elif isinstance(data, int):
        return int(new_val.item())
    elif isinstance(data, (tuple, list)):
        raw = new_val.tolist()
        # assume iv is sequence of all ints or all floats:
        is_int = isinstance(iv[0], int)
        casted = [int(x) if is_int else float(x) for x in raw]
        return tuple(casted) if isinstance(data, tuple) else casted
    else:
        raise TypeError(f"Does not support the type {type(data)}")
```
(float)
```
from isaaclab.managers import CurriculumTermCfg as CurrTerm
import isaaclab.envs.mdp as mdp

joint_pos_unoise_min_adr = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "observations.policy.joint_pos.noise.n_min",
        "modify_fn": initial_final_interpolate_fn,
        "modify_params": {
            "iv": 0.0,
            "fv": -0.1,
            "get_fraction": "env.command_manager.get_command('difficulty')",
        },
    },
)
```
(tuple or list)
```
command_object_pose_xrange_adr = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "commands.object_pose.ranges.pos_x",
        "modify_fn": initial_final_interpolate_fn,
        "modify_params": {
            "iv": (-0.5, -0.5),
            "fv": (-0.75, -0.25),
            "get_fraction": "env.command_manager.get_command('difficulty')",
        },
    },
)
```
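To make the effect of the two terms above concrete, the interpolation `new_val = fraction * (fv - iv) + iv` can be evaluated by hand. The sketch below is a torch-free re-implementation for illustration only: `lerp` is a hypothetical helper, and the fraction of 0.5 is an assumed difficulty value rather than anything read from a real command manager.

```python
def lerp(iv, fv, fraction):
    """Linearly interpolate between iv and fv; supports scalars and equal-length sequences."""
    if isinstance(iv, (tuple, list)):
        # Preserve the container type (tuple or list) of the initial value.
        return type(iv)(fraction * (f - i) + i for i, f in zip(iv, fv))
    return fraction * (fv - iv) + iv


# Same iv/fv pairs as the two curriculum terms above, at an assumed difficulty of 0.5.
n_min_mid = lerp(0.0, -0.1, 0.5)
pos_x_mid = lerp((-0.5, -0.5), (-0.75, -0.25), 0.5)
print(n_min_mid, pos_x_mid)
```

At a fraction of 0 both parameters stay at their initial values, and at 1 they reach the final values, so the noise bound and the command range widen gradually as difficulty rises.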
Demo 3: overriding an entire term based on the env step counter rather than adaptively
```
def value_override(env: ManagerBasedRLEnv, env_id, data, new_val, num_steps):
    # swap in new_val once the common step counter passes num_steps
    if env.common_step_counter > num_steps:
        return new_val
    return mdp.modify_term_cfg.NO_CHANGE


object_pos_curriculum = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "commands.object_pose",
        "modify_fn": value_override,
        # modify_params keys correspond to the extra arguments of value_override
        "modify_params": {"new_val": <new_observation_term>, "num_steps": 120000},
    },
)
```
Demo 4: overriding a Tensor field within some arbitrary class not visible
from term_cfg
(note that the 'address' is not as concise as with mdp.modify_term_cfg)
```
import isaaclab.utils.math as math_utils


def resample_bucket_range(
    env: ManagerBasedRLEnv, env_id, data, static_friction_range, dynamic_friction_range, restitution_range, num_steps
):
    if env.common_step_counter > num_steps:
        range_list = [static_friction_range, dynamic_friction_range, restitution_range]
        ranges = torch.tensor(range_list, device="cpu")
        # draw one (static friction, dynamic friction, restitution) triple per existing bucket
        new_buckets = math_utils.sample_uniform(ranges[:, 0], ranges[:, 1], (len(data), 3), device="cpu")
        return new_buckets
    return mdp.modify_env_param.NO_CHANGE


object_physics_material_curriculum = CurrTerm(
    func=mdp.modify_env_param,
    params={
        "address": "event_manager.cfg.object_physics_material.func.material_buckets",
        "modify_fn": resample_bucket_range,
        "modify_params": {
            "static_friction_range": [0.5, 1.0],
            "dynamic_friction_range": [0.3, 1.0],
            "restitution_range": [0.0, 0.5],
            "num_steps": 120000,
        },
    },
)
```
## Type of change
- New feature (non-breaking change which adds functionality)
## Checklist
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
---------
Signed-off-by: ooctipus <zhengyuz@nvidia.com>
Signed-off-by: Kelly Guo <kellyg@nvidia.com>
Co-authored-by: Kelly Guo <kellyg@nvidia.com>