    Adds new curriculum mdp that allows modification on any environment parameters (#2777) · cee5027b
    ooctipus authored
    # Description
    
    This PR adds two curriculum MDP terms, `modify_term_cfg` and
    `modify_env_param`, that can change any parameter of an env instance.
    
    `modify_env_param` is the more general version: it can override any value
    that belongs to the env, but requires the user to know the full path to that value.
    
    `modify_term_cfg` only works with manager terms, but is more user
    friendly because it simplifies path specification. For example, instead
    of writing "observation_manager.cfg.policy.joint_pos.noise", you
    write "observations.policy.joint_pos.noise", consistent with the Hydra
    override style.
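
As an illustration of the path shorthand, the sketch below expands a Hydra-style address into a full env path. It is not the code from this PR: `expand_term_address` and the `_GROUP_TO_MANAGER` table are hypothetical, and only the `observations` case is confirmed by the description above; the other entries are assumed to follow the same pattern.

```python
# Hypothetical mapping from a Hydra-style config group to the manager
# attribute on the env. Only "observations" is confirmed by the PR text;
# the remaining entries are assumptions that follow the same pattern.
_GROUP_TO_MANAGER = {
    "observations": "observation_manager",
    "commands": "command_manager",
    "events": "event_manager",
    "rewards": "reward_manager",
    "terminations": "termination_manager",
}


def expand_term_address(address: str) -> str:
    """Expand 'observations.policy.x' into 'observation_manager.cfg.policy.x'."""
    group, _, rest = address.partition(".")
    if group not in _GROUP_TO_MANAGER:
        raise ValueError(f"Unknown config group: {group!r}")
    if not rest:
        raise ValueError("Address must name a term, e.g. 'observations.policy.joint_pos'")
    return f"{_GROUP_TO_MANAGER[group]}.cfg.{rest}"


full_path = expand_term_address("observations.policy.joint_pos.noise")
print(full_path)
```

With this shape, a user only needs to know the config group and term name, not how the managers are laid out on the env.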
    
    Besides the path to the value, `modify_fn` and `modify_params` are also
    required; they tell the term how to modify the value.
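
The contract between the address, `modify_fn`, and `modify_params` can be sketched as follows. This is a minimal, self-contained illustration and not Isaac Lab's implementation: `apply_modification`, `scale_after`, and the `SimpleNamespace` stand-in for the env are hypothetical, and `NO_CHANGE` stands in for the sentinel used in the demos (`mdp.modify_env_param.NO_CHANGE`). The sketch handles attribute access only.

```python
from types import SimpleNamespace

# Stand-in for the sentinel a modify_fn returns to leave the value untouched.
NO_CHANGE = object()


def apply_modification(env, env_ids, address, modify_fn, modify_params):
    """Walk a dotted attribute path, call modify_fn, and write the result back."""
    *parents, leaf = address.split(".")
    container = env
    for name in parents:
        container = getattr(container, name)
    current = getattr(container, leaf)
    # modify_fn receives the current value plus the user-supplied parameters.
    new_value = modify_fn(env, env_ids, current, **modify_params)
    if new_value is not NO_CHANGE:
        setattr(container, leaf, new_value)
    return getattr(container, leaf)


def scale_after(env, env_ids, data, factor, num_steps):
    """Example modify_fn: scale the value once the step counter passes num_steps."""
    if env.common_step_counter > num_steps:
        return data * factor
    return NO_CHANGE


# Fake env with a step counter and one nested parameter.
env = SimpleNamespace(common_step_counter=0, noise=SimpleNamespace(n_min=-0.1))
params = {"factor": 2.0, "num_steps": 10}

before = apply_modification(env, None, "noise.n_min", scale_after, params)
env.common_step_counter = 11
after = apply_modification(env, None, "noise.n_min", scale_after, params)
print(before, after)
```

Returning a sentinel rather than `None` keeps `None` available as a legitimate new value.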
    
    
    
    Demo 1: difficulty-adaptive modification for all Python native data types
    ```
    import torch

    from isaaclab.envs import ManagerBasedRLEnv


    # iv -> initial value, fv -> final value
    def initial_final_interpolate_fn(env: ManagerBasedRLEnv, env_id, data, iv, fv, get_fraction):
        iv_, fv_ = torch.tensor(iv, device=env.device), torch.tensor(fv, device=env.device)
        # get_fraction is a Python expression string, evaluated with `env` in scope
        fraction = eval(get_fraction)
        # linear interpolation: fraction = 0 gives iv, fraction = 1 gives fv
        new_val = fraction * (fv_ - iv_) + iv_
        # cast the result back to the type of the current value `data`
        if isinstance(data, float):
            return new_val.item()
        elif isinstance(data, int):
            return int(new_val.item())
        elif isinstance(data, (tuple, list)):
            raw = new_val.tolist()
            # assume iv is sequence of all ints or all floats:
            is_int = isinstance(iv[0], int)
            casted = [int(x) if is_int else float(x) for x in raw]
            return tuple(casted) if isinstance(data, tuple) else casted
        else:
            raise TypeError(f"Does not support the type {type(data)}")
    ```
    (float)
    ```
    joint_pos_unoise_min_adr = CurrTerm(
        func=mdp.modify_term_cfg,
        params={
            "address": "observations.policy.joint_pos.noise.n_min",
            "modify_fn": initial_final_interpolate_fn,
            "modify_params": {"iv": 0., "fv": -.1, "get_fraction": "env.command_manager.get_command('difficulty')"}
        }
    )
    ```
    
    (tuple or list)
    ```
    command_object_pose_xrange_adr = CurrTerm(
        func=mdp.modify_term_cfg,
        params={
            "address": "commands.object_pose.ranges.pos_x",
            "modify_fn": initial_final_interpolate_fn,
            "modify_params": {"iv": (-.5, -.5), "fv": (-.75, -.25), "get_fraction": "env.command_manager.get_command('difficulty')"}
        }
    )
    ```
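
To make the arithmetic in Demo 1 concrete, here is a torch-free sketch of the same linear interpolation, `fraction * (fv - iv) + iv`, applied to the float and tuple values used above. The helper name `interpolate` is hypothetical, and the fixed `fraction` values are an assumption standing in for the "difficulty" command.

```python
def interpolate(iv, fv, fraction):
    """Linearly interpolate from iv (fraction = 0) to fv (fraction = 1)."""
    if isinstance(iv, (tuple, list)):
        # Interpolate element-wise and keep the container type of iv.
        return type(iv)(i + fraction * (f - i) for i, f in zip(iv, fv))
    return iv + fraction * (fv - iv)


# Float case: noise lower bound moves from 0.0 toward -0.1.
half_noise = interpolate(0.0, -0.1, 0.5)

# Tuple case: the pos_x range widens from (-0.5, -0.5) toward (-0.75, -0.25).
start_range = interpolate((-0.5, -0.5), (-0.75, -0.25), 0.0)
half_range = interpolate((-0.5, -0.5), (-0.75, -0.25), 0.5)
final_range = interpolate((-0.5, -0.5), (-0.75, -0.25), 1.0)

print(half_noise, start_range, half_range, final_range)
```

At a fraction of 0.5 the range is halfway between the initial and final bounds, so the command range widens gradually as difficulty increases.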
    
    Demo 3: overriding an entire term based on the env step counter rather than adaptively
    ```
    def value_override(env: ManagerBasedRLEnv, env_id, data, new_val, num_steps):
        if env.common_step_counter > num_steps:
            return new_val
        return mdp.modify_term_cfg.NO_CHANGE
    
    object_pos_curriculum = CurrTerm(
        func=mdp.modify_term_cfg,
        params={
            "address": "commands.object_pose",
            "modify_fn": value_override,
            # key must match the `num_steps` argument of value_override
            "modify_params": {"new_val": <new_observation_term>, "num_steps": 120000}
        }
    )
    ```
    
    Demo 4: overriding a Tensor field within some arbitrary class not visible
    from term_cfg
    (note that the 'address' is not as concise as with mdp.modify_term_cfg)
    ```
    def resample_bucket_range(env: ManagerBasedRLEnv, env_id, data, static_friction_range, dynamic_friction_range, restitution_range, num_steps):
        if env.common_step_counter > num_steps:
            range_list = [static_friction_range, dynamic_friction_range, restitution_range]
            ranges = torch.tensor(range_list, device="cpu")
            new_buckets = math_utils.sample_uniform(ranges[:, 0], ranges[:, 1], (len(data), 3), device="cpu")
            return new_buckets
        return mdp.modify_env_param.NO_CHANGE
    
    object_physics_material_curriculum = CurrTerm(
        func=mdp.modify_env_param,
        params={
            "address": "event_manager.cfg.object_physics_material.func.material_buckets",
            "modify_fn": resample_bucket_range,
            # key must match the `num_steps` argument of resample_bucket_range
            "modify_params": {"static_friction_range": [.5, 1.], "dynamic_friction_range": [.3, 1.], "restitution_range": [0.0, 0.5], "num_steps": 120000}
        }
    )
    ```
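
The resampling step in Demo 4 produces one row per bucket with three columns (static friction, dynamic friction, restitution), each drawn uniformly from its own range. The sketch below shows that shape with the standard library only; `sample_buckets` is a hypothetical stand-in for the `math_utils.sample_uniform` call above, and the bucket count of 4 is an arbitrary assumption.

```python
import random


def sample_buckets(num_buckets, ranges, seed=None):
    """Return num_buckets rows, each with one uniform sample per (low, high) range."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for low, high in ranges] for _ in range(num_buckets)]


# Same ranges as in Demo 4: static friction, dynamic friction, restitution.
ranges = [(0.5, 1.0), (0.3, 1.0), (0.0, 0.5)]
buckets = sample_buckets(4, ranges, seed=0)

for row in buckets:
    print(row)
```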
    
    
    ## Type of change
    
    - New feature (non-breaking change which adds functionality)
    
    
    ## Checklist
    
    - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
    `./isaaclab.sh --format`
    - [ ] I have made corresponding changes to the documentation
    - [x] My changes generate no new warnings
    - [x] I have added tests that prove my fix is effective or that my
    feature works
    - [x] I have updated the changelog and the corresponding version in the
    extension's `config/extension.toml` file
    - [x] I have added my name to the `CONTRIBUTORS.md` or my name already
    exists there
    
    ---------
    Signed-off-by: ooctipus <zhengyuz@nvidia.com>
    Signed-off-by: Kelly Guo <kellyg@nvidia.com>
    Co-authored-by: Kelly Guo <kellyg@nvidia.com>