Commit de76c2e9 authored by Nicola Loi, committed by GitHub

Fixes action reset of pre_trained_policy_action (#1623)

# Description

Currently, the
[PreTrainedPolicyAction](https://github.com/isaac-sim/IsaacLab/blob/v1.4.0/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/navigation/mdp/pre_trained_policy_action.py#L24)
class does not reset the actions in the low-level observations when a
new episode starts.

In my custom legged robot navigation task, the behavior was correct only
during the first training episode and failed from the second episode
onward. At the start of a new episode, the action observations are not
reset and still hold the last actions from the previous episode. This can
harm training, as it did in my case, because the actions at the end of an
episode differ significantly from those required at the beginning of the
next one.

This PR resolves the issue by resetting the low-level action
observations at the beginning of each new episode.
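
The behavior described above can be illustrated with a minimal sketch. It assumes plain Python lists in place of the `torch` tensors used by the real class, and the `LowLevelActionBuffer` class and its method names are hypothetical, introduced only for illustration. Only the `episode_length_buf` attribute and the `hasattr` guard mirror the actual change.

```python
from types import SimpleNamespace


class LowLevelActionBuffer:
    """Hypothetical toy stand-in: one row of low-level actions per environment."""

    def __init__(self, env, num_envs, action_dim):
        self.env = env
        self.low_level_actions = [[0.0] * action_dim for _ in range(num_envs)]

    def last_action_stale(self):
        # Before the fix: the buffer is returned as-is, so a freshly reset
        # environment still observes the previous episode's last actions.
        return self.low_level_actions

    def last_action(self):
        # After the fix: zero the rows of environments whose episode just
        # restarted (step counter equal to 0), then return the buffer.
        if hasattr(self.env, "episode_length_buf"):
            for i, steps in enumerate(self.env.episode_length_buf):
                if steps == 0:
                    width = len(self.low_level_actions[i])
                    self.low_level_actions[i] = [0.0] * width
        return self.low_level_actions


# Environment 0 has just been reset; environment 1 is mid-episode.
env = SimpleNamespace(episode_length_buf=[0, 7])
term = LowLevelActionBuffer(env, num_envs=2, action_dim=3)
term.low_level_actions = [[0.9, -0.4, 0.2], [0.1, 0.3, -0.5]]

stale = [row[:] for row in term.last_action_stale()]
fresh = [row[:] for row in term.last_action()]
```

Here `stale` still carries the old actions for environment 0, while `fresh` has that row zeroed and leaves environment 1 untouched. An environment object without `episode_length_buf` skips the reset entirely.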


## Type of change


- Bug fix (non-breaking change which fixes an issue)


## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

parent 43a3ce9a
......@@ -66,6 +66,7 @@ Guidelines for modifications:
* Michael Gussert
* Michael Noseworthy
* Muhong Guo
* Nicola Loi
* Nuralem Abizov
* Oyindamola Omotuyi
* Özhan Özen
......
[package]
# Note: Semantic Versioning is used: https://semver.org/
-version = "0.10.17"
+version = "0.10.18"
# Description
title = "Isaac Lab Environments"
......
Changelog
---------
0.10.18 (2025-01-03)
~~~~~~~~~~~~~~~~~~~~
Fixed
^^^^^
* Fixed the reset of the low-level action observations in :class:`omni.isaac.lab_tasks.manager_based.navigation.mdp.PreTrainedPolicyAction`, which previously kept the last actions of the previous episode.
0.10.17 (2024-12-17)
~~~~~~~~~~~~~~~~~~~~
......
......@@ -50,8 +50,14 @@ class PreTrainedPolicyAction(ActionTerm):
         self._low_level_action_term: ActionTerm = cfg.low_level_actions.class_type(cfg.low_level_actions, env)
         self.low_level_actions = torch.zeros(self.num_envs, self._low_level_action_term.action_dim, device=self.device)

+        def last_action():
+            # reset the low level actions if the episode was reset
+            if hasattr(env, "episode_length_buf"):
+                self.low_level_actions[env.episode_length_buf == 0, :] = 0
+            return self.low_level_actions
+
         # remap some of the low level observations to internal observations
-        cfg.low_level_observations.actions.func = lambda dummy_env: self.low_level_actions
+        cfg.low_level_observations.actions.func = lambda dummy_env: last_action()
         cfg.low_level_observations.actions.params = dict()
         cfg.low_level_observations.velocity_commands.func = lambda dummy_env: self._raw_actions
         cfg.low_level_observations.velocity_commands.params = dict()
......