Fixes Termination Manager logging to report aggregated percentage of...

Fixes Termination Manager logging to report aggregated percentage of environments done due to each term. (#3107) # Description Currently Termination Manager write current step's done count for each term if reset is detected. This leads to two problem. 1. User sees different counts just by varying num_envs 2. the count can be over-count or under-count depending on when reset happens, as pointed out in #2977 (Thanks, @Kyu3224) The cause of the bug is because we are reporting current step status into a buffer that suppose to record episodic done. So instead of write the entire buffer base on current value, we ask the update to respect the non-reseting environment's old value, and instead of reporting count, we report percentage of environment that was done due to the particular term. Test on Isaac-Velocity-Rough-Anymal-C-v0 Before fix: <img width="786" height="323" alt="Screenshot from 2025-08-06 22-16-20" src="https://github.com/user-attachments/assets/4838d612-7f0e-4232-a07e-688b547e91db" /> Red: num_envs = 4096, Orange: num_envs = 1024 After fix: <img width="786" height="323" alt="Screenshot from 2025-08-06 22-16-12" src="https://github.com/user-attachments/assets/e6e55c21-17ed-42ca-8d94-a19d08611f86" /> Red: num_envs = 4096, Orange: num_envs = 1024 Note that curve of the same color ran on same seed, and curves matched exactly, the only difference is the data gets reported in termination. The percentage version is a lot more clear in conveying how agent currently fails, and how much percentage of agent fails, and shows that increasing num_envs to 4096 helps improve agent avoiding termination by `base_contact` much quicker than num_envs=1024. Such message is a bit hard to tell in first image.  ## Checklist - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file - [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

Fixes Termination Manager logging to report aggregated percentage of...
Fixes Termination Manager logging to report aggregated percentage of environments done due to each term. (#3107) # Description Currently Termination Manager write current step's done count for each term if reset is detected. This leads to two problem. 1. User sees different counts just by varying num_envs 2. the count can be over-count or under-count depending on when reset happens, as pointed out in #2977 (Thanks, @Kyu3224) The cause of the bug is because we are reporting current step status into a buffer that suppose to record episodic done. So instead of write the entire buffer base on current value, we ask the update to respect the non-reseting environment's old value, and instead of reporting count, we report percentage of environment that was done due to the particular term. Test on Isaac-Velocity-Rough-Anymal-C-v0 Before fix: <img width="786" height="323" alt="Screenshot from 2025-08-06 22-16-20" src="https://github.com/user-attachments/assets/4838d612-7f0e-4232-a07e-688b547e91db" /> Red: num_envs = 4096, Orange: num_envs = 1024 After fix: <img width="786" height="323" alt="Screenshot from 2025-08-06 22-16-12" src="https://github.com/user-attachments/assets/e6e55c21-17ed-42ca-8d94-a19d08611f86" /> Red: num_envs = 4096, Orange: num_envs = 1024 Note that curve of the same color ran on same seed, and curves matched exactly, the only difference is the data gets reported in termination. The percentage version is a lot more clear in conveying how agent currently fails, and how much percentage of agent fails, and shows that increasing num_envs to 4096 helps improve agent avoiding termination by `base_contact` much quicker than num_envs=1024. Such message is a bit hard to tell in first image.  ## Checklist - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [ ] I have made corresponding changes to the documentation - [ ] My changes generate no new warnings - [ ] I have added tests that prove my fix is effective or that my feature works - [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file - [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there
8dabd3f1 · ooctipus · GitHub · f70e695f · 8dabd3f1 · 8dabd3f1
Unverified Commit 8dabd3f1 authored Aug 14, 2025 by ooctipus Committed by GitHub Aug 14, 2025
Showing with 27 additions and 13 deletions

extension.toml source/isaaclab/config/extension.toml +1 -1

CHANGELOG.rst source/isaaclab/docs/CHANGELOG.rst +11 -0

termination_manager.py source/isaaclab/isaaclab/managers/termination_manager.py +15 -12

No files found.
--- a/source/isaaclab/config/extension.toml
+++ b/source/isaaclab/config/extension.toml
 [package]

 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.44.9"
+version = "0.44.10"

 # Description
 title = "Isaac Lab framework for Robot Learning"

--- a/source/isaaclab/docs/CHANGELOG.rst
+++ b/source/isaaclab/docs/CHANGELOG.rst
 Changelog
 ---------

+0.44.10 (2025-08-06)
+~~~~~~~~~~~~~~~~~~~
+
+Fixed
+^^^^^
+
+* The old termination manager in :class:`~isaaclab.managers.TerminationManager` term_done logging logs the instantaneous
+term done count at reset. This let to inaccurate aggregation of termination count, obscuring the what really happening
+during the traing. Instead we log the episodic term done.
+
+
 0.44.9 (2025-07-30)
 ~~~~~~~~~~~~~~~~~~~


--- a/source/isaaclab/isaaclab/managers/termination_manager.py
+++ b/source/isaaclab/isaaclab/managers/termination_manager.py
@@ -60,10 +60,9 @@ class TerminationManager(ManagerBase):

        # call the base class constructor (this will parse the terms config)
        super().__init__(cfg, env)
+        self._term_name_to_term_idx = {name: i for i, name in enumerate(self._term_names)}
        # prepare extra info to store individual termination term information
-        self._term_dones = dict()
-        for term_name in self._term_names:
-            self._term_dones[term_name] = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
+        self._term_dones = torch.zeros((self.num_envs, len(self._term_names)), device=self.device, dtype=torch.bool)
        # create buffer for managing termination per environment
        self._truncated_buf = torch.zeros(self.num_envs, device=self.device, dtype=torch.bool)
        self._terminated_buf = torch.zeros_like(self._truncated_buf)
@@ -139,9 +138,10 @@ class TerminationManager(ManagerBase):
            env_ids = slice(None)
        # add to episode dict
        extras = {}
-        for key in self._term_dones.keys():
+        last_episode_done_stats = self._term_dones.float().mean(dim=0)
+        for i, key in enumerate(self._term_names):
            # store information
-            extras["Episode_Termination/" + key] = torch.count_nonzero(self._term_dones[key][env_ids]).item()
+            extras["Episode_Termination/" + key] = last_episode_done_stats[i].item()
        # reset all the reward terms
        for term_cfg in self._class_term_cfgs:
            term_cfg.func.reset(env_ids=env_ids)
@@ -161,7 +161,7 @@ class TerminationManager(ManagerBase):
        self._truncated_buf[:] = False
        self._terminated_buf[:] = False
        # iterate over all the termination terms
-        for name, term_cfg in zip(self._term_names, self._term_cfgs):
+        for i, term_cfg in enumerate(self._term_cfgs):
            value = term_cfg.func(self._env, **term_cfg.params)
            # store timeout signal separately
            if term_cfg.time_out:
@@ -169,7 +169,10 @@ class TerminationManager(ManagerBase):
            else:
                self._terminated_buf |= value
            # add to episode dones
-            self._term_dones[name][:] = value
+            rows = value.nonzero(as_tuple=True)[0]  # indexing is cheaper than boolean advance indexing
+            if rows.numel() > 0:
+                self._term_dones[rows] = False
+                self._term_dones[rows, i] = True
        # return combined termination signal
        return self._truncated_buf | self._terminated_buf

@@ -182,7 +185,7 @@ class TerminationManager(ManagerBase):
        Returns:
            The corresponding termination term value. Shape is (num_envs,).
        """
-        return self._term_dones[name]
+        return self._term_dones[:, self._term_name_to_term_idx[name]]

    def get_active_iterable_terms(self, env_idx: int) -> Sequence[tuple[str, Sequence[float]]]:
        """Returns the active terms as iterable sequence of tuples.
@@ -196,8 +199,8 @@ class TerminationManager(ManagerBase):
            The active terms.
        """
        terms = []
-        for key in self._term_dones.keys():
-            terms.append((key, [self._term_dones[key][env_idx].float().cpu().item()]))
+        for i, key in enumerate(self._term_names):
+            terms.append((key, [self._term_dones[env_idx, i].float().cpu().item()]))
        return terms

    """
@@ -217,7 +220,7 @@ class TerminationManager(ManagerBase):
        if term_name not in self._term_names:
            raise ValueError(f"Termination term '{term_name}' not found.")
        # set the configuration
-        self._term_cfgs[self._term_names.index(term_name)] = cfg
+        self._term_cfgs[self._term_name_to_term_idx[term_name]] = cfg

    def get_term_cfg(self, term_name: str) -> TerminationTermCfg:
        """Gets the configuration for the specified term.
@@ -234,7 +237,7 @@ class TerminationManager(ManagerBase):
        if term_name not in self._term_names:
            raise ValueError(f"Termination term '{term_name}' not found.")
        # return the configuration
-        return self._term_cfgs[self._term_names.index(term_name)]
+        return self._term_cfgs[self._term_name_to_term_idx[term_name]]

    """
    Helper functions.