Implements deterministic evaluation for skrl's multi-agent algorithms (#1972)

## Description

Implement deterministic evaluation for skrl's multi-agent algorithms in the `play.py` script (fixes https://github.com/isaac-sim/IsaacLab/issues/1935). The current implementation only takes single-agent algorithms into account.

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

Commit 1de1b432, authored Feb 28, 2025 by Toni-SM, committed by GitHub Feb 28, 2025

Showing 1 changed file with 6 additions and 1 deletion

scripts/reinforcement_learning/skrl/play.py +6 -1

--- a/scripts/reinforcement_learning/skrl/play.py
+++ b/scripts/reinforcement_learning/skrl/play.py
@@ -179,6 +179,11 @@ def main():
        with torch.inference_mode():
            # agent stepping
            outputs = runner.agent.act(obs, timestep=0, timesteps=0)
+            # - multi-agent (deterministic) actions
+            if hasattr(env, "possible_agents"):
+                actions = {a: outputs[-1][a].get("mean_actions", outputs[0][a]) for a in env.possible_agents}
+            # - single-agent (deterministic) actions
+            else:
-            actions = outputs[-1].get("mean_actions", outputs[0])
+                actions = outputs[-1].get("mean_actions", outputs[0])
            # env stepping
            obs, _, _, _, _ = env.step(actions)
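
The branching added in this diff can be exercised without a simulator. The sketch below is an illustration only: `select_deterministic_actions` is a hypothetical helper (not part of `play.py` or skrl), the environments are plain stand-in objects, and floats replace tensors. It assumes, as the diff suggests, that the first element of `outputs` holds the sampled actions and the last element holds a dictionary that may contain `"mean_actions"`, with both keyed by agent name in the multi-agent case.

```python
from types import SimpleNamespace


def select_deterministic_actions(outputs, env):
    """Hypothetical helper mirroring the branch added in the diff above.

    Assumption: outputs[0] holds sampled actions and outputs[-1] holds a dict
    that may contain "mean_actions"; multi-agent outputs are keyed by agent name.
    """
    if hasattr(env, "possible_agents"):
        # multi-agent: prefer each agent's mean action, else fall back to its sample
        return {a: outputs[-1][a].get("mean_actions", outputs[0][a]) for a in env.possible_agents}
    # single-agent: same lookup without the per-agent indexing
    return outputs[-1].get("mean_actions", outputs[0])


# Stand-in environments: only the presence of `possible_agents` matters here.
single_env = SimpleNamespace()
multi_env = SimpleNamespace(possible_agents=["robot_a", "robot_b"])

# Stand-in agent outputs (floats instead of tensors).
single_outputs = (0.7, None, {"mean_actions": 0.5})
multi_outputs = (
    {"robot_a": 0.7, "robot_b": -0.2},
    None,
    {"robot_a": {"mean_actions": 0.5}, "robot_b": {}},  # robot_b exposes no mean
)

single_actions = select_deterministic_actions(single_outputs, single_env)
multi_actions = select_deterministic_actions(multi_outputs, multi_env)
print(single_actions)  # 0.5
print(multi_actions)   # {'robot_a': 0.5, 'robot_b': -0.2}
```

The fallback to `outputs[0]` keeps the lookup working for agents whose output dictionary has no `"mean_actions"` entry; in that case the sampled action is used as is.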