Fixes SB3's template ppo cfg up to date with security-safe syntax for training... (6131a573) · Commits · kevin / KincoActuatorIsaacLab

Commit 6131a573 (unverified), authored Oct 14, 2025 by ooctipus, committed by GitHub Oct 14, 2025


Fixes SB3's template ppo cfg up to date with security-safe syntax for training specification (#3688)

# Description

This PR fixes a bug where a project generated from the template with SB3
does not run, because `policy_kwargs` is given as a string that can no
longer be parsed:
```
policy_kwargs: "dict(
                  activation_fn=nn.ELU,
                  net_arch=[32, 32],
                  squash_output=False,
                )"
```

We have disabled string parsing because it is not safe (i.e., arbitrary
code could be executed).
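
As a rough illustration of the risk (not Isaac Lab's actual parsing code, and assuming the old string form was evaluated as Python), the sketch below contrasts `eval`, which executes whatever expression the string contains, with `ast.literal_eval`, which accepts only literals and rejects the `dict(...)` call:

```python
import ast

# The kind of string the old template placed under policy_kwargs (simplified).
cfg_string = "dict(net_arch=[32, 32], squash_output=False)"

# eval() runs the string as a Python expression, so any call it contains executes.
evaluated = eval(cfg_string)

# ast.literal_eval() only accepts literals; a function call such as dict(...) is rejected.
try:
    ast.literal_eval(cfg_string)
    literal_ok = True
except ValueError:
    literal_ok = False
```

Because the string is a function call rather than a literal, any parser that accepts it has to execute code, which is why a structured mapping is the safer form.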

This PR makes sure SB3's template also adopts the new secure syntax:

```
policy_kwargs:
  activation_fn: nn.ELU
  net_arch: [32, 32]
  squash_output: False
```
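
The source does not show how the loader turns a name like `nn.ELU` into a class. One plausible approach, sketched here under that assumption, is an allowlist lookup instead of evaluation. `resolve_policy_kwargs`, `ALLOWED_ACTIVATIONS`, and the stand-in `ELU`/`ReLU` classes are hypothetical names for illustration only:

```python
# Hypothetical stand-ins for torch.nn.ELU / torch.nn.ReLU so the sketch runs without PyTorch.
class ELU:
    pass


class ReLU:
    pass


# Allowlist of names a config may reference (an assumption, not Isaac Lab's actual table).
ALLOWED_ACTIVATIONS = {"nn.ELU": ELU, "nn.ReLU": ReLU}


def resolve_policy_kwargs(cfg):
    """Return a copy of cfg with activation_fn looked up in the allowlist."""
    resolved = dict(cfg)
    name = resolved.get("activation_fn")
    if isinstance(name, str):
        if name not in ALLOWED_ACTIVATIONS:
            raise ValueError(f"unsupported activation_fn: {name!r}")
        resolved["activation_fn"] = ALLOWED_ACTIVATIONS[name]
    return resolved


# The mapping a YAML loader would produce from the structured syntax above.
policy_kwargs = {"activation_fn": "nn.ELU", "net_arch": [32, 32], "squash_output": False}
resolved = resolve_policy_kwargs(policy_kwargs)
```

With this design an unknown or malicious name fails with a `ValueError` rather than being executed.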

## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there


parent f52aa980


@@ -11,11 +11,10 @@ n_epochs: 20
 ent_coef: 0.01
 learning_rate: !!float 3e-4
 clip_range: !!float 0.2
-policy_kwargs: "dict(
-                  activation_fn=nn.ELU,
-                  net_arch=[32, 32],
-                  squash_output=False,
-                )"
+policy_kwargs:
+  activation_fn: nn.ELU
+  net_arch: [32, 32]
+  squash_output: False
 vf_coef: 1.0
 max_grad_norm: 1.0
 device: "cuda:0"
