Commit 6131a573, authored by ooctipus, committed by GitHub

Brings SB3's template PPO cfg up to date with the security-safe syntax for training specification (#3688)

# Description

This PR fixes a bug where a template generated with SB3 does not run,
because `policy_kwargs` can no longer be parsed from a string:
```
policy_kwargs: "dict(
                  activation_fn=nn.ELU,
                  net_arch=[32, 32],
                  squash_output=False,
                )"
```

We have disabled string parsing because it is not safe (a string can
carry arbitrary code).
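
To illustrate the risk, here is a minimal sketch. It assumes the string form was turned into a dict with something like `eval()`; the PR does not say which mechanism was used. The `side_effect` helper and the `untrusted` string are hypothetical.

```python
import ast

calls = []


def side_effect():
    """Stand-in for arbitrary code hidden in a config string."""
    calls.append("ran")
    return 32


# Hypothetical config value written in the old string style.
untrusted = "dict(net_arch=[side_effect(), 32])"

# eval() builds the dict, but it also runs whatever the string calls.
result = eval(untrusted, {"side_effect": side_effect})

# ast.literal_eval() only accepts literals, so the same string is rejected.
try:
    ast.literal_eval(untrusted)
    rejected = False
except ValueError:
    rejected = True
```

After this runs, `calls` is no longer empty: building the config executed code chosen by whoever wrote the string.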

This PR makes sure SB3's template also adopts the new secure syntax:

```
policy_kwargs:
  activation_fn: nn.ELU
  net_arch: [32, 32]
  squash_output: False
```
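
With the nested mapping, a YAML loader returns plain values only, so `nn.ELU` arrives as a string that still has to be resolved to a class. The sketch below shows one way to do that without evaluating anything. It is not necessarily how Isaac Lab resolves it. `ALLOWED_ACTIVATIONS`, `resolve_policy_kwargs`, and the stand-in `ELU`/`ReLU` classes are hypothetical, used so the example runs without PyTorch.

```python
from typing import Any, Mapping


class ELU:
    """Stand-in for torch.nn.ELU so the sketch runs without PyTorch."""


class ReLU:
    """Stand-in for torch.nn.ReLU."""


# Hypothetical allowlist: the only strings that may be turned into objects.
ALLOWED_ACTIVATIONS: dict[str, type] = {"nn.ELU": ELU, "nn.ReLU": ReLU}


def resolve_policy_kwargs(raw: Mapping[str, Any]) -> dict[str, Any]:
    """Return a copy of ``raw`` with ``activation_fn`` mapped to a class.

    Nothing is evaluated: an unknown name raises instead of being executed.
    """
    kwargs = dict(raw)
    name = kwargs.get("activation_fn")
    if isinstance(name, str):
        if name not in ALLOWED_ACTIVATIONS:
            raise ValueError(f"unsupported activation_fn: {name!r}")
        kwargs["activation_fn"] = ALLOWED_ACTIVATIONS[name]
    return kwargs


# What a YAML loader would produce for the nested-mapping form above.
parsed = {"activation_fn": "nn.ELU", "net_arch": [32, 32], "squash_output": False}
policy_kwargs = resolve_policy_kwargs(parsed)
```

A lookup table keeps the set of constructible objects explicit: a string such as `__import__('os').system(...)` is rejected with a `ValueError` rather than run.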

## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

parent f52aa980
@@ -11,11 +11,10 @@ n_epochs: 20
 ent_coef: 0.01
 learning_rate: !!float 3e-4
 clip_range: !!float 0.2
-policy_kwargs: "dict(
-                  activation_fn=nn.ELU,
-                  net_arch=[32, 32],
-                  squash_output=False,
-                )"
+policy_kwargs:
+  activation_fn: nn.ELU
+  net_arch: [32, 32]
+  squash_output: False
 vf_coef: 1.0
 max_grad_norm: 1.0
 device: "cuda:0"