Commit 6f013fb1, authored by ooctipus, committed by GitHub

Updates SB3 ppo cfg so it trains under reasonable amount of time (#3726)

# Description

This PR fixes the `sb3_ppo_cfg` for the task Isaac-Ant-v0.

The previous parameters were 4096 `num_envs`, a horizon (`n_steps`) of 512, a batch size of 128, and `n_epochs` of 20. That means a single training cycle has to loop (20 * 512 * 4096) / 128 = 327680 times, which makes training appear to hang forever.

The new config matches that of rl_games more closely.

I verified that it trains in under 5 minutes.
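
The arithmetic above can be checked with a short sketch. The helper name `updates_per_rollout` is hypothetical, and the figure for the new config assumes `num_envs` stays at 4096, which the PR does not state explicitly. The old and new hyperparameter values are taken from the diff in this commit.

```python
import math


def updates_per_rollout(num_envs: int, n_steps: int, batch_size: int, n_epochs: int) -> int:
    """Count minibatch gradient updates performed after one rollout.

    One rollout collects num_envs * n_steps samples; each epoch then
    sweeps over those samples in minibatches of batch_size.
    """
    rollout_size = num_envs * n_steps
    minibatches_per_epoch = math.ceil(rollout_size / batch_size)
    return n_epochs * minibatches_per_epoch


# Old config: values quoted in the PR description.
old_updates = updates_per_rollout(num_envs=4096, n_steps=512, batch_size=128, n_epochs=20)

# New config: values from the diff; num_envs=4096 is an assumption.
new_updates = updates_per_rollout(num_envs=4096, n_steps=16, batch_size=32768, n_epochs=4)

print(old_updates)  # 327680
print(new_updates)  # 8
```

Under these assumptions, the update loop per training cycle shrinks from 327680 iterations to 8, which is consistent with the reported change from an apparent hang to a run that finishes in minutes.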

[Screencast from 2025-10-15
13-56-21.webm](https://github.com/user-attachments/assets/2bc7bcd8-0063-46b9-adb0-67a6aa686732)

## Type of change


- Bug fix (non-breaking change which fixes an issue)


## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

parent 47780cf0
 [package]
 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.4.1"
+version = "0.4.2"
 # Description
 title = "Isaac Lab RL"
......
 Changelog
 ---------
+0.4.2 (2025-10-15)
+~~~~~~~~~~~~~~~~~~
+Fixed
+^^^^^
+* Isaac-Ant-v0's sb3_ppo_cfg default value, so it trains under reasonable amount of time.
 0.4.1 (2025-09-09)
 ~~~~~~~~~~~~~~~~~~
......
@@ -6,19 +6,19 @@
 # Reference: https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/hyperparams/ppo.yml#L161
 seed: 42
-n_timesteps: !!float 1e7
+n_timesteps: !!float 1e8
 policy: 'MlpPolicy'
-batch_size: 128
-n_steps: 512
+batch_size: 32768
+n_steps: 16
 gamma: 0.99
 gae_lambda: 0.9
-n_epochs: 20
+n_epochs: 4
 ent_coef: 0.0
 sde_sample_freq: 4
 max_grad_norm: 0.5
 vf_coef: 0.5
 learning_rate: !!float 3e-5
-use_sde: True
+use_sde: False
 clip_range: 0.4
 device: "cuda:0"
 policy_kwargs:
......
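
To put the changed `n_timesteps`, `n_steps`, `batch_size`, and `n_epochs` values in context, the following sketch estimates how many rollouts and how many gradient updates a full run involves under each config. The helper name `run_budget` is hypothetical, `num_envs=4096` is assumed for both configs (the PR states it only for the old one), and the rollout count assumes training keeps collecting full rollouts until the timestep budget is reached.

```python
import math


def run_budget(n_timesteps: int, num_envs: int, n_steps: int,
               batch_size: int, n_epochs: int) -> tuple[int, int]:
    """Return (rollouts, total_gradient_updates) for a full training run."""
    rollout_size = num_envs * n_steps
    # Rollouts needed to reach the timestep budget, rounding up.
    rollouts = math.ceil(n_timesteps / rollout_size)
    # Each rollout is followed by n_epochs passes over its minibatches.
    updates_per_rollout = n_epochs * math.ceil(rollout_size / batch_size)
    return rollouts, rollouts * updates_per_rollout


# Old values from the diff (num_envs=4096 as quoted in the PR description).
old_rollouts, old_total = run_budget(10_000_000, 4096, 512, 128, 20)

# New values from the diff (num_envs=4096 is an assumption).
new_rollouts, new_total = run_budget(100_000_000, 4096, 16, 32768, 4)

print(old_rollouts, old_total)  # 5 1638400
print(new_rollouts, new_total)  # 1526 12208
```

Under these assumptions, the new config collects ten times more environment steps overall while performing roughly two orders of magnitude fewer gradient updates, split across many short rollouts instead of a few very long ones.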