Clemens Schwarke authored

# Description

This PR adds a configuration class to distill a walking policy for ANYmal D as an example. Training is run almost the same way as normal PPO training. The only difference is that a policy checkpoint must be passed via the `--load_run` CLI argument to serve as the teacher.

Additionally, `RslRlDistillationRunnerCfg` was moved to the correct file.

## Type of change

- New feature (non-breaking change which adds functionality)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

---------

Co-authored-by: Kelly Guo <kellyg@nvidia.com>

| Name | Last commit | Last update |
|---|---|---|
| .. | | |
| index.rst | | |
| performance_benchmarks.rst | | |
| rl_existing_scripts.rst | | |
| rl_frameworks.rst | | |
| training_guide.rst | | |