Mayank Mittal authored
# Description

When an episode ends, one of three cases applies:

1. **bad** terminations (terminated dones): the agent gets a termination penalty
2. **timeout** terminations (truncated dones):
   * infinite-horizon: the agent bootstraps from the terminal state
   * finite-horizon: no penalty or bootstrapping

Currently, the last case is not handled, which leads to issues when training RL tasks with a finite horizon (for instance, Nikita's agile locomotion work).

This MR adds a flag called `is_finite_horizon` to the RLTaskEnvCfg to deal with this case. The env wrappers read the flag and decide how to handle the finite-horizon problem for their respective frameworks.

## Type of change

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./orbit.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have run all the tests with `./orbit.sh --test` and they pass
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there
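
The three termination cases from the description can be illustrated with a small one-step value target. This is only a sketch under stated assumptions, not the wrappers' actual code: the `RLTaskEnvCfg` class here is a hypothetical stand-in that models nothing but the `is_finite_horizon` flag, and `td_target`, its arguments, and the `gamma` default are made up for illustration. It also assumes any termination penalty is already included in `reward`.

```python
from dataclasses import dataclass


@dataclass
class RLTaskEnvCfg:
    """Hypothetical stand-in for the real config; only the new flag is modeled."""

    is_finite_horizon: bool = False


def td_target(reward, next_value, terminated, time_out, cfg, gamma=0.99):
    """One-step value target for a single transition (illustrative only).

    Assumes `reward` already contains any termination penalty.
    """
    if terminated:
        # Bad termination: the episode truly ended, so no future value is added.
        return reward
    if time_out and cfg.is_finite_horizon:
        # Finite-horizon timeout: no penalty and no bootstrapping.
        return reward
    # Infinite-horizon timeout, or an ordinary step: bootstrap from the next state.
    return reward + gamma * next_value


infinite = RLTaskEnvCfg(is_finite_horizon=False)
finite = RLTaskEnvCfg(is_finite_horizon=True)

# The same timeout transition under both settings.
target_infinite = td_target(1.0, 10.0, terminated=False, time_out=True, cfg=infinite, gamma=0.5)
target_finite = td_target(1.0, 10.0, terminated=False, time_out=True, cfg=finite, gamma=0.5)
```

With the flag unset, the timeout transition keeps the bootstrapped term; with the flag set, the target reduces to the reward alone. How each wrapper maps this onto its own framework's done signals is left to that wrapper.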
| Name | Last commit | Last update |
|---|---|---|
| .. | | |
| extensions | | |
| standalone | | |