Commit 293a6c2b authored by Kelly Guo, committed by David Hoeller

Improves policy behavior of Franka Cabinet and Allegro (#111)

# Description

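Improves policy behavior when training the Franka Cabinet direct environment
and the Repose Cube Allegro direct environment. Changes include widening the
first hidden layer of the policy networks from 512 to 1024 units, raising the
Allegro `success_tolerance` from 0.1 to 0.2, and fixing the `TiledCameraCfg`
import in `TiledCamera`.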

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

parent a655ad95
@@ -24,7 +24,7 @@ from ..sensor_base import SensorBase
 from .camera import Camera
 if TYPE_CHECKING:
-    from .camera_cfg import TiledCameraCfg
+    from .tiled_camera_cfg import TiledCameraCfg
 class TiledCamera(Camera):
......
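A likely reason the old import path went unnoticed is that imports guarded by `TYPE_CHECKING` are never executed at runtime, so only a static type checker would flag them. The sketch below illustrates that behavior in isolation; the module `nonexistent_module_for_demo` and the classes `MissingCfg` and `Sensor` are hypothetical names invented for the example and are not part of Isaac Lab.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers evaluate this branch. At runtime
    # TYPE_CHECKING is False, so a wrong module path here raises nothing.
    from nonexistent_module_for_demo import MissingCfg  # hypothetical module


class Sensor:
    """Hypothetical stand-in for a sensor class that takes a config object."""

    def __init__(self, cfg: "MissingCfg"):
        # The string annotation is not resolved at runtime, so the
        # missing import does not matter when the class is instantiated.
        self.cfg = cfg


# Constructing the object works even though the guarded import is invalid.
sensor = Sensor(cfg={"width": 64})
```

Running a type checker over the module is what would surface such a path mistake, which is why it can survive normal training runs.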
@@ -30,7 +30,7 @@ params:
           val: 0
         fixed_sigma: True
     mlp:
-      units: [512, 512, 256, 128]
+      units: [1024, 512, 256, 128]
       activation: elu
       d2rl: False
......
@@ -21,8 +21,8 @@ class AllegroHandPPORunnerCfg(RslRlOnPolicyRunnerCfg):
     empirical_normalization = True
     policy = RslRlPpoActorCriticCfg(
         init_noise_std=1.0,
-        actor_hidden_dims=[512, 512, 256, 128],
-        critic_hidden_dims=[512, 512, 256, 128],
+        actor_hidden_dims=[1024, 512, 256, 128],
+        critic_hidden_dims=[1024, 512, 256, 128],
         activation="elu",
     )
     algorithm = RslRlPpoAlgorithmCfg(
......
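To get a feel for what widening the first hidden layer costs, the following sketch counts the weights and biases of a plain fully connected network for the old and new `hidden_dims`. It is only an illustration: `OBS_DIM` and `ACT_DIM` are assumed placeholder sizes, not values taken from this commit, and `mlp_param_count` is a helper written for this example.

```python
def mlp_param_count(input_dim, hidden_dims, output_dim):
    """Count weights and biases of a fully connected MLP."""
    dims = [input_dim, *hidden_dims, output_dim]
    # Each layer contributes a (d_in x d_out) weight matrix plus d_out biases.
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims[:-1], dims[1:]))


# Assumed placeholder sizes for illustration only (not from the commit).
OBS_DIM = 100
ACT_DIM = 16

old_params = mlp_param_count(OBS_DIM, [512, 512, 256, 128], ACT_DIM)
new_params = mlp_param_count(OBS_DIM, [1024, 512, 256, 128], ACT_DIM)

print(f"old: {old_params}, new: {new_params}, extra: {new_params - old_params}")
```

Only the first two weight matrices grow, since the change touches the input-to-first and first-to-second layer shapes; the later layers are unchanged.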
@@ -112,7 +112,7 @@ class AllegroHandEnvCfg(DirectRLEnvCfg):
     fall_penalty = 0
     fall_dist = 0.24
     vel_obs_scale = 0.2
-    success_tolerance = 0.1
+    success_tolerance = 0.2
     max_consecutive_success = 0
     av_factor = 0.1
     act_moving_average = 1.0
......
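The effect of relaxing `success_tolerance` can be sketched under the assumption that it is compared against the angular distance (in radians) between the object orientation and the goal orientation. The commit does not show how the environment uses the value, so the helpers below (`rotation_distance`, `reached_goal`, `quat_about_z`) are hypothetical and written only for this illustration, using quaternions in `(w, x, y, z)` order.

```python
import math


def quat_conjugate(q):
    w, x, y, z = q
    return (w, -x, -y, -z)


def quat_mul(a, b):
    """Hamilton product of two (w, x, y, z) quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def rotation_distance(q_obj, q_goal):
    """Angle in radians of the rotation taking q_goal to q_obj."""
    diff = quat_mul(q_obj, quat_conjugate(q_goal))
    vec_norm = math.sqrt(diff[1] ** 2 + diff[2] ** 2 + diff[3] ** 2)
    return 2.0 * math.asin(min(vec_norm, 1.0))


def reached_goal(q_obj, q_goal, success_tolerance):
    # Assumed success test: orientation error within the tolerance.
    return abs(rotation_distance(q_obj, q_goal)) <= success_tolerance


def quat_about_z(angle):
    """Unit quaternion for a rotation of `angle` radians about the z axis."""
    return (math.cos(angle / 2.0), 0.0, 0.0, math.sin(angle / 2.0))


goal = quat_about_z(0.0)
obj = quat_about_z(0.15)  # object is 0.15 rad away from the goal

old_success = reached_goal(obj, goal, success_tolerance=0.1)
new_success = reached_goal(obj, goal, success_tolerance=0.2)
```

Under this assumption, a pose that misses the goal by 0.15 rad fails the old threshold and passes the new one, so the policy receives success signals earlier in training.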