Commit a77910ba authored by Toni-SM, committed by GitHub

Fixes skrl train/play script configurations when using the `--agent` argument and renames the agent configuration variable (#3643)

# Description

This PR addresses the following points:
* Fix the skrl train/play script configuration when using the `--agent` argument

    Example:

    ```bash
    python scripts/reinforcement_learning/skrl/train.py --task Isaac-Cart-Double-Pendulum-Direct-v0 --headless --agent skrl_mappo_cfg_entry_point
    ```

    Error:

    ```
    [INFO]: Parsing configuration from: isaaclab_tasks.direct.cart_double_pendulum.cart_double_pendulum_env:CartDoublePendulumEnvCfg
    [INFO]: Parsing configuration from: /home/toni/Documents/RL/toni_IsaacLab/source/isaaclab_tasks/isaaclab_tasks/direct/cart_double_pendulum/agents/skrl_mappo_cfg.yaml
    [INFO] Logging experiment in directory: /home/toni/Documents/RL/toni_IsaacLab/logs/skrl/cart_double_pendulum_direct
    Error executing job with overrides: []
    Traceback (most recent call last):
      File "/home/toni/Documents/RL/toni_IsaacLab/source/isaaclab_tasks/isaaclab_tasks/utils/hydra.py", line 101, in hydra_main
        func(env_cfg, agent_cfg, *args, **kwargs)
      File "/home/toni/Documents/RL/toni_IsaacLab/scripts/reinforcement_learning/skrl/train.py", line 156, in main
        log_dir = datetime.now().strftime("%Y-%m-%d_%H-%M-%S") + f"_{algorithm}_{args_cli.ml_framework}"
                                                                     ^^^^^^^^^
    NameError: name 'algorithm' is not defined
    ```
 
* Replace `STATES` with `OBSERVATIONS` when defining the model inputs in skrl's
agent configurations, to ensure a smooth, error-free transition when the next
major version of **skrl** is released. In that major version, `OBSERVATIONS`
and `STATES` have different values and usage.
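
For the first point, the patch derives `algorithm` from the entry-point name when `--agent` is passed, so the variable exists before the log directory name is built. A minimal sketch of that derivation, using the same expression as the added line (the helper name `algorithm_from_entry_point` is hypothetical and not part of the scripts):

```python
def algorithm_from_entry_point(agent_cfg_entry_point: str) -> str:
    """Derive the algorithm name from an skrl agent entry-point name.

    Hypothetical helper: it only wraps the expression added by the patch.
    """
    # Keep the part before "_cfg", then drop the "skrl_" prefix if present.
    return agent_cfg_entry_point.split("_cfg")[0].split("skrl_")[-1].lower()


if __name__ == "__main__":
    for name in ("skrl_mappo_cfg_entry_point", "skrl_ippo_cfg_entry_point"):
        print(name, "->", algorithm_from_entry_point(name))
```

An entry-point name without an algorithm token, such as `skrl_cfg_entry_point`, has no `skrl_` prefix to strip after the first split, so this expression returns `skrl` for it.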

## Type of change

- Bug fix (non-breaking change which fixes an issue)
parent a8cec21c
@@ -121,6 +121,7 @@ if args_cli.agent is None:
     agent_cfg_entry_point = "skrl_cfg_entry_point" if algorithm in ["ppo"] else f"skrl_{algorithm}_cfg_entry_point"
 else:
     agent_cfg_entry_point = args_cli.agent
+    algorithm = agent_cfg_entry_point.split("_cfg")[0].split("skrl_")[-1].lower()
 @hydra_task_config(args_cli.task, agent_cfg_entry_point)
......
@@ -119,6 +119,7 @@ if args_cli.agent is None:
     agent_cfg_entry_point = "skrl_cfg_entry_point" if algorithm in ["ppo"] else f"skrl_{algorithm}_cfg_entry_point"
 else:
     agent_cfg_entry_point = args_cli.agent
+    algorithm = agent_cfg_entry_point.split("_cfg")[0].split("skrl_")[-1].lower()
 @hydra_task_config(args_cli.task, agent_cfg_entry_point)
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: features_extractor
-input: permute(STATES, (0, 3, 1, 2)) # PyTorch NHWC -> NCHW. Warning: don't permute for JAX since it expects NHWC
+input: permute(OBSERVATIONS, (0, 3, 1, 2)) # PyTorch NHWC -> NCHW. Warning: don't permute for JAX since it expects NHWC
 layers:
 - conv2d: {out_channels: 32, kernel_size: 8, stride: 4, padding: 0}
 - conv2d: {out_channels: 64, kernel_size: 4, stride: 2, padding: 0}
@@ -36,7 +36,7 @@ models:
 clip_actions: False
 network:
 - name: features_extractor
-input: permute(STATES, (0, 3, 1, 2)) # PyTorch NHWC -> NCHW. Warning: don't permute for JAX since it expects NHWC
+input: permute(OBSERVATIONS, (0, 3, 1, 2)) # PyTorch NHWC -> NCHW. Warning: don't permute for JAX since it expects NHWC
 layers:
 - conv2d: {out_channels: 32, kernel_size: 8, stride: 4, padding: 0}
 - conv2d: {out_channels: 64, kernel_size: 4, stride: 2, padding: 0}
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [400, 200, 100]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [400, 200, 100]
 activations: elu
 output: ONE
......
@@ -20,7 +20,7 @@ models:
 fixed_log_std: True
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ACTIONS
@@ -29,7 +29,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ONE
@@ -38,7 +38,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ONE
......
@@ -20,7 +20,7 @@ models:
 fixed_log_std: True
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ACTIONS
@@ -29,7 +29,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ONE
@@ -38,7 +38,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ONE
......
@@ -20,7 +20,7 @@ models:
 fixed_log_std: True
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ACTIONS
@@ -29,7 +29,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ONE
@@ -38,7 +38,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [64, 64]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [64, 64]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [400, 400, 200, 100]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [400, 200, 100]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [400, 200, 100]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [512, 256, 128]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [256, 128, 64]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [64, 64]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [64, 64]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [64, 64]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [64, 64]
 activations: elu
 output: ONE
......
@@ -19,7 +19,7 @@ models:
 initial_log_std: -0.6931471805599453
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128]
 activations: elu
 output: ACTIONS
@@ -28,7 +28,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [128, 128]
 activations: elu
 output: ONE
......
@@ -15,7 +15,7 @@ models:
 fixed_log_std: True
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ACTIONS
@@ -24,7 +24,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ONE
@@ -33,7 +33,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [1024, 512]
 activations: relu
 output: ONE
......
@@ -14,7 +14,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ACTIONS
@@ -23,7 +23,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ONE
......
@@ -14,7 +14,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ACTIONS
@@ -23,7 +23,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ONE
......
@@ -14,7 +14,7 @@ models:
 initial_log_std: 0.0
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ACTIONS
@@ -23,7 +23,7 @@ models:
 clip_actions: False
 network:
 - name: net
-input: STATES
+input: OBSERVATIONS
 layers: [32, 32]
 activations: elu
 output: ONE
......
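
The `STATES` to `OBSERVATIONS` rename repeated across the YAML files above is mechanical. A minimal sketch of such a rewrite, assuming the token only needs replacing on `input:` lines (the function name `rename_states_input` is hypothetical, and this is not necessarily how the change in this commit was produced):

```python
import re

# Match an `input:` line up to the first standalone STATES token on that line.
_INPUT_STATES = re.compile(r"^(\s*input:.*?)\bSTATES\b", flags=re.MULTILINE)


def rename_states_input(yaml_text: str) -> str:
    """Replace STATES with OBSERVATIONS on `input:` lines, leaving other lines untouched."""
    return _INPUT_STATES.sub(r"\1OBSERVATIONS", yaml_text)


if __name__ == "__main__":
    sample = (
        "network:\n"
        "  - name: net\n"
        "    input: STATES\n"
        "    layers: [32, 32]\n"
        "    activations: elu\n"
        "output: ACTIONS\n"
    )
    print(rename_states_input(sample))
```

Working on the raw text rather than a parsed YAML tree keeps comments and formatting intact, which matters for lines such as the `permute(...)` input that carry an inline comment.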