protomotions.agents.base_agent.config module
Configuration classes for base agent.
This module defines the configuration dataclasses used by the base agent and all derived agents. These configurations specify training parameters, optimization settings, and evaluation parameters.
- Key Classes:
BaseAgentConfig: Main agent configuration
BaseModelConfig: Model architecture configuration
OptimizerConfig: Optimizer parameters
MaxEpisodeLengthManagerConfig: Episode length curriculum
- class protomotions.agents.base_agent.config.MaxEpisodeLengthManagerConfig(start_length=5, end_length=300, transition_epochs=100000)
Bases: ConfigBuilder
Configuration for managing max episode length during training.
- current_max_episode_length(current_epoch)
Returns the current max episode length, linearly interpolated between start_length and end_length as training progresses through transition_epochs.
- Parameters:
current_epoch – Current training epoch
- Returns:
Interpolated max episode length
- Return type:
- __init__(start_length=5, end_length=300, transition_epochs=100000)
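The linear interpolation described for current_max_episode_length can be sketched as follows. This is a minimal illustration, not the library's implementation: EpisodeLengthSchedule is a hypothetical stand-in that only mirrors the documented field names and defaults, and both the clamping once transition_epochs is reached and the truncation to an integer are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EpisodeLengthSchedule:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    start_length: int = 5
    end_length: int = 300
    transition_epochs: int = 100000

    def current_max_episode_length(self, current_epoch: int) -> int:
        # Fraction of the transition completed, clamped to [0, 1] (assumed behavior).
        progress = min(max(current_epoch / self.transition_epochs, 0.0), 1.0)
        # Linear interpolation between the endpoints, truncated to an int (assumed).
        return int(self.start_length + progress * (self.end_length - self.start_length))


schedule = EpisodeLengthSchedule()
lengths = [schedule.current_max_episode_length(e) for e in (0, 50000, 100000, 200000)]
print(lengths)  # [5, 152, 300, 300]
```

With the documented defaults, episodes start at 5 steps, reach about half of the final length midway through the transition, and stay at 300 afterwards under the clamping assumption.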
- class protomotions.agents.base_agent.config.OptimizerConfig(_target_='torch.optim.Adam', lr=0.0001, weight_decay=0.0, eps=1e-08, betas=<factory>)
Bases: ConfigBuilder
Configuration for optimizers.
- __init__(_target_='torch.optim.Adam', lr=0.0001, weight_decay=0.0, eps=1e-08, betas=<factory>)
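The _target_ field holds a dotted import path. One common way such a field is consumed is to import the named object and call it with the remaining fields as keyword arguments. The sketch below shows that pattern only; how ProtoMotions itself resolves _target_ is not stated here. resolve_target, instantiate, and TargetedConfig are hypothetical names, and a standard-library class stands in for torch.optim.Adam so the example runs without PyTorch.

```python
import importlib
from dataclasses import asdict, dataclass


def resolve_target(path: str):
    """Import the object named by a dotted path such as 'package.module.Name'."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


@dataclass
class TargetedConfig:
    """Hypothetical stand-in: a `_target_` path plus keyword arguments."""

    _target_: str
    numerator: int = 3
    denominator: int = 4


def instantiate(cfg):
    # Split the config into the target path and the constructor arguments.
    kwargs = asdict(cfg)
    target = resolve_target(kwargs.pop("_target_"))
    return target(**kwargs)


# Standard-library target so the sketch runs without PyTorch installed.
value = instantiate(TargetedConfig(_target_="fractions.Fraction"))
print(value)  # 3/4
```

Under the same assumption, an optimizer configuration would resolve 'torch.optim.Adam' to the optimizer class; the model parameters would still have to be supplied separately, since they are not part of the configuration.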
- class protomotions.agents.base_agent.config.BaseModelConfig(_target_='protomotions.agents.base_agent.model.BaseModel', in_keys=<factory>, out_keys=<factory>)
Bases: ConfigBuilder
Configuration for PPO Model (Actor-Critic).
- __init__(_target_='protomotions.agents.base_agent.model.BaseModel', in_keys=<factory>, out_keys=<factory>)
- class protomotions.agents.base_agent.config.BaseAgentConfig(batch_size, training_max_steps, _target_='protomotions.agents.base_agent.agent.BaseAgent', model=<factory>, num_steps=32, gradient_clip_val=0.0, fail_on_bad_grads=False, check_grad_mag=True, gamma=0.99, bounds_loss_coef=0.0, task_reward_w=1.0, num_mini_epochs=1, training_early_termination=None, save_epoch_checkpoint_every=1000, save_last_checkpoint_every=10, max_episode_length_manager=None, evaluator=<factory>, normalize_rewards=True, normalized_reward_clamp_value=5.0)
Bases: ConfigBuilder
Main configuration class for PPO Agent.
- model: BaseModelConfig
- max_episode_length_manager: MaxEpisodeLengthManagerConfig | None = None
- evaluator: EvaluatorConfig
- __init__(batch_size, training_max_steps, _target_='protomotions.agents.base_agent.agent.BaseAgent', model=<factory>, num_steps=32, gradient_clip_val=0.0, fail_on_bad_grads=False, check_grad_mag=True, gamma=0.99, bounds_loss_coef=0.0, task_reward_w=1.0, num_mini_epochs=1, training_early_termination=None, save_epoch_checkpoint_every=1000, save_last_checkpoint_every=10, max_episode_length_manager=None, evaluator=<factory>, normalize_rewards=True, normalized_reward_clamp_value=5.0)
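In the signature above, batch_size and training_max_steps have no defaults and must be supplied, while <factory> marks fields whose default is built fresh for each instance (a dataclass default_factory). The sketch below illustrates both points with hypothetical stand-ins, AgentConfigStub and ModelStub, that copy only a subset of the documented fields. The values 1024 and 1_000_000 are arbitrary, and the real classes may carry further behavior from ConfigBuilder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelStub:
    """Hypothetical placeholder for a nested model configuration."""

    in_keys: list = field(default_factory=list)
    out_keys: list = field(default_factory=list)


@dataclass
class AgentConfigStub:
    """Hypothetical stand-in with a subset of the documented fields."""

    batch_size: int                      # required, no default
    training_max_steps: int              # required, no default
    model: ModelStub = field(default_factory=ModelStub)
    num_steps: int = 32
    gamma: float = 0.99
    max_episode_length_manager: Optional[object] = None


a = AgentConfigStub(batch_size=1024, training_max_steps=1_000_000)
b = AgentConfigStub(batch_size=1024, training_max_steps=1_000_000)

# Each instance receives its own nested ModelStub, so edits do not leak across configs.
a.model.in_keys.append("obs")
print(a.model.in_keys, b.model.in_keys)  # ['obs'] []
```

Omitting either required argument raises a TypeError at construction time, which is the standard dataclass behavior for fields without defaults.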