protomotions.agents.ase.config module¶
- class protomotions.agents.ase.config.ASEParametersConfig(latent_dim=64, latent_steps_min=1, latent_steps_max=150, mi_reward_w=0.5, mi_hypersphere_reward_shift=True, mi_enc_weight_decay=0, mi_enc_grad_penalty=0, diversity_bonus=0.01, diversity_tar=1.0, latent_uniformity_weight=0.1, uniformity_kernel_scale=1.0)[source]¶
Bases: ConfigBuilder

Configuration for ASE-specific hyperparameters.
- __init__(latent_dim=64, latent_steps_min=1, latent_steps_max=150, mi_reward_w=0.5, mi_hypersphere_reward_shift=True, mi_enc_weight_decay=0, mi_enc_grad_penalty=0, diversity_bonus=0.01, diversity_tar=1.0, latent_uniformity_weight=0.1, uniformity_kernel_scale=1.0)¶
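A minimal sketch of overriding a few of these hyperparameters, assuming the protomotions package is installed; all field names and defaults are taken from the signature above, and whether these particular values are sensible depends on your task.

```python
# Sketch: customizing ASE-specific hyperparameters.
# Field names follow the ASEParametersConfig signature documented above.
from protomotions.agents.ase.config import ASEParametersConfig

params = ASEParametersConfig(
    latent_dim=32,            # dimensionality of the skill latent (default 64)
    latent_steps_min=1,       # latents are held for between latent_steps_min
    latent_steps_max=150,     # and latent_steps_max environment steps
    mi_reward_w=0.5,          # weight of the mutual-information (encoder) reward
    diversity_bonus=0.01,     # weight of the diversity objective
    diversity_tar=1.0,        # diversity target
)
```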
- class protomotions.agents.ase.config.ASEDiscriminatorEncoderConfig(input_models, _target_='protomotions.agents.ase.model.ASEDiscriminatorEncoder', in_keys=<factory>, out_keys=<factory>, encoder_out_size=None)[source]¶
Bases: DiscriminatorConfig

Configuration for the ASE Discriminator-Encoder network (extends SequentialModuleConfig).
- __init__(input_models, _target_='protomotions.agents.ase.model.ASEDiscriminatorEncoder', in_keys=<factory>, out_keys=<factory>, encoder_out_size=None)¶
- class protomotions.agents.ase.config.ASEAgentConfig(batch_size, training_max_steps, _target_='protomotions.agents.ase.agent.ASE', model=<factory>, num_steps=32, gradient_clip_val=0.0, fail_on_bad_grads=False, check_grad_mag=True, gamma=0.99, bounds_loss_coef=0.0, task_reward_w=1.0, num_mini_epochs=1, training_early_termination=None, save_epoch_checkpoint_every=1000, save_last_checkpoint_every=10, max_episode_length_manager=None, evaluator=<factory>, normalize_rewards=True, normalized_reward_clamp_value=5.0, tau=0.95, e_clip=0.2, clip_critic_loss=True, actor_clip_frac_threshold=0.6, advantage_normalization=<factory>, amp_parameters=<factory>, ase_parameters=<factory>)[source]¶
Bases: AMPAgentConfig

Main configuration class for the ASE Agent.
- ase_parameters: ASEParametersConfig¶
- __init__(batch_size, training_max_steps, _target_='protomotions.agents.ase.agent.ASE', model=<factory>, num_steps=32, gradient_clip_val=0.0, fail_on_bad_grads=False, check_grad_mag=True, gamma=0.99, bounds_loss_coef=0.0, task_reward_w=1.0, num_mini_epochs=1, training_early_termination=None, save_epoch_checkpoint_every=1000, save_last_checkpoint_every=10, max_episode_length_manager=None, evaluator=<factory>, normalize_rewards=True, normalized_reward_clamp_value=5.0, tau=0.95, e_clip=0.2, clip_critic_loss=True, actor_clip_frac_threshold=0.6, advantage_normalization=<factory>, amp_parameters=<factory>, ase_parameters=<factory>)¶
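A hedged sketch of assembling an agent config with custom ASE parameters, again assuming protomotions is installed. Only `batch_size` and `training_max_steps` are required positionally per the signature above; the remaining fields fall back to their defaults or factories, and the example values here are illustrative, not recommendations.

```python
# Sketch: building an ASE agent config that plugs in custom ASE hyperparameters.
# Field names follow the ASEAgentConfig/ASEParametersConfig signatures above.
from protomotions.agents.ase.config import ASEAgentConfig, ASEParametersConfig

agent_config = ASEAgentConfig(
    batch_size=4096,               # required: PPO mini-batch size
    training_max_steps=1_000_000,  # required: total environment steps
    gamma=0.99,                    # discount factor (default shown)
    task_reward_w=1.0,             # weight of the task reward
    ase_parameters=ASEParametersConfig(
        latent_dim=64,
        mi_reward_w=0.5,
    ),
)
```

Fields rendered as `<factory>` in the signature (e.g. `model`, `evaluator`, `amp_parameters`) are dataclass default factories and can be omitted unless you need to override them.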