protomotions.agents.evaluators.base_evaluator module

Base evaluator for agent evaluation and metrics computation.

This module provides the base evaluation infrastructure for computing performance metrics during training and evaluation. Evaluators run periodic assessments of agent performance and compute task-specific metrics.

Key Classes:
  • BaseEvaluator: Base class for all evaluators

  • SmoothnessMetricPlugin: Plugin for computing motion smoothness metrics

Key Features:
  • Periodic evaluation during training

  • Motion quality metrics computation

  • Episode statistics aggregation

  • Smoothness and jerk analysis

  • Distributed evaluation support
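
A minimal subclassing sketch for orientation. It assumes only the hooks documented below (process_eval_results in particular); the TrackingEvaluator name, the "eval/success_rate" key, and the compute_success_rate helper are hypothetical and not part of this module:

>>> class TrackingEvaluator(BaseEvaluator):  # hypothetical subclass
...     def process_eval_results(self, metrics, eval_context):
...         to_log, score = super().process_eval_results(metrics, eval_context)
...         # Add a task-specific metric; the key and helper are illustrative only.
...         to_log["eval/success_rate"] = compute_success_rate(metrics)
...         return to_log, score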

class protomotions.agents.evaluators.base_evaluator.SmoothnessMetricPlugin(evaluator, window_sec=0.4, high_jerk_threshold=6500.0)[source]

Bases: object

Plugin for computing smoothness metrics from motion data.

__init__(evaluator, window_sec=0.4, high_jerk_threshold=6500.0)[source]

Initialize the smoothness metric plugin.

Parameters:
  • evaluator – The parent evaluator instance

  • window_sec (float) – Window size in seconds for smoothness computation

  • high_jerk_threshold (float) – Threshold for classifying high jerk frames
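
For intuition only, and not this plugin's actual implementation: jerk is the third time derivative of position, and a frame can be flagged as "high jerk" when its jerk magnitude exceeds high_jerk_threshold. A rough sketch under those assumptions, with positions as a (T, D) array sampled at 1/dt Hz:

>>> import numpy as np
>>> def fraction_high_jerk(positions, dt, threshold=6500.0):  # illustrative helper, not part of the module
...     jerk = np.diff(positions, n=3, axis=0) / dt**3  # third finite difference approximates jerk
...     jerk_mag = np.linalg.norm(jerk, axis=-1)  # per-frame jerk magnitude
...     return float(np.mean(jerk_mag > threshold))  # fraction of frames above the threshold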

compute(metrics)[source]

Compute smoothness metrics from collected motion data.

Parameters:

metrics (Dict[str, MotionMetrics]) – Dictionary of MotionMetrics

Returns:

Dictionary of smoothness metrics with “eval/” prefix

Return type:

Dict[str, float]
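
A short usage sketch. The evaluator and motion_metrics objects are assumed to exist already; motion_metrics stands in for the Dict[str, MotionMetrics] collected during an evaluation pass:

>>> plugin = SmoothnessMetricPlugin(evaluator, window_sec=0.4, high_jerk_threshold=6500.0)
>>> smoothness = plugin.compute(motion_metrics)
>>> all(key.startswith("eval/") for key in smoothness)  # returned keys carry the "eval/" prefix
True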

class protomotions.agents.evaluators.base_evaluator.BaseEvaluator(agent, fabric, config)[source]

Bases: object

Base class for agent evaluation and metrics computation.

Runs periodic evaluations during training to assess agent performance. Collects episode statistics, computes task-specific metrics, and provides feedback for checkpoint selection (best model saving).

Parameters:
  • agent (Any) – The agent being evaluated.

  • fabric (Fabric) – Lightning Fabric instance for distributed evaluation.

  • config (EvaluatorConfig) – Evaluator configuration specifying eval frequency and length.

Example

>>> evaluator = BaseEvaluator(agent, fabric, config)
>>> metrics, score = evaluator.evaluate()
__init__(agent, fabric, config)[source]

Initialize the evaluator.

Parameters:
  • agent (Any) – The agent to evaluate

  • fabric (Fabric) – Lightning Fabric instance for distributed training

  • config (EvaluatorConfig) – Evaluator configuration specifying eval frequency and length

property device: torch.device

Device for computations (from fabric).

property env: BaseEnv

Environment instance (from agent).

property root_dir

Root directory for saving outputs (from agent).

initialize_eval()[source]

Initialize the metrics dictionary with required keys and prepare the evaluation context.

Returns:

Tuple containing metrics dict and evaluation context dict

Return type:

Tuple[Dict, Dict]

run_evaluation(metrics)[source]

Run the evaluation process and collect metrics.

Parameters:

metrics (Dict) – Dictionary to collect evaluation metrics

process_eval_results(metrics, eval_context)[source]

Process collected metrics and prepare for logging.

Parameters:
  • metrics (Dict) – Dictionary of collected metrics

  • eval_context (Dict) – Dictionary containing evaluation context

Returns:

  • Dict of processed metrics for logging

  • Optional score value for determining best model

Return type:

Tuple[Dict, Optional[float]]

cleanup_after_evaluation()[source]

Clean up after evaluation (reset environment state, etc.).
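
The hooks above are meant to be used together; a sketch of one evaluation pass, assuming evaluate() (see the class Example) composes them roughly in this order:

>>> metrics, eval_context = evaluator.initialize_eval()
>>> evaluator.run_evaluation(metrics)  # fills the metrics dict in place
>>> to_log, score = evaluator.process_eval_results(metrics, eval_context)
>>> evaluator.cleanup_after_evaluation()  # reset environment state before training resumes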

simple_test_policy(collect_metrics=False)[source]

Simple evaluation loop for testing the policy.

Parameters:

collect_metrics (bool) – Whether to collect metrics during evaluation.
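
A minimal call sketch; what, if anything, is returned when collect_metrics=True is not documented here, so only the invocation is shown:

>>> evaluator.simple_test_policy(collect_metrics=False)  # roll out the policy without collecting metrics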