Domain Randomization & Sim2Sim

This workflow covers training with domain randomization for robust policies that transfer across simulators (sim2sim) or to real robots (sim2real).

Why Domain Randomization?

Policies trained in simulation often fail when deployed to different physics engines or real hardware due to the “reality gap”. Domain randomization addresses this by:

  1. Randomizing physics parameters during training (friction, mass, etc.)

  2. Adding noise to actions and observations

  3. Forcing the policy to be robust to parameter variations
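
Conceptually, each environment draws its own physics parameters so the policy never trains against a single fixed dynamics model. A minimal sketch of the idea in plain Python (the function and parameter names here are illustrative, not the ProtoMotions API):

import numpy as np

def sample_physics_params(rng: np.random.Generator) -> dict:
    # Hypothetical sampler: one fresh set of parameters per episode/env.
    return {
        "static_friction": rng.uniform(0.6, 3.0),
        "mass_scale": rng.uniform(0.9, 1.1),
        "action_noise": rng.uniform(0.0, 0.02),
    }

# At each reset, apply sample_physics_params(rng) to the environment
# before rolling out, so no two episodes share identical dynamics.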

Training with Domain Randomization

Use the mlp_domain_rand.py experiment config:

python protomotions/train_agent.py \
    --robot-name g1 \
    --simulator isaacgym \
    --experiment-path examples/experiments/mimic/mlp_domain_rand.py \
    --experiment-name g1_amass_dr \
    --motion-file /path/to/amass_g1.pt \
    --num-envs 8192 \
    --batch-size 8192 \
    --ngpu 4

Domain Randomization Parameters

The mlp_domain_rand.py config enables several randomization types:

Action Noise:

ActionNoiseDomainRandomizationConfig(
    action_noise_range=(-0.02, 0.02),  # additive noise of up to ±0.02 per action
    dof_names=[".*"],  # Apply to all joints
)
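
In effect, a small perturbation is added to each action at every step. Roughly (a sketch of the effect, not the actual implementation):

import numpy as np

rng = np.random.default_rng()
action = np.zeros(29)  # placeholder policy output, one value per joint
noise = rng.uniform(-0.02, 0.02, size=action.shape)
noisy_action = action + noise  # perturbed action passed to the simulator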

Friction Randomization:

FrictionDomainRandomizationConfig(
    num_buckets=64,  # Number of friction groups
    static_friction_range=(0.6, 3.0),
    dynamic_friction_range=(0.6, 3.0),
    restitution_range=(0.0, 1.0),
    body_names=[".*"],  # Apply to all bodies
)
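
The num_buckets parameter reflects a common simulator constraint: only a limited number of distinct physics materials can be created cheaply, so sampled friction values are grouped into a fixed set of buckets rather than being fully continuous. A sketch of the bucketing idea (illustrative, not the ProtoMotions implementation):

import numpy as np

rng = np.random.default_rng()
num_buckets, num_envs = 64, 8192

# Pre-sample one friction value per bucket, then assign each
# environment to a random bucket.
bucket_friction = rng.uniform(0.6, 3.0, size=num_buckets)
env_friction = bucket_friction[rng.integers(0, num_buckets, size=num_envs)]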

Note

Friction Combine Mode: In physics simulators, friction between two surfaces is computed from both materials. The mlp_domain_rand.py config sets the floor friction to near-zero (0.01) with CombineMode.AVERAGE. This means the effective friction is approximately half of the robot body’s friction value.

With robot friction randomized in the range (0.6, 3.0) and floor at 0.01:

  • Effective friction range: ~(0.3, 1.5)

This approach lets you control the full friction range through robot body randomization while keeping the floor constant.
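
The arithmetic behind that range, spelled out:

# CombineMode.AVERAGE: effective friction = (robot + floor) / 2
floor_friction = 0.01
robot_low, robot_high = 0.6, 3.0
effective_low = (robot_low + floor_friction) / 2    # 0.305 ≈ 0.3
effective_high = (robot_high + floor_friction) / 2  # 1.505 ≈ 1.5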

Center of Mass Randomization:

CenterOfMassDomainRandomizationConfig(
    com_range={"x": (-0.05, 0.05), "y": (-0.05, 0.05), "z": (-0.05, 0.05)},
    body_names=["torso_link"],  # Apply to torso
)

Sim2Sim Testing

After training with DR, test on different simulators to verify transfer:

Test on Newton (MuJoCo-based):

python protomotions/inference_agent.py \
    --checkpoint results/g1_amass_dr/last.ckpt \
    --simulator newton

Note

Newton is currently in beta. You may observe physics artifacts as we have not yet spent significant time tuning its solver parameters. Community contributions to improve Newton’s physics fidelity are welcome!

If the policy works across simulators, it has learned robust dynamics rather than overfitting to IsaacGym’s specific physics.

ONNX Export for Deployment

Export trained policy to ONNX for deployment:

python scripts/export_model_to_onnx.py \
    --checkpoint results/g1_amass_dr/last.ckpt \
    --output-path g1_policy.onnx

The ONNX model can be loaded in C++ or other frameworks for robot deployment.
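
For a quick sanity check of the exported model in Python, you can run it with onnxruntime. Input names and shapes depend on the export, so the sketch below queries them from the session instead of hard-coding them:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("g1_policy.onnx")
inp = session.get_inputs()[0]
print(inp.name, inp.shape)  # inspect the expected observation layout

# Dummy observation: replace any dynamic dimension with a batch of 1.
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
obs = np.zeros(shape, dtype=np.float32)
actions = session.run(None, {inp.name: obs})[0]
print(actions.shape)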

Training Tips

Start without DR: Train a baseline without domain randomization first to confirm your motion data and rewards are working. That said, in our experiments enabling DR did not make training noticeably harder.

Observation history: DR configs often use observation history to help the policy infer physics parameters:

max_coords_obs=MaxCoordsSelfObsConfig(
    enabled=True,
    num_historical_steps=3,  # 3 steps of history
)
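
Intuitively, a single observation cannot distinguish, for example, a heavy robot on slippery ground from a light one on sticky ground, but a short window of recent states can. A minimal frame-stacking buffer looks like this (illustrative, not the ProtoMotions implementation):

from collections import deque

import numpy as np

class ObsHistory:
    """Keep the last k observations and expose them as one flat vector."""

    def __init__(self, obs_dim: int, k: int = 3):
        self.buf = deque([np.zeros(obs_dim, dtype=np.float32)] * k, maxlen=k)

    def push(self, obs: np.ndarray) -> np.ndarray:
        self.buf.append(obs.astype(np.float32))
        return np.concatenate(list(self.buf))  # shape: (k * obs_dim,)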

Full Pipeline: Train → DR → Sim2Sim

  1. Baseline training (no DR):

    python protomotions/train_agent.py \
        --experiment-path examples/experiments/mimic/mlp.py \
        --experiment-name g1_baseline \
        ...
    
  2. DR training:

    python protomotions/train_agent.py \
        --experiment-path examples/experiments/mimic/mlp_domain_rand.py \
        --experiment-name g1_dr \
        ...
    
  3. Sim2sim test:

    # Test both policies on Newton
    python protomotions/inference_agent.py \
        --checkpoint results/g1_baseline/last.ckpt \
        --simulator newton
    
    python protomotions/inference_agent.py \
        --checkpoint results/g1_dr/last.ckpt \
        --simulator newton
    
  4. Compare: The DR policy should perform better on Newton than the baseline.

Next Steps