Fix floating-base fail check replay divergence #49
Merged
stepjam merged 1 commit on Jun 13, 2026
Summary
Fix a replay divergence in floating-base environments where the same action sequence could succeed with `env.step(action, fast=True)` but fail with `env.step(action, fast=False)`.
Root cause
`BiGymEnv._fail()` previously checked whether the robot moved too far from the origin by reading the pelvis Cartesian position through `self._robot.pelvis.get_position()`. For the pelvis body, this accesses MuJoCo's derived `xpos` field through `dm_control`.
When the physics state is dirty, reading a derived MuJoCo field can trigger an implicit `physics.forward()`. That extra forward pass does not advance simulation time, but it can recompute derived state, contacts, and constraint data. In contact-heavy tasks such as `StackBlocks`, doing this during every `fast=False` step can perturb the later physical trajectory.
Change
For floating-base robots, `_fail()` now reconstructs the pelvis XYZ position from the floating-base joint `qpos` values instead of reading body `xpos`. For axes that are not actuated, it falls back to the static MJCF body `pos`. Non-floating robots keep the previous `pelvis.get_position()` behavior.
This preserves the original fail check semantics while avoiding a derived-field read that can force an extra forward pass.
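The difference between the two read paths can be illustrated with a toy model. This is a minimal sketch, not BiGym or `dm_control` code: `ToyPhysics`, `position_from_qpos`, and the axis names are hypothetical, and it assumes each actuated axis's `qpos` value can be used directly as the coordinate along that axis, as the description above suggests.

```python
from typing import Dict, List, Sequence

AXES = ("x", "y", "z")


class ToyPhysics:
    """Hypothetical stand-in for a physics wrapper with lazily updated derived state."""

    def __init__(self, static_pos: Sequence[float], actuated_axes: Sequence[str]):
        self.static_pos = list(static_pos)  # constant body `pos` from the model
        self.qpos: Dict[str, float] = {axis: 0.0 for axis in actuated_axes}
        self._xpos = list(static_pos)       # derived field, may be stale
        self._dirty = False
        self.forward_calls = 0

    def set_qpos(self, axis: str, value: float) -> None:
        self.qpos[axis] = value
        self._dirty = True                  # derived fields are now out of date

    def forward(self) -> None:
        # Recompute derived state from primary state (no time advance).
        self._xpos = position_from_qpos(self)
        self._dirty = False
        self.forward_calls += 1

    @property
    def xpos(self) -> List[float]:
        if self._dirty:
            self.forward()                  # implicit side effect of the read
        return list(self._xpos)


def position_from_qpos(physics: ToyPhysics) -> List[float]:
    """Rebuild XYZ from primary state only.

    Assumption: an actuated axis contributes its qpos value directly;
    a non-actuated axis falls back to the static body position.
    """
    return [
        physics.qpos.get(axis, physics.static_pos[i])
        for i, axis in enumerate(AXES)
    ]


physics = ToyPhysics(static_pos=(0.0, 0.0, 1.0), actuated_axes=("x", "y"))
physics.set_qpos("x", 0.5)

from_qpos = position_from_qpos(physics)   # no forward pass triggered
calls_after_qpos_read = physics.forward_calls
from_xpos = physics.xpos                  # triggers one implicit forward pass
calls_after_xpos_read = physics.forward_calls
```

In this toy, both reads return the same position, but only the `xpos` read increments `forward_calls`. In a real simulator the extra forward pass would also touch contact and constraint data, which is the side effect the fail check now avoids.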
Validation
- `python -m py_compile bigym/bigym_env.py`
- `StackBlocks` replay that previously diverged: after this change, the `fast=False` replay reaches success at frame `1344`, matching the successful replay path.
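A replay comparison of this kind can be automated by recording a state vector per frame under each mode and locating the first frame where the two recordings differ. The helper below is a hypothetical sketch that is not part of this PR; the name `first_divergence` and the tolerance value are assumptions, and how the per-frame states are recorded from the environment is left to the caller.

```python
from typing import Optional, Sequence


def first_divergence(
    run_a: Sequence[Sequence[float]],
    run_b: Sequence[Sequence[float]],
    tol: float = 1e-9,
) -> Optional[int]:
    """Return the first frame index where two recorded runs differ, or None.

    Each run is a sequence of per-frame state vectors (for example qpos).
    A length mismatch counts as a divergence at the first missing frame.
    """
    for frame, (state_a, state_b) in enumerate(zip(run_a, run_b)):
        if len(state_a) != len(state_b):
            return frame
        if any(abs(a - b) > tol for a, b in zip(state_a, state_b)):
            return frame
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None
```

A result of `None` for the `fast=True` and `fast=False` recordings would indicate that the two replay paths stayed within tolerance for the whole episode.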