Fix two test failures from transformers v5 support #491

Merged

jlamypoirier merged 1 commit into main from worktree-fixes on Apr 24, 2026

Conversation

@jlamypoirier
Collaborator

Summary

  • modeling_apriel2.py: Guard _init_weights under if _TRANSFORMERS_V4. In transformers v5, from_pretrained calls initialize_weights() after loading the checkpoint, which re-invokes _init_weights on every module via smart_apply. The raw .data.normal_() calls bypass the guard_torch_init_functions patching (which gates on _is_hf_initialized), so all loaded weights were silently overwritten with fresh random values. Under v5 the inherited PreTrainedModel._init_weights default already uses init.* functions correctly and handles nn.Linear, nn.Embedding, and *RMSNorm modules, so the custom override is only needed for v4. A sketch of the guard follows this list.

  • test_lm_head.py: Add num_documents_in_batch=1 to the GRPO test kwargs. LanguageModelGRPOLoss._forward_backward reads this key when computing the new_logprobs metric, but the test never populated it (see the second sketch after this list).
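
A minimal sketch of how the guard could look in modeling_apriel2.py, assuming _TRANSFORMERS_V4 is a module-level boolean derived from the installed transformers version; the class name and initializer details are illustrative, not the actual code:

```python
import transformers
from packaging import version
from torch import nn
from transformers import PreTrainedModel

# Assumption: a module-level flag separating transformers v4 from v5; the
# real definition in modeling_apriel2.py may differ.
_TRANSFORMERS_V4 = version.parse(transformers.__version__).major < 5


class Apriel2PreTrainedModel(PreTrainedModel):  # class name is illustrative
    if _TRANSFORMERS_V4:
        # v4 only: these raw .data.normal_() calls bypass the
        # guard_torch_init_functions patching (which gates on
        # _is_hf_initialized), so they must not run under v5, where
        # from_pretrained -> initialize_weights() re-invokes _init_weights
        # on every module after the checkpoint is loaded.
        def _init_weights(self, module):
            std = self.config.initializer_range
            if isinstance(module, nn.Linear):
                module.weight.data.normal_(mean=0.0, std=std)
                if module.bias is not None:
                    module.bias.data.zero_()
            elif isinstance(module, nn.Embedding):
                module.weight.data.normal_(mean=0.0, std=std)

    # Under v5 the method is simply not defined here, so the inherited
    # PreTrainedModel._init_weights (which uses torch.nn.init.* and handles
    # nn.Linear, nn.Embedding, and *RMSNorm) applies unchanged.
```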

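On the test side, an illustrative sketch of the kwargs change in test_lm_head.py; the surrounding test setup is an assumption, and only num_documents_in_batch=1 comes from the actual fix:

```python
# Hypothetical shape of the GRPO test kwargs in test_lm_head.py.
grpo_kwargs = dict(
    # ... existing GRPO loss configuration for the test ...
    # LanguageModelGRPOLoss._forward_backward reads this key when computing
    # the new_logprobs metric, so the test has to populate it explicitly.
    num_documents_in_batch=1,
)
```
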
Test plan

  • pytest -v tests/layers/test_lm_head.py -k grpo — all 4 GRPO variants pass
  • pytest -v tests/models/test_hf_roundtrip.py -k apriel2 — both apriel2 roundtrip cases pass
  • pytest -v -n 8 tests/ — 2007 passed, 0 failed

🤖 Generated with Claude Code

- modeling_apriel2.py: Guard _init_weights under _TRANSFORMERS_V4. In v5,
  from_pretrained calls initialize_weights() after loading, which re-invokes
  _init_weights via smart_apply. The raw .data.normal_() calls bypass the
  guard_torch_init_functions patching that respects _is_hf_initialized,
  clobbering all loaded weights. In v5 the inherited PreTrainedModel default
  (which uses init.* functions) handles all cases correctly.

- test_lm_head.py: Add num_documents_in_batch=1 to GRPO test kwargs, which
  LanguageModelGRPOLoss._forward_backward requires when computing the
  new_logprobs metric.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
jlamypoirier merged commit 4c38894 into main on Apr 24, 2026
1 of 2 checks passed
jlamypoirier deleted the worktree-fixes branch on April 24, 2026 at 19:35