Add per-job hf_token/hf_org override for fine-tuning jobs by timf34 · Pull Request #61 · longtermrisk/openweights

timf34 · 2026-04-23T12:49:44Z

TL;DR: lets you override the HF token per job with your own personal token for uploads.

Claude's description:
Fine-tuning jobs previously read HF_TOKEN from the worker pod's env and used the org-level hf_org for the {org_id} slot in finetuned_model_id. In shared OpenWeights orgs this meant one user could not route their uploads to their own HF namespace without changing org-wide secrets (affecting other users' in-flight jobs).

Add optional hf_token and hf_org fields to TrainingConfig. When set:

The worker's push_model uses cfg.hf_token for all four upload calls (push_to_hub_merged, push_to_hub, tokenizer.push_to_hub, HfApi).
The client's FineTuning.create uses hf_org for the {org_id} template slot.

The base model download still uses the pod-env HF_TOKEN, so gated-model access (Llama, etc.) keeps working. Defaults are unchanged — jobs without the override behave exactly as before, so existing users' flows are unaffected.
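The override rules above can be sketched as follows. This is a minimal illustration, not the code from the PR: `TrainingConfigSketch`, `resolve_upload_token`, `resolve_download_token`, and `fill_org_slot` are hypothetical names, and a plain dataclass stands in for the real `TrainingConfig`. Only the `hf_token` and `hf_org` fields, the `HF_TOKEN` env variable, and the `{org_id}` slot come from the description.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class TrainingConfigSketch:
    """Hypothetical stand-in for TrainingConfig; only the relevant fields."""
    finetuned_model_id: str
    hf_token: Optional[str] = None  # optional per-job upload token
    hf_org: Optional[str] = None    # optional per-job HF namespace


def resolve_upload_token(cfg: TrainingConfigSketch,
                         env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Uploads prefer the per-job token and fall back to the pod-env HF_TOKEN."""
    env = os.environ if env is None else env
    return cfg.hf_token or env.get("HF_TOKEN")


def resolve_download_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """The base model download ignores the override, so gated models keep working."""
    env = os.environ if env is None else env
    return env.get("HF_TOKEN")


def fill_org_slot(template: str, org_level_hf_org: str,
                  job_hf_org: Optional[str] = None) -> str:
    """Fill the {org_id} slot with the per-job org if set, else the org-level one."""
    return template.replace("{org_id}", job_hf_org or org_level_hf_org)


env = {"HF_TOKEN": "hf_org_wide"}
override = TrainingConfigSketch("{org_id}/my-model", hf_token="hf_personal", hf_org="alice")
default = TrainingConfigSketch("{org_id}/my-model")

print(resolve_upload_token(override, env))  # per-job token
print(resolve_upload_token(default, env))   # pod-env token, as before
print(fill_org_slot(override.finetuned_model_id, "shared-org", override.hf_org))
```

A job that sets neither field resolves to the same token and namespace as before, which matches the description's claim that defaults are unchanged.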

compute_id now shallow-copies validated_params before filtering. The existing filter mutates its input to exclude default-valued fields from the content hash; without the copy, popping hf_token/hf_org for the hash also stripped them from the stored job params (so the worker would never see the override).
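The mutation bug described above can be reproduced with a small sketch. The function bodies here are assumptions for illustration: `drop_hash_excluded`, `compute_id_buggy`, and `compute_id_fixed` are hypothetical names, and the real filter and hash in `compute_id` may differ. The point is only that a filter which pops keys in place also changes the caller's dict unless it is given a copy.

```python
import hashlib
import json

# Assumption: these are the keys the filter removes before hashing.
HASH_EXCLUDED = ("hf_token", "hf_org")


def drop_hash_excluded(params: dict) -> dict:
    """Mutating filter: pops the excluded keys from the dict it is given."""
    for key in HASH_EXCLUDED:
        params.pop(key, None)
    return params


def _digest(params: dict) -> str:
    # Sorted keys make the hash independent of dict insertion order.
    payload = json.dumps(params, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:12]


def compute_id_buggy(validated_params: dict) -> str:
    # Filters the caller's dict directly, so the overrides are lost.
    return _digest(drop_hash_excluded(validated_params))


def compute_id_fixed(validated_params: dict) -> str:
    # A shallow copy is enough: the filter only pops top-level keys.
    return _digest(drop_hash_excluded(dict(validated_params)))


stored = {"model": "base-model", "hf_token": "hf_personal", "hf_org": "alice"}
job_id = compute_id_fixed(stored)
assert "hf_token" in stored and "hf_org" in stored  # the worker still sees the override

lost = dict(stored)
assert compute_id_buggy(lost) == job_id  # same hash either way
assert "hf_token" not in lost            # but the stored params were stripped
```

Both variants produce the same ID, so the copy changes only what is stored, not how jobs are deduplicated.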

timf34 · 2026-04-23T12:50:40Z

Probably not something to merge into main, but it works as a quick workaround. It could still be good to include similar functionality as a flag.

End-to-end pattern for N-job sweeps via the SDK: dataset x hyperparam matrices, idempotent submission via content-hashed job IDs, persisted manifests, dry-run validation, and downstream inference + download. Linked from cookbook/README.md alongside the custom-job entry. uv.lock refreshed to current resolver state. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

timf34 wants to merge 2 commits into main from tim/hf-override-per-job
