feat: add HuggingFace Hub integration #452
Open
Abhijeet Prasad (AbhiPrasad) wants to merge 1 commit into
Adds native Braintrust tracing for the Hugging Face Hub Python SDK
(`huggingface_hub`) via the integrations API. The integration supports
`huggingface-hub>=0.32.0` and is included in `auto_instrument()` by default.
How to enable it:
```python
import braintrust
from braintrust import init_logger
from huggingface_hub import InferenceClient

logger = init_logger(project="my-project")
braintrust.auto_instrument()  # enables huggingface_hub unless disabled

client = InferenceClient(provider="auto")

# The traced call nests under this parent span.
with logger.start_span(name="hf request"):
    response = client.chat_completion(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": "Say hello"}],
        max_tokens=32,
    )
```
The integration can also be enabled explicitly or disabled from the global
auto-instrumentation call:
```python
from braintrust.auto import auto_instrument
auto_instrument(huggingface_hub=True)   # opt in explicitly
auto_instrument(huggingface_hub=False)  # opt out
```
For manual wrapping, wrap individual sync or async clients:
```python
from braintrust.integrations.huggingface_hub import wrap_huggingface_hub
from huggingface_hub import AsyncInferenceClient, InferenceClient
client = wrap_huggingface_hub(InferenceClient(provider="auto"))
async_client = wrap_huggingface_hub(AsyncInferenceClient(provider="auto"))
```
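As a rough illustration of what wrapping a single client involves, the sketch below patches one method on a stand-in client so that each call records its input, output, and any error before re-raising. `FakeClient`, `wrap_client`, and the `SPANS` list are hypothetical names used only for this example; this is not the `wrap_huggingface_hub` implementation and it does not use the Braintrust span API.

```python
import functools

SPANS = []  # hypothetical in-memory stand-in for a span sink


class FakeClient:
    """Stand-in for an inference client; not the real huggingface_hub class."""

    def chat_completion(self, messages, model=None):
        if not messages:
            raise ValueError("messages must not be empty")
        return {"choices": [{"message": {"role": "assistant", "content": "hello"}}]}

    def create(self, **kwargs):
        # Alias that proxies through chat_completion, so it is traced as well.
        return self.chat_completion(**kwargs)


def wrap_client(client, method_name="chat_completion"):
    """Patch one method on this instance so each call records a span dict."""
    original = getattr(client, method_name)

    @functools.wraps(original)
    def traced(*args, **kwargs):
        span = {
            "name": method_name,
            "input": {"args": args, "kwargs": kwargs},
            "output": None,
            "error": None,
        }
        SPANS.append(span)
        try:
            span["output"] = original(*args, **kwargs)
            return span["output"]
        except Exception as exc:
            # Record the failure on the span, then let the caller see it.
            span["error"] = repr(exc)
            raise

    setattr(client, method_name, traced)
    return client
```

Because the alias in this sketch calls `chat_completion` on the same instance, patching that one method is enough to trace both entry points.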
Features added:
- Traces sync and async `InferenceClient` calls for `chat_completion`,
  `text_generation`, `feature_extraction`, and `sentence_similarity`.
- Covers the OpenAI-compatible chat alias
  `client.chat.completions.create(...)` because it proxies through
  `chat_completion`.
- Supports non-streaming and streaming chat completions, including async
  streams, context manager finalization, early stream close handling, and
  nesting under parent Braintrust spans.
- Captures span inputs, outputs, allowlisted request metadata, provider/model
  routing metadata, response identifiers, finish reasons, and token metrics
  from Hugging Face response usage/details fields.
- Logs provider errors to the span before re-raising.
- Adds VCR-backed coverage for latest and 0.32.0 Hugging Face Hub SDKs, plus
  an auto-instrumentation smoke test.
- Adds the `test_huggingface_hub` nox session, dependency matrix entries, and
  cassette-directory mapping for provider-versioned cassette hygiene.
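The streaming behavior listed above (finalizing on exhaustion, on context manager exit, and on early close) can be sketched roughly as follows. `TracedStream` and the plain `span` dict are hypothetical stand-ins chosen for this example, and chunks are assumed to be plain strings; the actual integration's stream wrapper and span objects may look quite different.

```python
class TracedStream:
    """Wrap an iterator of text chunks and finalize a span record exactly once."""

    def __init__(self, chunks, span):
        self._chunks = iter(chunks)
        self._span = span  # hypothetical span: a plain dict
        self._parts = []
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        try:
            chunk = next(self._chunks)
        except StopIteration:
            self._finalize()  # normal end of stream
            raise
        except Exception as exc:
            self._finalize(error=exc)  # provider error mid-stream
            raise
        self._parts.append(chunk)
        return chunk

    def close(self):
        # Early close: release the inner stream if it supports it, then finalize.
        inner_close = getattr(self._chunks, "close", None)
        if inner_close is not None:
            inner_close()
        self._finalize()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._finalize(error=exc)
        self.close()
        return False  # never swallow the caller's exception

    def _finalize(self, error=None):
        if self._done:
            return  # already finalized by another path
        self._done = True
        self._span["output"] = "".join(self._parts)
        self._span["error"] = repr(error) if error is not None else None
```

The `_done` guard is the key design choice in this sketch: every exit path funnels through `_finalize`, so a stream that is both exhausted and later closed still logs its output only once.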
resolves #278