
feat: add embedding tracer support for Bedrock, LiteLLM, OpenAI #631

Open

viniciusdsmello wants to merge 17 commits into main from vini/open-10480-embedding-tracer

Conversation

@viniciusdsmello
Contributor

Summary

Adds native embedding tracing support across the Python SDK so that embedding API calls (Titan via Bedrock, litellm.embedding, OpenAI.embeddings.create) generate proper traces in Openlayer with correct model, tokens, dimensions, and output.

  • Data model: new StepType.EMBEDDING + add_embedding_step_to_trace helper (src/openlayer/lib/tracing/).
  • Bedrock: detects "embed" in modelId and routes to a dedicated handler with parsers for Titan v1/v2 and Cohere v3 (single + batch). Existing chat path is untouched and locked in by a backfilled regression test.
  • LiteLLM: patches litellm.embedding alongside the existing litellm.completion patch. Reuses detect_provider_from_response, extract_usage_from_response, and extract_litellm_metadata.
  • OpenAI: patches client.embeddings.create for both sync (trace_openai) and async (trace_async_openai) clients via a small shared helper module (_openai_embedding_common.py).

Linear

OPEN-10480

Verification

  • 34 new unit tests covering all four integration paths (single input, batch, failure isolation, body replay, regression).
  • Full local test suite green (448 tests).
  • ruff check clean on all touched files.
  • pyright clean on all touched source files.

Test plan

  • CI passes on this branch.
  • Manual smoke test with a real Titan embedding call (amazon.titan-embed-text-v2:0) — confirm trace appears with model name and prompt tokens populated.
  • Manual smoke test with litellm.embedding(model="text-embedding-3-small", input="x").
  • Confirm with the ingestion / UI team that step_type=embedding is rendered correctly (out of scope for this PR but required for end-to-end value).

Out of scope

  • Mistral, Gemini, OCI, Portkey embedding tracers — follow-ups using the same pattern.
  • Backend / UI changes to render the new step type.

🤖 Generated with Claude Code

viniciusdsmello and others added 17 commits April 28, 2026 12:47
Used by superpowers workflows to host isolated git worktrees during
implementation; never meant to be tracked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…EN-10480)

…480)

…N-10480)

…OPEN-10480)

…gression (OPEN-10480)

Adds the same file-level pragma already used by test_portkey_integration.py
to suppress reportUnknown* and reportMissingParameterType — these come from
openlayer.lib.integrations being in pyright's ignore list, which causes
imports from there to be typed as Unknown.

Per-line pyright ignores added on direct imports of botocore.response and
openai, which are not present in the lint job's environment.
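For reference, the two kinds of suppressions described above might look like this. The rule names (`reportUnknown*`, `reportMissingParameterType`, `reportMissingImports`) follow pyright's standard diagnostics; the exact rule list and import lines are illustrative, not copied from the PR:

```python
# File-level pragma at the top of the test file, mirroring
# test_portkey_integration.py:
# pyright: reportUnknownMemberType=false, reportUnknownVariableType=false, reportUnknownArgumentType=false, reportMissingParameterType=false

# Per-line ignores on imports that are absent from the lint job's environment:
from botocore.response import StreamingBody  # pyright: ignore[reportMissingImports]
import openai  # pyright: ignore[reportMissingImports]
```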

Code review findings addressed:
- Move per-call imports of _openai_embedding_common to module-level (was in
  hot path of every embedding call).
- Extract build_embedding_step_kwargs into _openai_embedding_common so that
  sync and async OpenAI handlers each become ~10 lines instead of ~50, and
  LiteLLM reuses the same kwargs assembly.
- Drop LiteLLM's local _parse_embedding_response and
  _get_embedding_model_parameters; both now delegate to the shared helpers
  (LiteLLM-specific timeout/api_base/api_version/cost/metadata are layered
  on top of the common kwargs).
- Type Bedrock _parse_embedding_output return as
  Tuple[Union[List[float], List[List[float]]], int, int] instead of bare
  tuple.

Net: -34 lines across the 5 touched source files. Tests unchanged, all
77 embedding tests + 448 lib tests still green.
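The kwargs-assembly extraction described in this commit can be sketched as below. Only the helper's name (`build_embedding_step_kwargs`) and the layering pattern (LiteLLM extras on top of common kwargs) come from the commit message; the signature and field names are assumptions for illustration:

```python
from typing import Any, Dict, List, Optional


def build_embedding_step_kwargs(
    model: str,
    inputs: List[str],
    embeddings: List[List[float]],
    prompt_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Illustrative shared kwargs assembly for embedding trace steps.

    The real helper lives in _openai_embedding_common.py and is reused by
    the sync/async OpenAI handlers and the LiteLLM patch.
    """
    return {
        "name": "embedding",
        "model": model,
        "inputs": inputs,
        "output": embeddings,
        "prompt_tokens": prompt_tokens,
        "dimensions": len(embeddings[0]) if embeddings else 0,
    }


# A provider-specific handler layers its own fields on top of the common
# kwargs rather than rebuilding them:
litellm_kwargs = {
    **build_embedding_step_kwargs(
        "text-embedding-3-small", ["x"], [[0.1, 0.2]], prompt_tokens=1
    ),
    "api_base": None,  # LiteLLM-specific extras (timeout, api_version, cost, ...)
}
```

This is the shape that lets each OpenAI handler shrink to roughly ten lines: the handler only gathers provider-specific values and delegates the rest.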
