[bot] Flue agent framework (@flue/runtime) not instrumented — no tracing for agent sessions, prompts, skills, tasks, or tools #2023

@AbhiPrasad

Description

Relevant Slack thread: https://braintrustdata.slack.com/archives/C083GCUTVDZ/p1779211586735559

Summary

Flue (@flue/runtime on npm, v0.7.0, ~4.5K weekly downloads; the related packages @flue/cli and @flue/sdk have ~14K and ~12K weekly downloads respectively) is an agent harness framework for building autonomous TypeScript agents. It exposes high-level agent execution APIs for sessions, prompts, skills, tasks, shell/tool calls, compaction, and run/event streaming. This repository has no instrumentation for any Flue surface: no wrapper, no channels, no plugin, and no auto-instrumentation config. Users building agents with Flue get no Braintrust spans around the framework-level agent operations.

What instrumentation is missing

The @flue/runtime package exposes these execution surfaces, none of which are instrumented:

| SDK Method / Surface | Description |
| --- | --- |
| init({...}) from a FlueContext agent handler | Creates a harness with model, sandbox, tools, roles, cwd/name, and runtime configuration |
| harness.session(options?) | Starts or resumes a named agent session |
| session.prompt(text, options?) | Primary agent turn / LLM-backed prompt execution, optionally with tools, role, model override, images, and structured result schema |
| session.skill(name, options?) | Runs a Markdown skill with args, tools, role/model overrides, images, and optional structured result |
| session.task(text, options?) | Launches a detached/subagent task session |
| session.shell(command, options?) | Executes shell commands through the configured sandbox and records them in the transcript |
| session.compact() | Runs Flue's context compaction/summarization flow |
| observe(...) / Flue event stream | Emits run_start, operation_start, text_delta, thinking_*, tool_start, tool_call, turn, operation, compaction, run_end, and log events |

These APIs represent an agent-orchestration layer rather than a provider-specific LLM client. They sit above direct model SDKs (OpenAI, Anthropic, Vercel AI SDK, etc.) and include framework concepts Braintrust should capture as spans: runs, sessions, prompt/skill/task operations, tool calls, shell execution, compaction, selected model, token/cost usage, errors, and final results.
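As a rough illustration of what a wrapper-style integration could look like, here is a minimal sketch that intercepts only session.prompt(). Everything in it is hypothetical: SessionLike is a stand-in rather than the real Flue session type, wrapSessionLike and the span name flue.session.prompt are invented for this example, and the record callback stands in for whatever span API the SDK would actually use.

```typescript
// Hypothetical minimal shape; the real Flue session exposes more methods and options.
interface SessionLike {
  prompt(text: string, options?: unknown): Promise<unknown>;
}

// Illustrative record of one operation; not a Braintrust span type.
interface OperationRecord {
  name: string;
  input: unknown;
  output?: unknown;
  error?: string;
  durationMs: number;
}

type Recorder = (record: OperationRecord) => void;

// Returns a proxy that reports each prompt() call to `record` and otherwise
// forwards to the underlying session unchanged.
function wrapSessionLike<T extends SessionLike>(session: T, record: Recorder): T {
  return new Proxy(session, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (prop !== "prompt" || typeof value !== "function") return value;
      return async (text: string, options?: unknown): Promise<unknown> => {
        const started = Date.now();
        try {
          const output = await value.call(target, text, options);
          record({ name: "flue.session.prompt", input: text, output, durationMs: Date.now() - started });
          return output;
        } catch (err) {
          const message = err instanceof Error ? err.message : String(err);
          record({ name: "flue.session.prompt", input: text, error: message, durationMs: Date.now() - started });
          throw err; // Preserve the original failure for the caller.
        }
      };
    },
  });
}
```

The same pattern would extend to skill(), task(), shell(), and compact() by matching more property names; a Proxy is used here only so that untouched methods pass through without being re-declared.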

No coverage in any instrumentation layer:

  • No wrapper function (e.g. wrapFlue() or wrapFlueRuntime())
  • No diagnostics channels for Flue operations or events
  • No plugin handler in js/src/instrumentation/plugins/
  • No auto-instrumentation config in js/src/auto-instrumentations/configs/ targeting @flue/runtime or @flue/sdk
  • No vendored Flue runtime types in js/src/vendor-sdk-types/
  • No e2e test scenarios

A search for flue across js/src/, js/tests/, e2e/scenarios/, and docs/ returns zero matches.

Indirect coverage exists but is limited:

Flue's underlying model calls may be covered when the app uses an already-instrumented provider path. However, that does not capture Flue's framework-level contract: agent run/session boundaries, prompt() vs skill() vs task() operations, sandbox shell calls, tool call lifecycle, compaction summaries, Flue usage aggregation, or streaming event deltas. Users therefore lack a coherent trace of the agent harness even if some lower-level LLM spans happen to exist.
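To make the gap concrete, the sketch below shows how a lower-level LLM span ends up parentless unless a framework-level span is active around it. It uses Node's AsyncLocalStorage for context propagation; the withSpan helper, the recorded array, and the span names are all hypothetical and do not reflect how this repository's SDK tracks span context.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface SpanRecord {
  name: string;
  parent: string | null;
}

// Holds the name of the span that is currently active, if any.
const currentSpan = new AsyncLocalStorage<string>();
const recorded: SpanRecord[] = [];

// Records a span whose parent is whatever span is active at call time,
// then runs `fn` with the new span as the active one.
function withSpan<T>(name: string, fn: () => T): T {
  recorded.push({ name, parent: currentSpan.getStore() ?? null });
  return currentSpan.run(name, fn);
}

// Stand-in for a model call made through an already-instrumented provider.
function fakeProviderCall(): string {
  return withSpan("llm.completion", () => "ok");
}

// Today: the provider span has no framework-level parent.
fakeProviderCall();

// With a hypothetical Flue span around the operation, the same call nests under it.
withSpan("flue.prompt", () => fakeProviderCall());
```

After both calls, recorded holds one llm.completion span with a null parent and one whose parent is flue.prompt, which is the difference between isolated LLM spans and a coherent agent trace.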

Context

Flue describes itself as "The Agent Harness Framework": a runtime-agnostic TypeScript framework for building headless agents that can run on Node.js, Cloudflare, GitHub Actions, GitLab CI/CD, and other environments. Typical usage is:

import type { FlueContext } from '@flue/runtime';

export default async function ({ init, payload }: FlueContext) {
  const harness = await init({ model: 'anthropic/claude-sonnet-4-6' });
  const session = await harness.session();

  return await session.prompt(`Translate this: ${payload.text}`);
}

The runtime also exposes an observe() API and typed FlueEvent stream, which may be a stable integration point for a plugin because it already reports operation boundaries, tool calls, turns, usage, errors, and run lifecycle events.
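If the event stream is used as the integration point, a plugin would mostly be a fold from events to spans. The following is a minimal sketch under stated assumptions: only the event type names come from the list in this issue; the payload fields (runId, opId, kind, ts) are invented for illustration, and it assumes an operation event marks the completion of the operation opened by operation_start. The real FlueEvent type should be checked before relying on any of this.

```typescript
// Hypothetical event shapes; only the `type` strings are taken from the issue text.
type SketchEvent =
  | { type: "run_start"; runId: string; ts: number }
  | { type: "operation_start"; runId: string; opId: string; kind: string; ts: number }
  | { type: "operation"; runId: string; opId: string; ts: number }
  | { type: "run_end"; runId: string; ts: number };

// Illustrative span record; not a Braintrust type.
interface SketchSpan {
  id: string;
  name: string;
  parentId: string | null;
  start: number;
  end: number | null; // null while the span is still open
}

// Folds an ordered event list into spans: one root span per run,
// one child span per operation.
function eventsToSpans(events: SketchEvent[]): SketchSpan[] {
  const spans = new Map<string, SketchSpan>();
  for (const e of events) {
    switch (e.type) {
      case "run_start":
        spans.set(e.runId, { id: e.runId, name: "flue.run", parentId: null, start: e.ts, end: null });
        break;
      case "operation_start":
        spans.set(e.opId, { id: e.opId, name: `flue.${e.kind}`, parentId: e.runId, start: e.ts, end: null });
        break;
      case "operation": {
        const span = spans.get(e.opId);
        if (span) span.end = e.ts; // Assumed to close the matching operation.
        break;
      }
      case "run_end": {
        const span = spans.get(e.runId);
        if (span) span.end = e.ts;
        break;
      }
    }
  }
  return Array.from(spans.values());
}
```

Tool calls, turns, and compaction events would slot in as further cases parented to the enclosing operation; they are omitted here to keep the sketch short.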

Braintrust docs status

not_found — Braintrust does not have a dedicated Flue instrumentation page. Flue is not listed on https://www.braintrust.dev/docs/guides/tracing or the integrations index.

Upstream references

Local files inspected

  • js/src/auto-instrumentations/configs/ — no Flue config entry
  • js/src/instrumentation/plugins/ — no Flue channels or plugin
  • js/src/vendor-sdk-types/ — no Flue vendored types
  • e2e/scenarios/ — no Flue test scenarios
  • Full repo grep for flue in js/src/, js/tests/, e2e/scenarios/, and docs/ — zero matches
  • Cloned withastro/flue and inspected README.md, packages/runtime/package.json, packages/runtime/src/types.ts, packages/runtime/src/session.ts, and packages/runtime/src/runtime/events.ts
