breadcrumb

breadcrumb is self-hostable, open-source LLM tracing for AI agents in TypeScript: capture prompts, completions, tokens, timing, and cost per request.

What is breadcrumb?

breadcrumb is an open-source system for tracing and inspecting LLM activity in your AI agents. It focuses on capturing the full prompt and completion for each request, along with timing and token/cost details, so you can explore what your model calls are doing.

The project is described as TypeScript-native and self-hostable. It’s built to help developers understand each trace (not just store telemetry) and to provide an end-to-end view of prompts, responses, token usage, and cost per traced call.

Key Features

  • Self-hostable tracing for AI agent calls: deploy it on the platforms mentioned on the site (e.g., Railway, Fly, or your own servers) so tracing runs within your infrastructure.
  • TypeScript-native SDK: designed to fit TypeScript workflows and instrumentation patterns.
  • Trace prompts and completions: each traced request includes the actual prompt that was sent and the full response that came back.
  • Latency and cost visibility per trace: shows how long a call took and provides a per-trace breakdown of token usage and cost.
  • Low-friction setup: the site highlights “three lines of code,” with no config files or decorators and no long setup guide.
  • Automatic tracing with the Vercel AI SDK: the page states it works out of the box with generateText and streamText calls by adding the telemetry helper.
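For context, the Vercel AI SDK's experimental_telemetry option accepts a settings object along these lines. breadcrumb's telemetry("…") helper presumably produces such an object; which fields it actually populates is an assumption, not documented on the page.

```typescript
// Abridged shape of the Vercel AI SDK's experimental_telemetry setting.
// Which of these fields breadcrumb's telemetry(name) helper sets is an
// assumption for illustration, not confirmed by the page.
const telemetrySettings = {
  isEnabled: true, // turn tracing on for this call
  functionId: "summarize", // label identifying the traced call
  metadata: { userId: "u_123" }, // optional custom key/value metadata
};
```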

How to Use breadcrumb

  1. Install the SDK packages (@breadcrumb-sdk/core and @breadcrumb-sdk/ai-sdk) in your TypeScript project.
  2. Create a breadcrumb client once by calling init({ apiKey, baseUrl }).
  3. Initialize the AI SDK telemetry helper with initAiSdk(bc).
  4. Pass the helper's telemetry(...) result into your LLM calls via the experimental_telemetry option shown in the example.

After you run your first generateText (or streamText) call, the trace should appear in the breadcrumb app (the site references a demo trace experience).

Example from the site (abridged for the core flow):

import { init } from "@breadcrumb-sdk/core";
import { initAiSdk } from "@breadcrumb-sdk/ai-sdk";
import { generateText } from "ai";

const bc = init({ apiKey, baseUrl }); // your breadcrumb API key and instance URL
const { telemetry } = initAiSdk(bc);

const { text } = await generateText({
  // ... model, prompt, and other AI SDK options
  experimental_telemetry: telemetry("summarize"),
});

Use Cases

  • Debugging unexpected model behavior in an agent: review the exact prompt sent and the returned completion for each traced request to understand where output changes come from.
  • Performance and latency monitoring: use the per-call timing information (how long each request took) to identify slower requests in a chain of operations.
  • Cost control and budget tracking: check token usage and cost breakdowns per trace to find which calls are consuming the most tokens before they impact invoices.
  • Observability for streaming vs non-streaming calls: instrument both generateText and streamText so you can trace the full lifecycle of requests made by an agent.
  • Team-based experimentation with self-hosting: run tracing on Railway, Fly, or your own servers and extend the open-source codebase as needed for your workflow.
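As an illustration of the cost-control use case, here is a minimal, self-contained sketch of ranking traced calls by cost. The trace-record shape and values are hypothetical assumptions for illustration, not breadcrumb's actual schema or API.

```typescript
// Hypothetical trace record; breadcrumb's real schema may differ.
interface TraceRecord {
  name: string; // e.g. the label passed to telemetry("summarize")
  promptTokens: number;
  completionTokens: number;
  costUsd: number;
  latencyMs: number;
}

// Sort traces by cost, descending, to surface the most expensive calls.
function mostExpensive(traces: TraceRecord[], topN: number): TraceRecord[] {
  return [...traces].sort((a, b) => b.costUsd - a.costUsd).slice(0, topN);
}

const traces: TraceRecord[] = [
  { name: "summarize", promptTokens: 1200, completionTokens: 300, costUsd: 0.0045, latencyMs: 820 },
  { name: "classify", promptTokens: 200, completionTokens: 10, costUsd: 0.0006, latencyMs: 240 },
  { name: "rewrite", promptTokens: 5000, completionTokens: 900, costUsd: 0.0177, latencyMs: 1900 },
];

console.log(mostExpensive(traces, 2).map((t) => t.name)); // → ["rewrite", "summarize"]
```

The same per-trace data could be ranked by latencyMs instead to support the latency-monitoring use case.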

FAQ

Is breadcrumb only a storage tool, or does it help me inspect traces?

breadcrumb is described as being built to “explore your traces, not just store them,” with visibility into prompt, completion, timing, and cost per request.

Does it work with the Vercel AI SDK?

Yes. The page states it works with the Vercel AI SDK out of the box, automatically tracing generateText and streamText calls when you pass the telemetry helper.

Do I need configuration files or decorators to start tracing?

The site claims the setup avoids config files and decorators and is intended to start with “three lines of code.”

Can I deploy it on my own infrastructure?

Yes. The page describes it as self-hostable and mentions deployment options including Railway, Fly, or your own servers.

What data does a trace include?

According to the page, each trace shows the prompt that was sent, the full response that came back, how long it took, and a breakdown of token usage and cost.
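To make the token/cost breakdown concrete, a per-trace cost can be derived from token counts and per-million-token pricing. A minimal sketch follows; the prices, function name, and field names are illustrative assumptions, not values from breadcrumb or any specific model provider.

```typescript
// Illustrative per-million-token prices; real model pricing varies.
const INPUT_PRICE_PER_MTOK = 3.0; // USD per 1M prompt tokens (assumed)
const OUTPUT_PRICE_PER_MTOK = 15.0; // USD per 1M completion tokens (assumed)

// Cost for one traced call, given its token usage.
function traceCostUsd(promptTokens: number, completionTokens: number): number {
  return (
    (promptTokens / 1_000_000) * INPUT_PRICE_PER_MTOK +
    (completionTokens / 1_000_000) * OUTPUT_PRICE_PER_MTOK
  );
}

console.log(traceCostUsd(1200, 300)); // ≈ 0.0081 USD
```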

Alternatives

  • Open-source LLM observability/telemetry tools: there are other approaches to logging prompts, outputs, and token/cost data, typically used for debugging and monitoring. Differences are often in how they integrate with your framework (middleware/SDK hooks) and how the UI explores traces.
  • General APM/logging stacks (with custom LLM instrumentation): you can route LLM request/response metadata into tools like logging/metrics systems, but you may need to build more of the tracing and cost/token breakdown yourself.
  • Cloud-based tracing/analytics for AI apps: hosted platforms can reduce operational work, but they may trade off self-hosting and open-source customization depending on the provider’s model.
  • Other prompt/response inspection utilities: lightweight tools focused on capturing inputs/outputs can help debugging, though they may not provide the same per-trace token usage and cost breakdown described here.