Langfuse Flashcards

Open-source LLM observability: traces, evals, prompts, and cost tracking

🔎 What is Langfuse?

Langfuse is an open-source LLM observability & analytics platform for logging, debugging, evaluating, and optimizing AI application behavior in production.

🧱 Core Objects

Traces (end-to-end runs), Spans (steps), Generations (model calls), and Events (custom logs) capture full context.
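The nesting of these objects can be sketched with plain dataclasses. This is a simplified conceptual model, not the SDK's actual classes; all names here are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical, simplified model of the Langfuse object hierarchy --
# not the SDK's real classes, just the conceptual nesting.

@dataclass
class Generation:          # a single model call
    model: str
    prompt: str
    completion: str

@dataclass
class Span:                # one step inside a trace
    name: str
    generations: List[Generation] = field(default_factory=list)

@dataclass
class Trace:               # one end-to-end run
    name: str
    user_id: Optional[str] = None
    spans: List[Span] = field(default_factory=list)

trace = Trace(name="answer-question", user_id="user-123")
retrieval = Span(name="retrieve-docs")
llm_step = Span(name="generate-answer")
llm_step.generations.append(
    Generation(model="example-model", prompt="Summarize...", completion="...")
)
trace.spans.extend([retrieval, llm_step])
print(len(trace.spans))  # 2
```

The key idea is containment: a trace owns its spans, and a span owns the model calls made within it, so the full context of any generation is recoverable.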

🔌 SDKs & Integrations

JavaScript/TypeScript & Python SDKs; works alongside LangChain, LlamaIndex, OpenAI/Anthropic clients, and custom stacks.

๐Ÿ“ Prompt Versioning

Store prompts with variables, version them, and compare iterations to see which variants perform best.
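A minimal sketch of the versioning idea, assuming `{{variable}}`-style placeholders: each save of a named prompt bumps its version, and any version can be rendered with variables. The `PromptStore` class and its methods are hypothetical, not the hosted Langfuse prompt API.

```python
import re

# Illustrative in-memory prompt store with versioning -- not Langfuse's API.
class PromptStore:
    def __init__(self):
        self._versions = {}  # name -> list of templates (index = version - 1)

    def create(self, name: str, template: str) -> int:
        """Save a new version of a named prompt; returns the version number."""
        self._versions.setdefault(name, []).append(template)
        return len(self._versions[name])

    def render(self, name: str, version: int, **variables) -> str:
        """Fill {{placeholders}} in the given version with variables."""
        template = self._versions[name][version - 1]
        return re.sub(r"\{\{(\w+)\}\}",
                      lambda m: str(variables[m.group(1)]), template)

store = PromptStore()
store.create("greet", "Hello {{user}}!")
v2 = store.create("greet", "Hi {{user}}, welcome back!")
print(store.render("greet", v2, user="Ada"))  # Hi Ada, welcome back!
```

Keeping old versions addressable is what makes side-by-side comparison of variants possible.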

📊 Evaluations

Collect human feedback (thumbs, numeric, categorical) and run automated evals to score quality, relevance, safety, or accuracy.
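Mixed feedback types only become comparable once they share a scale. A hedged sketch of that normalization step, with made-up labels and trace IDs:

```python
from statistics import mean

# Illustrative: map thumbs / categorical / numeric feedback onto 0..1
# so scores can be averaged per trace. Labels here are assumptions.
def normalize(score) -> float:
    if score in ("up", "good"):
        return 1.0
    if score in ("down", "bad"):
        return 0.0
    return float(score)  # already numeric, e.g. 0.4

feedback = {
    "trace-1": ["up", 0.9, "good"],
    "trace-2": ["down", 0.4],
}
scores = {t: mean(normalize(s) for s in vals) for t, vals in feedback.items()}
print(scores["trace-2"])  # 0.2
```

Automated evals slot into the same shape: an LLM-as-judge or heuristic scorer just emits another numeric score per trace.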

💰 Cost & Tokens

Track latency, token counts, and cost per request, user, route, or model to keep budgets under control.
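Per-request cost is just token counts times per-model rates. A minimal sketch; the model names and prices below are placeholder assumptions, not real provider rates.

```python
# (input, output) USD per 1K tokens -- made-up placeholder prices.
PRICES_PER_1K = {
    "small-model": (0.0005, 0.0015),
    "large-model": (0.0100, 0.0300),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request from its token counts and the model's rates."""
    in_price, out_price = PRICES_PER_1K[model]
    return input_tokens / 1000 * in_price + output_tokens / 1000 * out_price

cost = request_cost("large-model", input_tokens=2000, output_tokens=500)
print(round(cost, 4))  # 0.035
```

Summing this per user, route, or model is what turns raw logs into a budget view.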

🧪 Datasets & Experiments

Create datasets of inputs/outputs, run batch tests against models/prompts, and compare results over time.
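A batch experiment boils down to: run every dataset item through each variant, score the outputs, and compare. A sketch with a stubbed "model" standing in for a real LLM call; the dataset, templates, and exact-match metric are all illustrative.

```python
# Tiny dataset of input/expected pairs -- illustrative only.
dataset = [
    {"input": "2+2", "expected": "4"},
    {"input": "3+3", "expected": "6"},
]

def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call: evaluate the arithmetic after the colon.
    expr = prompt.split(":")[-1].strip()
    return str(eval(expr))

variants = {
    "v1": "Answer briefly: {q}",
    "v2": "Compute: {q}",
}

results = {}
for name, template in variants.items():
    correct = sum(
        fake_model(template.format(q=item["input"])) == item["expected"]
        for item in dataset
    )
    results[name] = correct / len(dataset)  # exact-match accuracy per variant
print(results)
```

Storing each run's per-item results, not just the aggregate, is what lets you compare variants and track regressions over time.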

๐Ÿ—‚๏ธ Trace Explorer

Filter and drill into traces by user, tag, route, time, or error; inspect prompts, responses, and intermediate steps.

๐Ÿ›ก๏ธ Privacy & Control

Supports redaction of sensitive fields, configurable retention, and self-hosted or managed deployment options.

๐ŸŽ›๏ธ Sampling & Rate Limits

Sample high-volume traffic to reduce overhead; tag important runs for always-on logging.
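One common way to implement this is deterministic sampling: hash the trace ID into [0, 1) for a stable keep/drop decision, with a tag override for always-on logging. The function, tag name, and default rate below are assumptions for illustration.

```python
import hashlib

def should_log(trace_id: str, tags=(), sample_rate: float = 0.1) -> bool:
    """Stable sampling decision for a trace; flagged runs are always kept."""
    if "important" in tags:
        return True  # always-on logging for tagged runs
    # Hash the ID to a uniform value in [0, 1); same ID -> same decision.
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate

print(should_log("trace-abc", tags=["important"]))  # True
```

Hashing rather than random sampling means retries and distributed services agree on whether a given trace is logged.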

📤 Export & Webhooks

Export logs via API/CSV and trigger webhooks to pipe signals into data warehouses, alerting, or BI tools.

🚀 Common Use Cases

Production monitoring, A/B testing of prompts, RAG quality tracking, step-by-step agent debugging, and SLA reporting.