Langfuse Flashcards
Open-source LLM observability: traces, evals, prompts, and cost tracking
What is Langfuse?
Langfuse is an open-source LLM observability & analytics platform for logging, debugging, evaluating, and optimizing AI application behavior in production.
Core Objects
Traces (end-to-end runs), Spans (steps), Generations (model calls), and Events (custom logs) capture full context.
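The nesting of these objects can be illustrated with plain dataclasses. This is a hypothetical sketch of the data model for intuition only, not the SDK's actual classes:

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    """A custom log entry attached to a span."""
    name: str
    payload: dict = field(default_factory=dict)

@dataclass
class Generation:
    """A single model call with its prompt and completion."""
    model: str
    prompt: str
    completion: str = ""

@dataclass
class Span:
    """One step of a run; may contain generations and events."""
    name: str
    generations: list = field(default_factory=list)
    events: list = field(default_factory=list)

@dataclass
class Trace:
    """An end-to-end run composed of ordered spans."""
    name: str
    spans: list = field(default_factory=list)

# A RAG request modeled as one trace with two steps
trace = Trace(name="rag-query")
retrieve = Span(name="retrieve")
answer = Span(name="answer")
answer.generations.append(
    Generation(model="gpt-4o", prompt="Answer using the retrieved context.")
)
trace.spans.extend([retrieve, answer])
```

The full context of a request is recoverable by walking the trace top-down: trace → spans → generations/events.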
SDKs & Integrations
JavaScript/TypeScript & Python SDKs; works alongside LangChain, LlamaIndex, OpenAI/Anthropic clients, and custom stacks.
Prompt Versioning
Store prompts with variables, version them, and compare iterations to see which variants perform best.
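Langfuse prompt templates use `{{variable}}` placeholders. A minimal local stand-in for placeholder substitution (the helper below is hypothetical, not the SDK's `compile`) shows how two stored versions of the same prompt can be filled and compared:

```python
import re

def compile_prompt(template: str, **variables) -> str:
    """Substitute {{name}} placeholders with supplied values."""
    def repl(match):
        key = match.group(1).strip()
        return str(variables[key])
    return re.sub(r"\{\{(.*?)\}\}", repl, template)

# Two versions of the same prompt, rendered with the same inputs
v1 = "Summarize the text: {{text}}"
v2 = "Summarize the text in a {{tone}} tone: {{text}}"

out_v1 = compile_prompt(v1, text="Langfuse docs")
out_v2 = compile_prompt(v2, tone="friendly", text="Langfuse docs")
```

Running both variants over the same inputs, then scoring the outputs, is the basis for comparing prompt iterations.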
Evaluations
Collect human feedback (thumbs, numeric, categorical) and run automated evals to score quality, relevance, safety, or accuracy.
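An automated eval is just a function that maps an output to a score. A toy keyword-coverage scorer (a hypothetical example, not a built-in Langfuse evaluator) illustrates the shape:

```python
def keyword_relevance(answer: str, required_keywords: list) -> float:
    """Score 0.0-1.0: fraction of required keywords present in the answer."""
    if not required_keywords:
        return 0.0
    answer_lower = answer.lower()
    hits = sum(1 for kw in required_keywords if kw.lower() in answer_lower)
    return hits / len(required_keywords)

score = keyword_relevance(
    "Langfuse records traces and spans.",
    ["traces", "spans", "cost"],
)
```

Scores like this can be attached to traces alongside human feedback (thumbs, numeric, categorical) and aggregated over time.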
Cost & Tokens
Track latency, token counts, and cost per request, user, route, or model to keep budgets under control.
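Per-request cost is derived from token counts and per-model pricing. The price table below is invented for illustration; real prices vary by model and provider:

```python
# Hypothetical per-1K-token prices (USD); not real pricing.
PRICES_PER_1K = {
    "small-model": {"input": 0.0005, "output": 0.0015},
    "large-model": {"input": 0.01, "output": 0.03},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request from token counts and per-1K-token prices."""
    p = PRICES_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

cost = request_cost("large-model", input_tokens=1200, output_tokens=400)
```

Summing these per user, route, or model gives the budget breakdowns described above.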
Datasets & Experiments
Create datasets of inputs/outputs, run batch tests against models/prompts, and compare results over time.
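A batch experiment runs every dataset item through each variant and compares aggregate scores. Here the "variants" are trivial stub functions standing in for prompt/model combinations (everything below is a hypothetical sketch):

```python
# Stub variants: each maps an input to an output, standing in for
# a real prompt/model combination under test.
def variant_a(q: str) -> str:
    return q.upper()

def variant_b(q: str) -> str:
    return q.title()

dataset = [
    {"input": "what is langfuse", "expected": "What Is Langfuse"},
    {"input": "how do traces work", "expected": "How Do Traces Work"},
]

def run_experiment(variant, dataset) -> float:
    """Exact-match accuracy of a variant over the dataset."""
    correct = sum(1 for item in dataset if variant(item["input"]) == item["expected"])
    return correct / len(dataset)

results = {
    "variant_a": run_experiment(variant_a, dataset),
    "variant_b": run_experiment(variant_b, dataset),
}
```

Storing results per run makes it possible to compare variants over time as prompts and models change.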
Trace Explorer
Filter and drill into traces by user, tag, route, time, or error; inspect prompts, responses, and intermediate steps.
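Drilling into traces amounts to composing filters over trace metadata. A small in-memory version (hypothetical record shape, for illustration):

```python
traces = [
    {"user": "u1", "tags": ["prod"], "error": False, "latency_ms": 420},
    {"user": "u2", "tags": ["prod", "rag"], "error": True, "latency_ms": 1800},
    {"user": "u1", "tags": ["staging"], "error": False, "latency_ms": 95},
]

def filter_traces(traces, user=None, tag=None, error=None):
    """Apply any combination of user, tag, and error filters."""
    out = traces
    if user is not None:
        out = [t for t in out if t["user"] == user]
    if tag is not None:
        out = [t for t in out if tag in t["tags"]]
    if error is not None:
        out = [t for t in out if t["error"] == error]
    return out

failed_prod = filter_traces(traces, tag="prod", error=True)
```

Once a matching trace is found, its spans and generations hold the prompts, responses, and intermediate steps.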
Privacy & Control
Supports redaction of sensitive fields, configurable retention, and self-hosted or managed deployment options.
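Redaction means masking sensitive fields before a record is logged. A minimal sketch, assuming an invented set of sensitive field names:

```python
# Assumed sensitive field names; real deployments configure their own.
SENSITIVE_KEYS = {"email", "phone", "api_key"}

def redact(record: dict) -> dict:
    """Return a copy of the record with sensitive fields masked."""
    return {
        k: ("[REDACTED]" if k in SENSITIVE_KEYS else v)
        for k, v in record.items()
    }

clean = redact({"user": "u1", "email": "a@b.com", "prompt": "hi"})
```

Redacting client-side, before data leaves the application, is the safest pattern regardless of where the backend is hosted.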
Sampling & Rate Limits
Sample high-volume traffic to reduce overhead; tag important runs for always-on logging.
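Deterministic, hash-based sampling keeps the decision stable per trace ID while a tag check forces logging for flagged runs. A hypothetical sketch of the idea:

```python
import hashlib

def should_log(trace_id: str, sample_rate: float, tags=frozenset()) -> bool:
    """Deterministic sampling by trace ID; tagged runs are always logged."""
    if "important" in tags:  # always-on logging for flagged runs
        return True
    h = int(hashlib.sha256(trace_id.encode()).hexdigest(), 16)
    return (h % 10_000) / 10_000 < sample_rate
```

Because the hash is deterministic, retries of the same trace ID get the same sampling decision, so a run is never half-logged.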
Export & Webhooks
Export logs via API/CSV and trigger webhooks to pipe signals into data warehouses, alerting, or BI tools.
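A CSV export is just a serialization of trace records into a warehouse-friendly shape. A self-contained sketch using the standard library (the field names are assumptions for illustration):

```python
import csv
import io

def export_csv(traces: list) -> str:
    """Serialize trace records to CSV text for warehouse/BI import."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["trace_id", "user", "cost"])
    writer.writeheader()
    writer.writerows(traces)
    return buf.getvalue()

csv_text = export_csv([
    {"trace_id": "t1", "user": "u1", "cost": 0.002},
    {"trace_id": "t2", "user": "u2", "cost": 0.013},
])
```

The same records could instead be posted to a webhook endpoint to drive alerting or downstream pipelines.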
Common Use Cases
Prod monitoring, A/B testing prompts, RAG quality tracking, agent step debugging, and SLA reporting.