Monitoring

RAG in Production, Part 2: The User-Facing Half - Cost, Feedback, Errors, and Test Gates

Part 2 of 2 - RAG is easy to measure. Harder to trust the measurements. Cost compounds quietly. Users don’t explain why they stopped asking questions. Errors without a taxonomy are just noise. These are the observability layers that most RAG dashboards skip. Picking up from Part 1: it covered the architecture, span tracing, and the four pipeline sections of the Vault dashboard: Performance, Retrieval Quality, Answer Quality, and Contextual Compression. ...

RAG in Production, Part 1: Why Observability Matters Before Anything Breaks

New to RAG? If you are relatively new to Retrieval-Augmented Generation and want to build a stronger foundation before diving in, start with this introduction to RAG concepts. Part 1 of 2 - RAG is easy to ship. Harder to trust! How silent degradation, invisible hallucination, and unclassified errors led me to instrument every layer of a production RAG pipeline - before the incidents taught me why. ...