RAG in Production, Part 2: The User-Facing Half - Cost, Feedback, Errors, and Test Gates

Part 2 of 2 - RAG is easy to measure. Harder to trust the measurements. Cost compounds quietly. Users don't explain why they stopped asking questions. Errors without a taxonomy are just noise. These are the observability layers that most RAG dashboards skip. Picking up from Part 1: that post covered the architecture, span tracing, and the four pipeline sections of the Vault dashboard: Performance, Retrieval Quality, Answer Quality, and Contextual Compression. ...

May 9, 2026 · 11 min · Rajesh Kancharla
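The teaser's point that "errors without a taxonomy are just noise" can be made concrete with a minimal sketch. The category names and the `summarize` helper below are illustrative assumptions, not the post's actual taxonomy:

```python
from enum import Enum, auto
from collections import Counter

class RagError(Enum):
    """Hypothetical taxonomy of RAG failure modes (categories are illustrative)."""
    RETRIEVAL_MISS = auto()    # relevant chunk was never retrieved
    STALE_INDEX = auto()       # source changed after it was indexed
    CONTEXT_OVERFLOW = auto()  # retrieved text truncated out of the window
    HALLUCINATION = auto()     # answer unsupported by the retrieved context
    REFUSAL = auto()           # model declined despite adequate context

def summarize(events):
    """Aggregate raw error events into per-category counts for a dashboard panel."""
    return Counter(events)

events = [RagError.RETRIEVAL_MISS, RagError.HALLUCINATION, RagError.RETRIEVAL_MISS]
counts = summarize(events)
print(counts[RagError.RETRIEVAL_MISS])  # → 2
```

Once every logged failure carries one of these labels, the same stream that looked like noise becomes a countable signal a dashboard can trend over time.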

RAG vs Long Context Debate

Do we still need RAG if context windows hit 1M tokens? Long context windows have made the debate legitimate. The answer still depends on what problem is actually being solved. The problem every LLM has: LLMs are trained on a snapshot of the world. Products built on top of them - ChatGPT, Claude, Perplexity - can search the web, but that is a tool injecting retrieved content into the context window, not the model learning anything new. The model itself cannot reliably access or recall information beyond its training data, especially when that data is private, recent, or highly specific. It also knows nothing about internal documents, proprietary codebases, private wikis, or anything that was never in the training corpus to begin with. ...

April 25, 2026 · 6 min · Rajesh Kancharla
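The distinction the teaser draws - a tool injecting retrieved content into the context window, not the model learning anything new - can be sketched in a few lines. The `build_prompt` function and its template are assumptions for illustration, not the post's implementation:

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Inject retrieved text into the context window. The model reads it as
    plain input for this one request; its weights are unchanged, so nothing
    is 'learned'. Function name and template are illustrative."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What is the vacation policy?",
    ["Employees accrue 1.5 vacation days per month (HR wiki, 2026)."],
)
print(prompt)
```

Ask the same model the same question without the injected chunk and it has nothing to go on - which is why private, recent, or highly specific data still needs a retrieval step regardless of window size.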