RAG for long context LLMs
Video: http://youtube.com/watch?v=SsHUNfhF32s
This is a talk that @rlancemartin gave at a few recent meetups on RAG in the era of long-context LLMs. With context windows growing to 1M+ tokens, many have asked whether RAG is dead. We pull together threads from a few recent projects to take a stab at answering this. We review some current limitations of long-context LLMs on fact retrieval and reasoning (using a multi-needle-in-a-haystack analysis), but also discuss some likely shifts in the RAG landscape as context windows expand (approaches for document-centric indexing and RAG flow engineering).

Slides:
• https://docs.google.com/presentation/...

Highlighted references:
1/ Multi-needle analysis w/ @GregKamradt
• https://blog.langchain.dev/multi-need...
2/ RAPTOR (@parthsarthi03 et al)
• https://github.com/parthsarthi03/rapt...
• Building long context RAG with RAPTOR...
3/ Dense-X / multi-representation indexing (@tomchen0 et al)
• https://arxiv.org/pdf/2312.06648.pdf
• https://blog.langchain.dev/semi-struc...
4/ Long context embeddings (@JonSaadFalcon, @realDanFu, @simran_s_arora)
• https://hazyresearch.stanford.edu/blo...
• https://www.together.ai/blog/rag-tuto...
5/ Self-RAG (@AkariAsai et al), C-RAG (Shi-Qi Yan et al)
• https://arxiv.org/abs/2310.11511
• https://arxiv.org/abs/2401.15884
• https://blog.langchain.dev/agentic-ra...

Timepoints:
0:20 - Context windows are getting longer
2:10 - Multi-needle in a haystack
9:30 - How might RAG change?
12:00 - Query analysis
13:07 - Document-centric indexing
16:23 - Self-reflective RAG
19:40 - Summary
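For the curious, here is a minimal sketch of what a multi-needle-in-a-haystack test looks like in code. This is an illustrative setup only (the fact names, filler text, and the `insert_needles` / `recall` helpers are assumptions for the sketch, not the actual harness used in the talk): several facts ("needles") are placed at evenly spaced depths in long filler context, the context plus a question would be sent to the LLM, and recall is the fraction of needles the answer surfaces.

```python
# Minimal multi-needle-in-a-haystack sketch (illustrative, not the talk's harness).

def insert_needles(haystack: str, sentences: list[str]) -> str:
    """Place each needle sentence at an evenly spaced depth in the haystack."""
    words = haystack.split()
    step = max(1, len(words) // (len(sentences) + 1))
    for i, sentence in enumerate(sentences, start=1):
        words.insert(i * step, sentence)
    return " ".join(words)

def recall(answer: str, facts: dict[str, str]) -> float:
    """Fraction of needle facts the model's answer mentions."""
    return sum(value in answer for value in facts.values()) / len(facts)

# Hypothetical needle facts embedded as sentences in ~5000 words of filler.
facts = {"code": "4417", "city": "Lisbon", "fruit": "mango"}
sentences = [f"Remember: the {k} is {v}." for k, v in facts.items()]
context = insert_needles("filler " * 5000, sentences)

# In practice, `context` plus a question goes to the LLM; here we score a
# mock answer that retrieves two of the three needles.
print(recall("The code is 4417 and the city is Lisbon.", facts))  # → 0.666...
```

Varying the number of needles and their depths is what exposes the fact-retrieval limitations discussed in the talk: recall tends to degrade as more needles are placed and as they sit deeper in the context.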