RAPTOR
RAPTOR is an advanced information retrieval and question-answering system that combines hierarchical document summarization, embedding-based retrieval, and contextual answer generation. It aims to efficiently handle large document collections by creating a multi-level tree of summaries, allowing for both broad and detailed information retrieval.
Traditional retrieval systems often struggle with large document sets, either missing important details or getting overwhelmed by irrelevant information. RAPTOR addresses this by creating a hierarchical structure of the document collection, allowing it to navigate between high-level concepts and specific details as needed.
What you'll learn
- 1Tree Building: Creates a hierarchical structure of document summaries.
- 2Embedding and Clustering: Organizes documents and summaries based on semantic similarity.
- 3Vectorstore: Efficiently stores and retrieves document and summary embeddings.
- 4Contextual Retriever: Selects the most relevant information for a given query.
- 5Answer Generation: Produces coherent responses based on retrieved information.
About this tutorial
This hands-on Jupyter notebook is part of RAG Techniques, a free open-source repository by Nir Diamant covering rag techniques with runnable code examples and detailed explanations.
RAG Made Simple
The book that extends this repo: 22 RAG techniques with the intuition behind each, side-by-side comparisons of when each wins (and quietly fails), and original illustrations.
Get it on Amazon⭐ 4.4 stars · 1,500+ readers · Kindle $9.99 · Paperback $24.99 · Free with Kindle Unlimited
