
    Hierarchical Indices

    This code implements a Hierarchical Indexing system for document retrieval, utilizing two levels of encoding: document-level summaries and detailed chunks. This approach aims to improve the efficiency and relevance of information retrieval by first identifying relevant document sections through summaries, then drilling down to specific details within those sections.

    Traditional flat indexing can struggle with large documents or corpora: it may miss context or return irrelevant passages. Hierarchical indexing addresses this with a two-tier search. Summaries narrow the search to the relevant sections, and only the chunks in those sections are searched for details, which makes retrieval more efficient and context-aware.
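The two-tier search described above can be sketched in a few lines. This is a minimal illustration, not the notebook's implementation: it swaps in a toy bag-of-words embedding and cosine similarity in place of OpenAI embeddings and FAISS so that it runs without external services. The names `embed`, `top_k`, `retrieve_hierarchical`, and the `section` metadata key are assumptions made for this example.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, items, k):
    """Return the k (text, metadata) items most similar to the query."""
    q = embed(query)
    return sorted(items, key=lambda it: cosine(q, embed(it[0])), reverse=True)[:k]


def retrieve_hierarchical(query, summaries, chunks, k_summaries=1, k_chunks=2):
    # Tier 1: find the most relevant sections by searching the summaries.
    hits = top_k(query, summaries, k_summaries)
    sections = {meta["section"] for _, meta in hits}
    # Tier 2: search only the detailed chunks that belong to those sections.
    candidates = [c for c in chunks if c[1]["section"] in sections]
    return top_k(query, candidates, k_chunks)


# Hypothetical toy corpus: one summary per section, several chunks per section.
summaries = [
    ("overview of greenhouse gases and global warming causes", {"section": 0}),
    ("renewable energy solar wind policy solutions", {"section": 1}),
]
chunks = [
    ("carbon dioxide traps heat in the atmosphere", {"section": 0}),
    ("methane is a potent greenhouse gas", {"section": 0}),
    ("solar panels convert sunlight into electricity", {"section": 1}),
    ("wind turbines generate power from moving air", {"section": 1}),
]

results = retrieve_hierarchical(
    "which greenhouse gases cause warming", summaries, chunks
)
for text, meta in results:
    print(meta["section"], text)
```

Because the second search only sees chunks from sections selected in the first, the renewable-energy chunks are never scored for this query. With a real vector store, the same effect is usually achieved with a metadata filter on the chunk index.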

    What you'll learn

    1. PDF processing and text chunking
    2. Asynchronous document summarization using OpenAI's GPT-4
    3. Vector store creation for both summaries and detailed chunks using FAISS and OpenAI embeddings
    4. Custom hierarchical retrieval function
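The indexing side of these steps can be sketched as follows. This is a hedged outline under stated assumptions rather than the notebook's code: `summarize_page` is a hypothetical stub that stands in for an asynchronous GPT-4 call, the pages are plain strings instead of parsed PDF pages, and the records are plain dictionaries rather than entries in a FAISS vector store. The `page`, `chunk_id`, and `summary` keys are illustrative names.

```python
import asyncio


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    return [
        text[i : i + chunk_size]
        for i in range(0, max(len(text) - overlap, 1), step)
    ]


async def summarize_page(page_text):
    """Stub summarizer (assumption): returns the first sentence of the page.

    In a real pipeline this would await an LLM API call instead.
    """
    await asyncio.sleep(0)  # yield control, as a network call would
    return page_text.split(".")[0].strip() + "."


async def build_two_level_index(pages):
    # Level 1: summarize all pages concurrently, one summary per page.
    summary_texts = await asyncio.gather(*(summarize_page(p) for p in pages))
    summaries = [
        {"text": s, "page": i, "summary": True}
        for i, s in enumerate(summary_texts)
    ]
    # Level 2: split each page into chunks tagged with the page they came from,
    # so a summary hit can later be mapped back to its detailed chunks.
    chunks = [
        {"text": c, "page": i, "chunk_id": j}
        for i, p in enumerate(pages)
        for j, c in enumerate(chunk_text(p))
    ]
    return summaries, chunks


# Hypothetical two-page document.
pages = ["Greenhouse gases trap heat. " * 10, "Solar power is renewable. " * 10]
summaries, chunks = asyncio.run(build_two_level_index(pages))
print(len(summaries), "summaries,", len(chunks), "chunks")
```

Inside a Jupyter notebook an event loop is already running, so `asyncio.run` raises an error there; use `summaries, chunks = await build_two_level_index(pages)` in a cell instead. The shared `page` key is the design choice that links the two levels: each list would be embedded into its own vector store, and retrieval would filter the chunk store by the pages returned from the summary store.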

    About this tutorial

    This hands-on Jupyter notebook is part of RAG Techniques, a free, open-source repository by Nir Diamant that covers RAG techniques with runnable code examples and detailed explanations.

    Free and open-source · Runnable Jupyter notebook · Active community support
