HyPE (Hypothetical Prompt Embedding)
This code implements a Retrieval-Augmented Generation (RAG) system enhanced by Hypothetical Prompt Embeddings (HyPE). Unlike traditional RAG pipelines that struggle with query-document style mismatch, HyPE precomputes hypothetical questions during the indexing phase. This transforms retrieval into a question-question matching problem, eliminating the need for expensive runtime query expansion techniques.
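The question-question matching idea can be sketched in a few lines. This is a minimal illustration, not the notebook's implementation. The bag-of-words `embed` function is a toy stand-in for a real embedding model, the `questions_by_chunk` table stands in for LLM-generated questions, and the names `build_hype_index` and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_hype_index(chunks, generate_questions):
    """Indexing phase: embed each hypothetical question, not the chunk itself.

    Every entry pairs a question embedding with the chunk it was generated from.
    """
    index = []
    for chunk in chunks:
        for question in generate_questions(chunk):
            index.append((embed(question), chunk))
    return index


def retrieve(index, query, k=1):
    """Query phase: match the user question against the stored questions."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[0]), reverse=True)
    results = []
    for _, chunk in ranked:
        if chunk not in results:  # several questions can point to the same chunk
            results.append(chunk)
        if len(results) == k:
            break
    return results


# Hypothetical demo data; in a real pipeline an LLM would write these questions.
chunks = [
    "FAISS builds an index over dense vectors for fast similarity search.",
    "Chunk overlap keeps sentences that straddle a boundary intact.",
]
questions_by_chunk = {
    chunks[0]: ["What is FAISS used for?", "How does FAISS search vectors?"],
    chunks[1]: ["Why use overlap between chunks?", "What does chunk overlap do?"],
}
index = build_hype_index(chunks, questions_by_chunk.__getitem__)
top = retrieve(index, "What does chunk overlap accomplish?")
```

Here `top` holds the chunk about overlap, because the query is closest to one of that chunk's stored questions. No query rewriting or expansion happens at query time. The only runtime work is one embedding and one similarity search.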
What you'll learn
1. PDF processing and text extraction
2. Text chunking to maintain coherent information units
3. Hypothetical prompt embedding generation, using an LLM to create multiple proxy questions per chunk
4. Vector store creation using FAISS and OpenAI embeddings
5. Retriever setup for querying the processed documents
6. Evaluation of the RAG system
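As an illustration of the chunking step, here is a minimal sketch of fixed-size chunking with overlap. The function name `chunk_text` and the default `chunk_size` and `overlap` values are assumptions chosen for the example, not the notebook's settings, and a real pipeline would more likely use a library text splitter.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that share `overlap` characters.

    The overlap gives a sentence that straddles a boundary a chance to appear
    whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Tiny values make the overlap easy to see.
demo_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

With these tiny values, each chunk repeats the last two characters of the previous one. In a HyPE pipeline each resulting chunk would then be passed to the LLM to generate its proxy questions.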
About this tutorial
This hands-on Jupyter notebook is part of RAG Techniques, a free, open-source repository by Nir Diamant that covers RAG techniques with runnable code examples and detailed explanations.
