RAG - Retrieval Augmented Generation

Last updated 1 year ago

What is RAG?

Retrieval Augmented Generation (RAG) is a natural language processing (NLP) technique that combines two fundamental NLP tasks: information retrieval and text generation. It enhances the generation process by retrieving information from external sources and incorporating it into the model's context. The goal of RAG is to produce more accurate and contextually relevant responses in text generation tasks.

In traditional text generation models like GPT-3, the model generates text based on patterns learned from a large corpus of data, but it may not always have access to specific, up-to-date, or contextually relevant information. Retrieval Augmented Generation addresses this limitation by introducing an information retrieval component.

How does RAG work?

Retrieval: The model performs a retrieval step to gather relevant information from external sources. These sources could include a database, a knowledge base, a set of documents, or even search engine results. The retrieval process aims to find snippets or passages of text that contain information related to the given input or prompt.
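As a toy illustration of the retrieval step (not the GenAI Stack implementation), documents can be ranked by how much they overlap with the query. Production systems typically use embedding similarity against a vector store instead; the keyword-overlap scoring below is a deliberately simplified stand-in:

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase and split text into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by term overlap with the query and return the top_k.

    A stand-in for embedding similarity search against a vector store.
    """
    query_terms = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_terms & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

docs = [
    "Qdrant is an open source vector database.",
    "RAG combines retrieval with text generation.",
    "LangChain provides chains for LLM applications.",
]
print(retrieve("What is retrieval augmented generation?", docs))
```

Swapping `tokenize`/overlap scoring for an embedding model and a nearest-neighbor index turns this sketch into the dense retrieval used in practice.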

Augmentation: The retrieved information is then combined with the original input or prompt, enriching the context available to the model for generating the output. By incorporating external knowledge, the model can produce more informed and accurate responses.

Generation: Finally, the model generates the response, taking into account the retrieved information and the original input. The presence of this additional context helps the model produce more contextually appropriate and relevant outputs.
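The three steps above can be sketched end to end. The prompt template, the `retriever` callable, and the `llm` callable below are illustrative placeholders (any retriever and any chat-completion client could fill those roles), not part of the GenAI Stack API:

```python
def augment(query: str, retrieved: list[str]) -> str:
    """Augmentation: combine retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {passage}" for passage in retrieved)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )

def rag_answer(query: str, retriever, llm) -> str:
    """Run the RAG pipeline: retrieval -> augmentation -> generation."""
    retrieved = retriever(query)        # Retrieval step
    prompt = augment(query, retrieved)  # Augmentation step
    return llm(prompt)                  # Generation step

# Stubs so the sketch runs without a model; a real system would plug in a
# vector-store retriever and an actual LLM call here.
fixed_retriever = lambda q: ["RAG combines retrieval with generation."]
echo_llm = lambda prompt: f"(model output for a {len(prompt)}-char prompt)"
print(rag_answer("What is RAG?", fixed_retriever, echo_llm))
```

Because each stage is a pluggable callable, the same skeleton accommodates different vector stores, prompt templates, and language models.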

RAG can be beneficial in various NLP tasks, such as Question-Answering, Dialogue generation, Summarization, and more. By incorporating external knowledge, RAG models have the potential to provide more accurate and informative responses compared to traditional generation models that rely solely on the data they were trained on.

Check out the full article on our Medium page:

https://medium.aiplanet.com/retrieval-augmented-generation-using-qdrant-huggingface-embeddings-and-langchain-and-evaluate-the-3c7e3b1e4976