
How to use ChatGPT with your own data (Pt 1): Retrieval Augmented Generation

07/19/2023 · 2 min read

This is part 1 of a 3-part series on how to use ChatGPT with your own data.

Want ChatGPT to answer questions about your website and proprietary data?

This article will give you an overview of the architecture that allows ChatGPT to do so.

ChatGPT limitations

Most people using ChatGPT or other LLMs like Claude 2 do not understand their limitations.

  • LLMs have a short-term memory problem.
    The longest context window of the GPT-4 API is roughly 3,000 words. The recently released Claude 2 triples that to about 10,000 words, but performance suffers toward the end of the window (a short sketch after this list shows how to measure how much of that window a piece of text uses).
  • LLMs are stuck in the past.
    Here is a simplified version of how current-generation LLMs are trained:
    • Companies prepare a large corpus of data
    • They spend weeks or months training the LLM on that data
    This is why ChatGPT has a knowledge cutoff of September 2021.
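
Context windows are actually measured in tokens rather than words, so it helps to count tokens before sending text to the model. Below is a minimal sketch using OpenAI's tiktoken library; the model name and sample text are illustrative assumptions.

```python
# Minimal sketch: counting tokens to see how much of a model's context window
# a piece of text would consume. Model name and sample text are assumptions.
import tiktoken

def count_tokens(text: str, model: str = "gpt-4") -> int:
    """Return how many tokens `text` uses under the given model's tokenizer."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

document = "Paste your proprietary document text here..."
print(count_tokens(document), "tokens")  # compare against the model's context limit
```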

Retrieval Augmented Generation


Retrieval Augmented Generation, or RAG for short, is the answer.

RAG allows the LLM to retrieve relevant information based on the user's query. The extra information augments the LLM's context.

RAG solves both of the problems discussed above. Let's see how it works at a high level.



Stage 1: Indexing pipeline for ingesting documents


This stage involves the following steps (a code sketch follows this list):

  • Prepare the data by splitting it into chunks, for example splitting a PDF document into individual pages
  • Turn each chunk into a format the LLM can work with: embeddings. We can use the OpenAI embedding API for this
  • Store the embeddings in a vector database for future lookup. There are multiple options for vector databases
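
Here is a minimal sketch of the indexing pipeline, assuming the OpenAI Python SDK (v0.x style) for embeddings and Chroma as the vector database; any other vector store would work just as well, and the page texts and collection name are placeholders.

```python
# Minimal sketch of Stage 1: chunk documents, embed each chunk, and store the
# vectors. Assumes the OpenAI embedding API and Chroma as the vector database.
import openai
import chromadb

openai.api_key = "YOUR_API_KEY"  # in practice, load this from an environment variable

def embed(texts: list[str]) -> list[list[float]]:
    """Turn text chunks into embedding vectors with OpenAI's embedding model."""
    response = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return [item["embedding"] for item in response["data"]]

# 1. Prepare the data by splitting it into chunks (here: one chunk per page).
pages = ["Text of page 1...", "Text of page 2...", "Text of page 3..."]

# 2. Turn each chunk into an embedding vector.
vectors = embed(pages)

# 3. Store the embeddings (and the original text) in a vector database for lookup.
db = chromadb.Client()
collection = db.create_collection("my_documents")
collection.add(
    ids=[f"page-{i}" for i in range(len(pages))],
    embeddings=vectors,
    documents=pages,
)
```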

Stage 2: Search for relevant documents


When a user enters a query, we follow this process (sketched in code below):

  • Use the same embedding model to turn the query into a vector
  • Query the database for similar documents.
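
Here is a minimal sketch of the search step. It reuses the embed helper and the Chroma collection from the indexing sketch above, and the example question is a placeholder.

```python
# Minimal sketch of Stage 2: embed the user's query with the same model and
# look up the most similar chunks. Reuses `embed` and `collection` from above.
query = "What is your refund policy?"  # placeholder user question

# 1. Turn the query into a vector using the same embedding model.
query_vector = embed([query])[0]

# 2. Ask the vector database for the most similar stored chunks.
results = collection.query(query_embeddings=[query_vector], n_results=3)
relevant_chunks = results["documents"][0]  # top-3 most similar chunks of text
```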

Stage 3: Generate response with new context

The results found in the previous stage become the context from which the LLM generates its response.
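
Here is a minimal sketch of the generation step, continuing from the search sketch above; the prompt wording and model choice are illustrative assumptions.

```python
# Minimal sketch of Stage 3: pass the retrieved chunks to the chat model as
# extra context. Prompt wording and model choice are assumptions.
context = "\n\n".join(relevant_chunks)

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Answer the question using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ],
)
print(completion["choices"][0]["message"]["content"])
```

Instructing the model to answer only from the provided context is what keeps the response grounded in your own data rather than the model's training data.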

There you have it: now you understand how RAG works at a high level. In the next part, I'll show a few easy ways to get started with RAG.

To learn more about RAG, check out these resources


About Trung Vu

Trung Vu, a former software engineer, founded Hoss in 2019 to enhance developer experiences, swiftly attracting Silicon Valley backers and a $1.6 million seed round. In 2021, his venture was acquired by Niantic Labs, of Pokemon Go fame, to bolster their Lightship platform.

Post-acquisition, Trung leads engineering teams at Niantic and invests in promising AI startups. An AI enthusiast even before ChatGPT's rise, he equates its potential to electricity. Through AI Growth Pad, his education platform, Trung teaches entrepreneurs to leverage AI for growth, embodying his commitment to ethical, transformative technology.
