This is part 1 of a 3-part series on how to use ChatGPT with your own data.
Want ChatGPT to answer questions about your website and proprietary data?
This article gives you an overview of the architecture that allows ChatGPT to do so.
ChatGPT limitations
Most people using ChatGPT or other LLMs like Claude 2 do not understand their limitations.
- LLMs have a short-term memory problem.
The standard context window of the GPT-4 API is 8,192 tokens (roughly 6,000 words). The recently released Claude 2 extends that to 100,000 tokens, but performance suffers toward the end of long contexts.
- LLMs are stuck in the past.
Here is a simplified process of how current-generation LLMs are trained:
- Companies prepare a large corpus of data
- They spend weeks or months training the LLM
- This is the reason why ChatGPT has a knowledge cutoff of September 2021
Retrieval Augmented Generation
Retrieval Augmented Generation, or RAG for short, is the answer.
RAG allows an LLM to retrieve relevant information based on the user's query. The extra information augments the LLM's context.
RAG solves both of the problems described above. Let's see how it works at a high level.

Stage 1: Indexing pipeline for ingesting documents
This stage involves:
- Preparing the data by splitting it into chunks. For example, splitting a PDF document into individual pages.
- Turning each chunk into a format the LLM can understand: embeddings. We can use the OpenAI embeddings API for this. You can learn more about it here.
- Storing the output of the model in a vector database for future lookup. There are multiple options for vector databases, such as Pinecone, Weaviate, and Chroma.
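The indexing steps above can be sketched in a few lines of Python. Note the hedges: `embed` here is a toy hash-based stand-in for a real embedding model (in production you would call something like the OpenAI embeddings API), and the in-memory `index` list stands in for a vector database, so the sketch runs offline:

```python
import hashlib
import math

def embed(text: str, dim: int = 16) -> list[float]:
    # Toy stand-in for a real embedding model: hashes each word into a
    # fixed-size bucket vector, then normalizes to unit length.
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[int(hashlib.md5(word.encode()).hexdigest(), 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(document: str, max_words: int = 50) -> list[str]:
    # Naive chunking: split the document into fixed-size word windows.
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

# "Vector database" stand-in: an in-memory list of (chunk_text, embedding) pairs.
index: list[tuple[str, list[float]]] = []

def ingest(document: str) -> None:
    # Stage 1: chunk the document, embed each chunk, store it for lookup.
    for c in chunk(document):
        index.append((c, embed(c)))
```

A real pipeline swaps `embed` for an API call and `index` for a vector-database client, but the shape of the code stays the same.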
Stage 2: Search for relevant documents
When a user enters a query, we follow this process:
- Use the same embedding model to turn the query into a vector.
- Query the database for similar documents.
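The search stage might look like the sketch below. The hash-based `embed` is again a toy stand-in for the real embedding model; the one essential detail it illustrates is that the query must be embedded with the same model used at indexing time, and similarity is measured with cosine similarity:

```python
import hashlib
import math

def embed(text: str, dim: int = 16) -> list[float]:
    # Toy stand-in for the embedding model used at indexing time --
    # queries MUST go through the same model as the stored documents.
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[int(hashlib.md5(word.encode()).hexdigest(), 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

def search(query: str, index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the query vector, return the top k.
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

A vector database performs this same nearest-neighbor ranking, just with approximate algorithms that scale to millions of vectors.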
Stage 3: Generate response with new context
The results found in the previous stage become the context from which the LLM generates its response.
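Concretely, "becoming the context" usually means injecting the retrieved chunks into the prompt. A minimal sketch, assuming the retrieved chunks are plain strings (the commented-out API call is illustrative, not a specific endorsement of one client library):

```python
def build_prompt(query: str, context_chunks: list[str]) -> str:
    # Inject the retrieved chunks into the prompt so the model answers
    # from the supplied context rather than its (stale) training data.
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

# The assembled prompt is then sent to the chat model, e.g.:
# response = client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user", "content": build_prompt(query, chunks)}],
# )
```

The instruction to answer "only from the context" is what grounds the model in your data and reduces hallucinated answers.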
There you have it: now you understand how RAG works at a high level. In the next part, I'll show a few easy ways to get started with RAG.
To learn more about RAG, check out these resources:
About Trung Vu
Trung Vu, a former software engineer, founded Hoss in 2019 to enhance developer experiences, swiftly attracting Silicon Valley backers and a $1.6 million seed round. In 2021, his venture was acquired by Niantic Labs, of Pokemon Go fame, to bolster their Lightship platform.
Post-acquisition, Trung leads engineering teams at Niantic and invests in promising AI startups. An AI enthusiast even before ChatGPT's rise, he equates its potential to electricity. Through AI Growth Pad, his education platform, Trung teaches entrepreneurs to leverage AI for growth, embodying his commitment to ethical, transformative technology.