| title | RAG (Retrieval-Augmented Generation) with the Agent component |
|---|---|
| sidebar_label | RAG |
| sidebar_position | 700 |
| description | Examples of how to use RAG with the Convex Agent component |
The Agent component has built-in capabilities to search message history with hybrid text & vector search. You can also use the RAG component to search other data for context.
Retrieval-Augmented Generation (RAG) is a technique that allows an LLM to search through custom knowledge bases to answer questions.
RAG combines the power of Large Language Models (LLMs) with knowledge retrieval. Instead of relying solely on the model's training data, RAG allows your AI to:
- Search through custom documents and knowledge bases
- Retrieve relevant context for answering questions
- Provide more accurate, up-to-date, and domain-specific responses
- Cite sources and explain what information was used
The RAG component is a Convex component that lets you add data and then search it. It splits the data into chunks and generates embeddings to use for vector search. See the RAG component docs for details, but here are some key features:
- Namespaces: Use namespaces for user-specific or team-specific data to isolate search domains.
- Add Content: Add or replace text content by key.
- Semantic Search: Vector-based search using configurable embedding models.
- Custom Filtering: Define filters on each document for efficient vector search.
- Chunk Context: Get surrounding chunks for better context.
- Importance Weighting: Weight content by providing a 0 to 1 "importance" to affect per-document vector search results.
- Chunking flexibility: Bring your own document chunking, or use the default.
- Graceful Migrations: Migrate content or whole namespaces without disruption.
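
To make the features above concrete, here is a minimal in-memory sketch of namespaced chunk storage, replace-by-key, and importance-weighted vector search. It is a toy illustration, not the RAG component's implementation: `ToyIndex`, `toyEmbed`, and `chunkByParagraph` are hypothetical names, the "embedding" is a hashed bag of words rather than a real model, and multiplying the similarity by `importance` is an assumed weighting scheme.

```ts
// Toy illustration only; none of these names come from the RAG component.
type Chunk = { namespace: string; key: string; text: string; importance: number; vector: number[] };

const DIMS = 64;

// Stand-in for an embedding model: hashes each word into a fixed-size count vector.
function toyEmbed(text: string): number[] {
  const vector: number[] = [];
  for (let i = 0; i < DIMS; i++) vector.push(0);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  for (const word of words) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) hash = (hash * 31 + word.charCodeAt(i)) % DIMS;
    vector[hash] += 1;
  }
  return vector;
}

// Cosine similarity; returns 0 when either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Simplest possible chunker: one chunk per blank-line-separated paragraph.
function chunkByParagraph(text: string): string[] {
  return text.split(/\n\s*\n/).map((p) => p.trim()).filter((p) => p.length > 0);
}

class ToyIndex {
  private chunks: Chunk[] = [];

  // Adding under an existing key replaces that key's chunks within the namespace.
  add(namespace: string, key: string, text: string, importance = 1): void {
    this.chunks = this.chunks.filter((c) => !(c.namespace === namespace && c.key === key));
    for (const piece of chunkByParagraph(text)) {
      this.chunks.push({ namespace, key, text: piece, importance, vector: toyEmbed(piece) });
    }
  }

  // Only chunks in the requested namespace are scored (assumed weighting: similarity * importance).
  search(namespace: string, query: string, limit = 3): { key: string; text: string; score: number }[] {
    const queryVector = toyEmbed(query);
    return this.chunks
      .filter((c) => c.namespace === namespace)
      .map((c) => ({ key: c.key, text: c.text, score: cosine(queryVector, c.vector) * c.importance }))
      .sort((a, b) => b.score - a.score)
      .slice(0, limit);
  }
}
```

Searching one namespace never returns chunks stored under another, which is the isolation property namespaces are meant to provide for per-user or per-team data.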
This directory contains two different approaches to implementing RAG:
Basic RAG: a straightforward implementation where the system automatically searches for relevant context for each user query.
- The message history will only include the original user prompt and the response, not the context.
- Looks up the context and injects it into the user's prompt.
- Works well if you know the user's question will always benefit from extra context.
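
The injection step can be sketched as two small pure functions: one joins search hits into a context string, the other wraps that context and the question into a single prompt. The `Hit` type, the `minScore` cutoff, and the fallback to the bare question when nothing is found are assumptions made for illustration; the RAG component's actual search result has its own shape.

```ts
// Hypothetical shape of a search hit, for illustration only.
type Hit = { text: string; score: number };

// Join the hits that meet the minimum score into one context string.
function buildContext(hits: Hit[], minScore = 0): string {
  return hits
    .filter((h) => h.score >= minScore)
    .map((h) => h.text.trim())
    .join("\n\n---\n\n");
}

// Wrap the retrieved context and the user's question in a single prompt.
// If no context was found, fall back to the bare question.
function buildPrompt(contextText: string, userPrompt: string): string {
  if (contextText.length === 0) return userPrompt;
  return `# Context:\n\n${contextText}\n\n---\n\n# Question:\n\n"""${userPrompt}\n"""`;
}
```

Keeping prompt construction in a pure function like this makes it easy to unit test the exact text the LLM will see, separately from the search call.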
See ragAsPrompt.ts for the full code. The simplest version is:

```ts
// Continue the existing thread.
const { thread } = await agent.continueThread(ctx, { threadId });
// Look up the content most relevant to the user's prompt.
const context = await rag.search(ctx, {
  namespace: "global",
  query: userPrompt,
  limit: 10,
});
// Inject the retrieved text ahead of the question.
const result = await thread.generateText({
  prompt: `# Context:\n\n ${context.text}\n\n---\n\n# Question:\n\n"""${userPrompt}\n"""`,
});
```

Tool-based RAG: the LLM decides for itself when to search for context or add new information, using a search tool that you provide.
- The message history will include the original user prompt along with the earlier messages in the thread.
- After a tool call and response, the message history will include the tool call and response for the LLM to reference.
- The LLM can decide when to search for context or add new information.
- This works well if you want the Agent to be able to dynamically search.
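
The way tool calls and their results end up in the message history can be sketched with a toy simulation. Everything here is hypothetical and not part of the Agent API: `fakeModel` stands in for the LLM (it searches only when the question mentions "policy"), and `runTurn` is a simplified loop that records each tool call and result before asking the model again.

```ts
// Toy simulation only; these types and functions are not part of the Agent API.
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string }
  | { role: "tool-call"; toolName: string; args: { query: string } }
  | { role: "tool-result"; toolName: string; content: string };

type ModelStep =
  | { type: "tool-call"; toolName: string; args: { query: string } }
  | { type: "text"; content: string };

// Stand-in for the LLM: answers from a tool result if one is present,
// otherwise decides whether the question needs a search.
function fakeModel(history: Message[]): ModelStep {
  let question = "";
  for (const m of history) {
    if (m.role === "tool-result") return { type: "text", content: `Based on the docs: ${m.content}` };
    if (m.role === "user") question = m.content;
  }
  if (question.toLowerCase().indexOf("policy") !== -1) {
    return { type: "tool-call", toolName: "searchContext", args: { query: question } };
  }
  return { type: "text", content: "No lookup needed." };
}

// Run one user turn: keep asking the model until it produces text,
// appending every tool call and tool result to the history along the way.
function runTurn(history: Message[], searchTool: (query: string) => string, maxSteps = 3): Message[] {
  const messages = history.slice();
  for (let step = 0; step < maxSteps; step++) {
    const next = fakeModel(messages);
    if (next.type === "text") {
      messages.push({ role: "assistant", content: next.content });
      break;
    }
    messages.push({ role: "tool-call", toolName: next.toolName, args: next.args });
    messages.push({ role: "tool-result", toolName: next.toolName, content: searchTool(next.args.query) });
  }
  return messages;
}
```

The `maxSteps` cap reflects the trade-off this approach carries: because the model chooses when to search, the surrounding code needs a bound on how many tool calls a single turn may make.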
See ragAsTools.ts for the code. The simplest version is:

```ts
searchContext: createTool({
  description: "Search for context related to this user prompt",
  args: z.object({ query: z.string().describe("Describe the context you're looking for") }),
  handler: async (ctx, { query }) => {
    const context = await rag.search(ctx, { namespace: userId, query });
    return context.text;
  },
}),
```

| Feature | Basic RAG | Tool-based RAG |
|---|---|---|
| Context Search | Always searches | AI decides when to search |
| Adding Context | Manual via separate function | AI can add context during conversation |
| Flexibility | Simple, predictable | Intelligent, adaptive |
| Use Case | FAQ systems, document search | Dynamic knowledge management |
| Predictability | Defined by code | AI may query too much or little |
To see these examples in action, check out the demo UI which provides:
- Context management UI
- Browse the chunks of documents in the RAG component
- Interactive chat interface
- Search result visualization
- Real-time streaming responses
Run the example with:

```sh
git clone https://github.com/get-convex/agent.git
cd agent
npm run setup
npm run example
```