-First, let's download the dataset for our lab. We'll use four RAG-focused blogs from our Developer Center as the source data for our RAG application.
+First, let's download the dataset for our lab. We'll use a subset of articles from the MongoDB Developer Center as the source data for our RAG application.

-Run all the cells under the **Step 3: Load the dataset** section in the notebook to load the blog content as LangChain Document objects.
+Run all the cells under the **Step 3: Load the dataset** section in the notebook to load the articles as a list of Python objects consisting of the content and relevant metadata.
docs/50-prepare-the-data/3-chunk-data.mdx
+31 -4 (31 additions & 4 deletions)
@@ -2,7 +2,7 @@
Since we are working with large documents, we first need to break them up into smaller chunks before embedding and storing them in MongoDB.

-Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 4: Chunk up the data** section in the notebook to chunk up the documents we loaded.
+Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 4: Chunk up the data** section in the notebook to chunk up the articles we loaded.
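
As an illustration of the chunking step this hunk describes (not the notebook's answer for any `<CODE_BLOCK_N>` placeholder), here is a minimal sketch of fixed-size chunking with overlap. The helper names `chunk_text` and `chunk_article`, the character-based sizes, and the `body` field name are assumptions made for the example; the lab itself may split on tokens or use a library splitter.

```python
from typing import Dict, List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def chunk_article(article: Dict, text_field: str = "body", **kwargs) -> List[Dict]:
    """Return one dict per chunk, copying the article's metadata onto each.

    `text_field` is a hypothetical name for the key that holds the content.
    """
    metadata = {k: v for k, v in article.items() if k != text_field}
    return [
        {**metadata, text_field: piece}
        for piece in chunk_text(article[text_field], **kwargs)
    ]
```

Copying the metadata onto every chunk keeps each chunk self-describing, which is convenient once the chunks are stored as separate documents.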
The answers for code blocks in this section are as follows:
@@ -13,7 +13,7 @@ The answers for code blocks in this section are as follows:
docs/50-prepare-the-data/4-embed-data.mdx
+5 -5 (5 additions & 5 deletions)
@@ -2,11 +2,11 @@
To perform vector search on our data, we need to embed it (i.e. generate embedding vectors) before ingesting it into MongoDB.

-Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 5: Generate embeddings** section in the notebook to generate embeddings for the chunked documents.
+Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 5: Generate embeddings** section in the notebook to embed the chunked articles.
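
To illustrate the shape of this step (again, not the notebook's answer), the sketch below attaches an embedding vector to each chunk. The function `embed_chunks`, the `body` and `embedding` field names, and the `embed_fn` parameter are assumptions; `embed_fn` stands in for whichever embedding model the lab uses. `toy_embed` exists only so the example runs without a model and does not produce meaningful vectors.

```python
from typing import Callable, Dict, List, Sequence


def embed_chunks(
    chunks: List[Dict],
    embed_fn: Callable[[str], Sequence[float]],
    text_field: str = "body",
    vector_field: str = "embedding",
) -> List[Dict]:
    """Return copies of the chunks with an embedding vector added to each.

    The vector is stored as a plain list of floats so the resulting dicts
    can be serialized and inserted into a database as-is.
    """
    embedded: List[Dict] = []
    for chunk in chunks:
        doc = dict(chunk)  # shallow copy; the input chunk is left untouched
        doc[vector_field] = [float(x) for x in embed_fn(doc[text_field])]
        embedded.append(doc)
    return embedded


def toy_embed(text: str) -> List[float]:
    """Placeholder, NOT a real embedding: [character count, word count]."""
    return [float(len(text)), float(len(text.split()))]
```

In practice `embed_fn` would wrap a call to a real embedding model; keeping it as a parameter means the rest of the pipeline does not depend on which model that is.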
The answers for code blocks in this section are as follows:
docs/50-prepare-the-data/5-ingest-data.mdx
+5 -5 (5 additions & 5 deletions)
@@ -2,13 +2,13 @@ import Screenshot from "@site/src/components/Screenshot";
# 👐 Ingest data into MongoDB

-The final step to build a MongoDB vector store for our RAG application is to ingest the embedded documents into MongoDB.
+The final step to build a MongoDB vector store for our RAG application is to ingest the embedded article chunks into MongoDB.

Fill in any `<CODE_BLOCK_N>` placeholders and run the cells under the **Step 6: Ingest data into MongoDB** section in the notebook to ingest the embedded documents into MongoDB.
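
For orientation only (this is not the notebook's answer), the ingestion step could be sketched as a bulk insert. The `ingest` helper and its `reset` flag are assumptions for the example; it relies only on the `delete_many` and `insert_many` methods that a pymongo `Collection` provides. Clearing the collection first is one possible choice to keep reruns from duplicating data, and the lab may handle that differently.

```python
from typing import Any, Dict, List


def ingest(collection: Any, docs: List[Dict], reset: bool = True) -> int:
    """Bulk-insert docs into collection and return how many were inserted.

    `collection` is expected to behave like a pymongo Collection, i.e. to
    provide delete_many(filter) and insert_many(documents).
    """
    if reset:
        # An empty filter matches every document, so this empties the collection.
        collection.delete_many({})
    if not docs:
        # insert_many rejects an empty list, so return early instead.
        return 0
    result = collection.insert_many(docs)
    return len(result.inserted_ids)


# Hypothetical usage with pymongo (placeholders left unfilled on purpose):
#   from pymongo import MongoClient
#   collection = MongoClient("<connection string>")["<database>"]["<collection>"]
#   ingest(collection, embedded_docs)
```

Passing the collection in as an argument keeps the helper independent of connection details, which stay in the notebook.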
The answers for code blocks in this section are as follows: