
Duplicate all the docs #15

Merged · 1 commit · Feb 14, 2024
16 changes: 8 additions & 8 deletions docs/_workshop-java-quarkus.md
@@ -6,15 +6,15 @@ include::sections/01-intro.md[]

---

-include::sections/02-preparation.md[]
+include::sections/java-quarkus/02-preparation.md[]

---

include::sections/java-quarkus/02.1-additional-setup.md[]

---

-include::sections/03-overview.md[]
+include::sections/java-quarkus/03-overview.md[]

---

@@ -30,27 +30,27 @@ include::sections/java-quarkus/06-chat-api.md[]

---

-include::sections/07-dockerfile.md[]
+include::sections/java-quarkus/07-dockerfile.md[]

---

-include::sections/08-website.md[]
+include::sections/java-quarkus/08-website.md[]

---

-include::sections/09-azure.md[]
+include::sections/java-quarkus/09-azure.md[]

---

-include::sections/10-deployment.md[]
+include::sections/java-quarkus/10-deployment.md[]

---

-include::sections/10.1-ci-cd.md[]
+include::sections/java-quarkus/10.1-ci-cd.md[]

---

-include::sections/11-improvements.md[]
+include::sections/java-quarkus/11-improvements.md[]

---

61 changes: 61 additions & 0 deletions docs/sections/java-quarkus/02-preparation.md
@@ -0,0 +1,61 @@
## Preparation

Before diving into development, let's set up your project environment. This includes:

- Creating a new project on GitHub based on a template
- Using a prepared dev container environment on either [GitHub Codespaces](https://github.com/features/codespaces) or [VS Code with Dev Containers extension](https://aka.ms/vscode/ext/devcontainer) (or a manual install of the needed tools)

### Creating your project

1. Open [this GitHub repository](https://github.com/Azure-Samples/azure-openai-rag-workshop)
2. Click the **Fork** button and click on **Create fork** to create a copy of the project in your own GitHub account.

![Screenshot of GitHub showing the Fork button](./assets/fork-project.png)

Once the fork is created, select the **Code** button, then the **Codespaces** tab, and click on **Create codespace on main**.

![Screenshot of GitHub showing the Codespaces creation](./assets/create-codespaces.png)

This will initialize a development container with all the necessary tools pre-installed. Once it's ready, you have everything you need to start coding. Even after the UI has loaded, wait a few minutes before diving in, as some setup tasks are still triggered in the background.

<div class="info" data-title="note">

> GitHub Codespaces provides up to 60 hours of free usage monthly for all GitHub users. You can check out [GitHub's pricing details](https://github.com/features/codespaces) for more information.

</div>

#### [optional] Local Development with the dev container

If you prefer working on your local machine, you can also run the dev container there. If you're fine with using Codespaces, you can skip directly to the next section.


1. Ensure you have [Docker](https://www.docker.com/products/docker-desktop), [VS Code](https://code.visualstudio.com/), and the [Dev Containers extension](https://aka.ms/vscode/ext/devcontainer) installed.

<div class="tip" data-title="tip">

> You can learn more about Dev Containers in [this video series](https://learn.microsoft.com/shows/beginners-series-to-dev-containers/). You can also [check the website](https://containers.dev) and [the specification](https://github.com/devcontainers/spec).

</div>

2. On the GitHub website, select the **Code** button, then the **Local** tab, and copy your repository URL.

![Screenshot of GitHub showing the repository URL](./assets/github-clone.png)
3. Clone your forked repository and then open the folder in VS Code:

```bash
git clone <your_repository_url>
```

4. In VS Code, use `Ctrl+Shift+P` (or `Command+Shift+P` on macOS) to open the **command palette** and type **Reopen in Container**.

![Reopen in container command in VS Code](./assets/vscode-reopen-in-container.png)


The first time, it will take a while to download and set up the container image; meanwhile, you can go ahead and read the next sections.

Once the container is ready, you will see "Dev Container: OpenAI Workshop" in the bottom left corner of VS Code:

![Dev Container status in VS Code](./assets/vscode-dev-container-status.png)


5 changes: 5 additions & 0 deletions docs/sections/java-quarkus/02.1-additional-setup.md
@@ -1,2 +1,7 @@
## Additional setup

To complete the template setup, please run the following command in a terminal, at the root of the project:

```bash
./scripts/setup-template.sh java-quarkus
```
76 changes: 76 additions & 0 deletions docs/sections/java-quarkus/03-overview.md
@@ -0,0 +1,76 @@
## Overview of the project

The project template you've forked is a monorepo, which means it's a single repository that houses multiple projects. Here's how it's organized, focusing on the key files and directories:

```sh
.devcontainer/            # Configuration for the development container
data/                     # Sample PDFs to serve as custom data
infra/                    # Templates and scripts for Azure infrastructure
scripts/                  # Utility scripts for document ingestion
src/                      # Source code for the application's services
├── backend-java-quarkus/ # The Chat API developed with Quarkus
├── backend-nodejs/       # The Chat API developed with Node.js
├── frontend/             # The Chat website
└── indexer/              # Service for document ingestion
package.json              # Configuration for NPM workspace
```

We're using Node.js for our APIs and website, and have set up an [NPM workspace](https://docs.npmjs.com/cli/using-npm/workspaces) to manage dependencies across all projects from a single place. Running `npm install` at the root installs dependencies for all projects, simplifying monorepo management.

For instance, `npm run <script_name> --workspaces` executes a script across all projects, while `npm run <script_name> --workspace=backend` targets just the backend.

Otherwise, you can use your regular `npm` commands in any project folder and it will work as usual.

### About the services

We generated the base code of our different services with the respective CLIs or generators of the frameworks we'll be using, and we've pre-written several service components so you can jump straight into the most interesting parts.

### The Chat API specification

Creating a chat-like experience requires two main components: a user interface and a service API. The [ChatBootAI OpenAPI specification](https://editor.swagger.io/?url=https://raw.githubusercontent.com/ChatBootAI/chatbootai-openapi/main/openapi/openapi-chatbootai.yml) standardizes their interactions. This standardization allows for the development of different client applications (like mobile apps) that can interact seamlessly with chat services written in various programming languages.

#### The Chat request

A chat request is sent in JSON format, and must contain at least the user's message. Other optional parameters include a flag indicating if the response should be streamed, context-specific options that can tailor the chat service's behavior, and a session state object that can be used to maintain state between requests.

```json
{
  "messages": [
    {
      "content": "Can I do some Scuba diving?",
      "role": "user"
    }
  ],
  "stream": false,
  "context": { ... },
  "session_state": null
}
```
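
To make the shape concrete, here's how this request payload might be typed in TypeScript. This is only a sketch: the type and property names are our own, and the authoritative contract is the ChatBootAI OpenAPI specification.

```typescript
// Illustrative types for the chat request payload above.
// Names are assumptions; the real contract is the ChatBootAI OpenAPI spec.
type ChatRole = 'user' | 'assistant' | 'system';

interface ChatMessage {
  content: string;
  role: ChatRole;
}

interface ChatRequest {
  messages: ChatMessage[];           // required: at least the user's message
  stream?: boolean;                  // optional: stream the response?
  context?: Record<string, unknown>; // optional: service-specific options
  session_state?: unknown;           // optional: state carried between requests
}

// Example request, matching the JSON above
const request: ChatRequest = {
  messages: [{ content: 'Can I do some Scuba diving?', role: 'user' }],
  stream: false,
  session_state: null,
};
```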


#### The chat response

The chat service responds with a JSON object representing the generated response. The answer is located under the message's `content` property.

```json
{
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "There is no information available about Scuba diving in the provided sources.",
        "role": "assistant",
        "context": { ... }
      }
    }
  ]
}
```
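
A client then only needs to dig the answer out of the first choice. Here's a minimal sketch of that; the `ChatResponse` type and `getAnswer` helper are hypothetical names, not part of the specification.

```typescript
// Hypothetical response type and helper; the real shape is defined
// by the ChatBootAI OpenAPI specification.
interface ChatResponse {
  choices: {
    index: number;
    message: {
      content: string;
      role: string;
      context?: Record<string, unknown>;
    };
  }[];
}

// The answer is located under the first choice's message `content` property
function getAnswer(response: ChatResponse): string {
  return response.choices[0].message.content;
}

const sample: ChatResponse = {
  choices: [
    {
      index: 0,
      message: {
        content: 'There is no information available about Scuba diving in the provided sources.',
        role: 'assistant',
      },
    },
  ],
};
```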

You can learn more about the [ChatBootAI OpenAPI specification here](https://editor.swagger.io/?url=https://raw.githubusercontent.com/ChatBootAI/chatbootai-openapi/main/openapi/openapi-chatbootai.yml) and on [the GitHub repo](https://github.com/ChatBootAI/chatbootai-openapi).

<div class="info" data-title="note">

> If streaming is enabled, the response will be a stream of JSON objects, each representing a chunk of the response. This format allows for a dynamic and real-time messaging experience, as each chunk can be sent and rendered as soon as it's ready. In that case, the response format follows the [Newline Delimited JSON (NDJSON)](https://github.com/ndjson/ndjson-spec) specification, which is a convenient way of sending structured data that may be processed one record at a time.

</div>
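
As a rough illustration of how a client might consume such a stream, here's a minimal NDJSON parsing sketch. The chunk shape (`delta`) is made up for the example; refer to the specification for the actual streamed format.

```typescript
// Minimal NDJSON parsing sketch: one JSON object per line.
function parseNdjson(payload: string): unknown[] {
  return payload
    .split('\n')
    .filter((line) => line.trim().length > 0) // ignore empty lines
    .map((line) => JSON.parse(line));
}

// Hypothetical streamed chunks, rendered as soon as each line arrives
const stream = '{"delta": "Hello"}\n{"delta": " world"}\n';
const chunks = parseNdjson(stream) as Array<{ delta: string }>;
const text = chunks.map((c) => c.delta).join(''); // "Hello world"
```
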
41 changes: 41 additions & 0 deletions docs/sections/java-quarkus/04-vector-db.md
@@ -1 +1,42 @@
## The vector database

We'll start by creating a vector database. Vectors are arrays of numbers that represent the features or characteristics of the data. For example, an image can be converted into a vector of pixels, or a word can be converted into a vector of semantic meaning. A vector database can perform fast and accurate searches based on the similarity or distance between the vectors, rather than exact matches. This enables applications such as image recognition, natural language processing, recommendation systems, and more.
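
To make the idea of distance-based search concrete, here's a small sketch of cosine similarity, one common way to compare two vectors. This is purely illustrative: the vector database computes this for you.

```typescript
// Cosine similarity: 1 means same direction (very similar),
// 0 means orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Two vectors pointing the same way are maximally similar
cosineSimilarity([1, 2, 3], [2, 4, 6]); // → 1
```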

### Ingestion and retrieval

In our use-case, text will be extracted out of PDF files, and this text will be *tokenized*. Tokenization is the process of splitting our text into different tokens, which will be short portions of text. Those tokens will then be converted into a *vector* and added to the database. The vector database is then able to search for similar vectors based on the distance between them.

That's how our system will be able to find the most relevant data, coming from the original PDF files.

This will be used in the first component (the *Retriever*) of the Retrieval Augmented Generation (RAG) pattern that we will use to build our custom ChatGPT.

### About vector databases

There are many available vector databases, and a good list can be found in the supported Vector stores list from the LangChain4j project: [https://github.com/langchain4j/langchain4j](https://github.com/langchain4j/langchain4j).

Some of the most popular ones are:

- [Chroma](https://www.trychroma.com/), which can run as an in-memory vector store, making it great for testing and development, but not for production.
- [Qdrant](https://qdrant.tech/)
- [pgvector](https://github.com/pgvector/pgvector)
- [Redis](https://redis.io)

### Introducing Azure AI Search

![Azure AI Search Logo](./assets/azure-ai-search-logo.png)

[Azure AI Search](https://azure.microsoft.com/products/ai-services/cognitive-search/) can be used as a vector database that can store, index, and query vector embeddings from a search index. You can use it to power similarity search, multi-modal search, recommendation systems, or applications implementing the RAG architecture.

It supports various data types, such as *text, images, audio, video,* and *graphs*, and can perform fast and accurate searches based on the similarity or distance between the vectors, rather than exact matches. It also offers a *hybrid search*, which combines semantic and vector search in the same query.

For this workshop, we'll use Azure AI Search as our vector database as it's easy to create and manage within Azure. For the RAG use-case, most vector databases will work in a similar way.

### Exploring Azure AI Search

By now, you should already have an Azure AI Search service created in your subscription, done by the `azd provision` command you ran in the setup process.

Open the [Azure Portal](https://portal.azure.com/), and search for the **AI Search** service in the top navigation bar.

You should see a service named `gptkb-<your_random_name>` in the list. This instance is currently empty, and we will create an index and populate it with data in the next section.

![Screenshot of Azure AI Search](./assets/azure-ai-search.png)
99 changes: 99 additions & 0 deletions docs/sections/java-quarkus/05-ingestion.md
@@ -1,2 +1,101 @@
## Data ingestion

We are going to ingest the content of PDF documents in the vector database. We'll use a tool located in the `src/indexer` folder of the project. This tool will extract the text from the PDF files and send it to the vector database.

This code is already written for you, but let's have a look at how it works.

### The ingestion process

The `src/indexer/src/lib/indexer.ts` file contains the code used to ingest the data in the vector database. It runs inside a Node.js application deployed to Azure Container Apps.

PDF files stored in the `data` folder will be sent to this Node.js application using the command line. The files provided here are for demo purposes only, and the suggested prompts we'll use later in the workshop are based on those files.

<div class="tip" data-title="tip">

> You can replace the PDF files in the `data` folder with your own PDF files if you want to use your custom data! Keep in mind that the PDF files must be text-based, not scanned images. Since the ingestion process can take some time, we recommend starting with a small number of files with few pages.

</div>

#### Reading the PDF files content

The content of the PDF files will be used as part of the *Retriever* component of the RAG architecture, to generate answers to your questions using the GPT model.

Text from the PDF files is extracted in the `src/indexer/src/lib/document-processor.ts` file, using the [pdf.js library](https://mozilla.github.io/pdf.js/). You can have a look at the code of the `extractTextFromPdf()` function if you're curious about how it works.

#### Computing the embeddings

After the text is extracted, it's then transformed into embeddings using the [OpenAI JavaScript library](https://github.com/openai/openai-node):

```ts
async createEmbedding(text: string): Promise<number[]> {
  const embeddingsClient = await this.openai.getEmbeddings();
  const result = await embeddingsClient.create({ input: text, model: this.embeddingModelName });
  return result.data[0].embedding;
}
```

#### Adding the documents to the vector database

The embeddings along with the original texts are then added to the vector database using the [Qdrant JavaScript client library](https://www.npmjs.com/package/@qdrant/qdrant-js). This process is done in batches, to improve performance and limit the number of requests:

```ts
const points = sections.map((section) => ({
  // ID must be either a 64-bit integer or a UUID
  id: getUuid(section.id, 5),
  vector: section.embedding!,
  payload: {
    id: section.id,
    content: section.content,
    category: section.category,
    sourcepage: section.sourcepage,
    sourcefile: section.sourcefile,
  },
}));

await this.qdrantClient.upsert(indexName, { points });
```
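
The batching mentioned above can be sketched like this. It's a simplified illustration: the batch size and the `chunk` helper are our own, not taken from the indexer code.

```typescript
// Split an array into batches of at most `size` items,
// so that each upsert request stays small.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical usage: upsert points in batches of 100
// for (const batch of chunk(points, 100)) {
//   await qdrantClient.upsert(indexName, { points: batch });
// }
```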

### Running the ingestion process

Let's now execute this process. First, make sure you have Qdrant and the indexer service running locally. We'll use Docker Compose to run both services at the same time. Run the following command in a terminal (**make sure you have stopped the Qdrant container first!**):

```bash
docker compose up
```

This will start both Qdrant and the indexer service locally. It may take a few minutes the first time, as Docker needs to download the images.

<div class="tip" data-title="tip">

> You can look at the `docker-compose.yml` file at the root of the project to see how the services are configured. Docker Compose automatically loads the `.env` file, so we can use the environment variables exposed there. To learn more about Docker Compose, check out the [official documentation](https://docs.docker.com/compose/).

</div>

Once all services are started, you can run the ingestion process by opening a new terminal and running the `./scripts/index-data.sh` script on Linux or macOS, or `./scripts/index-data.ps1` on Windows:

```bash
./scripts/index-data.sh
```

![Screenshot of the indexer CLI](./assets/indexer-cli.png)

Once this process completes, a new collection will be available in your database, where you can see the documents that were ingested.

### Test the vector database

Open the Qdrant dashboard again by opening the following URL in your browser: [http://localhost:6333/dashboard](http://localhost:6333/dashboard).

<div class="tip" data-title="tip">

> In Codespaces, you need to select the **Ports** tab in the bottom panel, right click on the URL in the **Forwarded Address** column next to the `6333` port, and select **Open in browser**.

</div>

You should see the collection named `kbindex` in the list:

![Screenshot of the Qdrant dashboard](./assets/qdrant-dashboard.png)

You can select that collection and browse it. You should see the entries that were created by the ingestion process. Documents are split into multiple overlapping sections to improve the search results, so you should see multiple entries for each document.
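
The overlapping split described above can be illustrated with a tiny sliding-window sketch. The sizes and function name are made up for the example; the real splitting logic lives in the indexer service.

```typescript
// Split text into sections of `size` characters, each overlapping
// the previous one by `overlap` characters.
function splitWithOverlap(text: string, size: number, overlap: number): string[] {
  const sections: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    sections.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return sections;
}

splitWithOverlap('abcdefghij', 4, 2);
// → ['abcd', 'cdef', 'efgh', 'ghij']
```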

Keep the services running, as we'll use them in the next section.