Internal main #17

Merged
merged 55 commits into from
Mar 21, 2025
55 commits
4ebc5b6
Patch code from internal to public main
actions-user Feb 19, 2025
346699f
Patch code from internal to public main
actions-user Feb 20, 2025
4fcfa79
Patch code from internal to public main
actions-user Feb 21, 2025
1ff113a
Patch code from internal to public main
actions-user Feb 25, 2025
c8ce919
Patch code from internal to public main
actions-user Feb 25, 2025
236cc2c
Patch code from internal to public main
actions-user Feb 26, 2025
0cf866b
Patch code from internal to public main
actions-user Feb 27, 2025
e7d1699
Patch code from internal to public main
actions-user Feb 27, 2025
9991d46
Patch code from internal to public main
actions-user Feb 27, 2025
a01e6f4
Patch code up to 19b83fc
actions-user Mar 4, 2025
b460972
Patch code up to 86a5603
actions-user Mar 4, 2025
e496398
Patch code up to 7989ff5
actions-user Mar 4, 2025
a9ecf35
Patch code up to 9b4e006
actions-user Mar 5, 2025
5473c38
Patch code up to fd63499
actions-user Mar 5, 2025
a7ce39e
Patch code up to de4dc67
actions-user Mar 5, 2025
22bc294
Fix some typos in the documentation (#35) (#51)
kkurzacz-intel Mar 5, 2025
aea49e8
Merge branch 'main' into internal_main
kkurzacz-intel Mar 5, 2025
70e0b01
Patch code up to b9591d0
actions-user Mar 5, 2025
b6a974c
Patch code up to d72db6f
actions-user Mar 5, 2025
caf6710
Patch code up to 699358f
actions-user Mar 5, 2025
f5ea345
Merge branch 'main' into internal_main
aalbersk Mar 6, 2025
61af02d
Patch code up to 41be526
actions-user Mar 6, 2025
9f56d33
Patch code up to 035f8fe
actions-user Mar 6, 2025
b346c07
Patch code up to 0729dd9
actions-user Mar 6, 2025
ba86828
Patch code up to 142e3f4
actions-user Mar 7, 2025
565dddd
Patch code up to 1e196cb
actions-user Mar 7, 2025
a11b5f6
Patch code up to bd146a8
actions-user Mar 7, 2025
4740679
Patch code up to 7966326
actions-user Mar 10, 2025
f3c549e
Patch code up to 9afcd4b
actions-user Mar 10, 2025
fda2277
Patch code up to 895906d
actions-user Mar 11, 2025
e759213
Patch code up to f983bc9
actions-user Mar 11, 2025
caec8af
Patch code up to 3f11c13
actions-user Mar 12, 2025
1d9daaa
Patch code up to b4d6695
actions-user Mar 14, 2025
72e0d30
Patch code up to d34aafd
actions-user Mar 14, 2025
eabb13f
Patch code up to 1c1ce41
actions-user Mar 17, 2025
2a26f1b
Patch code up to f66cd7d
actions-user Mar 17, 2025
ad16241
Patch code up to 7d8e343
actions-user Mar 17, 2025
e0a7743
Patch code up to 1bf3321
actions-user Mar 17, 2025
fe4edbd
Patch code up to 69a9fcf
actions-user Mar 17, 2025
6bd263c
Patch code up to 131b5bf
actions-user Mar 17, 2025
5aba8fa
Patch code up to 0d89558
actions-user Mar 17, 2025
8304962
Patch code up to fec4e48
actions-user Mar 17, 2025
6165acb
Patch code up to 720ce39
actions-user Mar 17, 2025
956252b
Patch code up to 5270ca1
actions-user Mar 18, 2025
b35becc
Patch code up to b9a447b
actions-user Mar 18, 2025
99f53bb
Patch code up to 2c2560f
actions-user Mar 18, 2025
d8fba06
Patch code up to 4c6a0f5
actions-user Mar 18, 2025
76deffc
Patch code up to e0ed3aa
actions-user Mar 18, 2025
9b98bc9
Patch code up to fd9e0b7
actions-user Mar 19, 2025
372a322
Patch code up to 89af502
actions-user Mar 19, 2025
7d0d7b8
Patch code up to a0b37ba
actions-user Mar 20, 2025
3ab8f28
Patch code up to 98f148d
actions-user Mar 21, 2025
3fcd5f1
Patch code up to ee0760f
actions-user Mar 21, 2025
57ce898
Patch code up to 5fe9fcd
actions-user Mar 21, 2025
30b3e5b
Patch code up to 84959f9
actions-user Mar 21, 2025
3 changes: 3 additions & 0 deletions .gitignore
Expand Up @@ -41,3 +41,6 @@ super-linter-output/

# credentials
/deployment/default_credentials.txt

# .venv files
.venv/
12 changes: 9 additions & 3 deletions README.md
Expand Up @@ -48,7 +48,7 @@ For the complete microservices architecture, refer [here](./docs/microservices_a
| Operating System | Ubuntu 20.04/22.04 |
| Hardware Platforms | 4th Gen Intel® Xeon® Scalable processors<br>5th Gen Intel® Xeon® Scalable processors<br>6th Gen Intel® Xeon® Scalable processors<br>3rd Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 2 AI Accelerator<br>4th Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 2 AI Accelerator <br>6th Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 3 AI Accelerator|
| Kubernetes Version | 1.29.5 <br> 1.29.12 <br> 1.30.8 <br> 1.31.4 |
| Gaudi Firmware Version | 1.19.2
| Gaudi Firmware Version | 1.20.0

## Hardware Prerequisites for Deployment using Gaudi® AI Accelerator

Expand All @@ -65,11 +65,17 @@ If you don't have a Gaudi® AI Accelerator, you can request these instances in [

- visit [Intel® Tiber™ AI Cloud](https://console.cloud.intel.com/home).
- In the left pane select `Catalog > Hardware`.
- Select `Gaudi® 2 Deep Learning Server` or `Gaudi® 2 Deep Learning Server - Dell`.
- Select the Machine image - for example: `ubuntu-2204-gaudi2-1.17.0-vm-v4` with `Architecture: X86_64 (Baremetal only)`. Please note that minor version tags may change over time.
- Select `Gaudi® 2 Deep Learning Server` (recommended). `Gaudi® 2 Deep Learning VM` is also available but due to its resource limitation it is not recommended.
- Select Instance Type - for best performance we recommend choosing a Bare Metal machine with 8 Gaudi devices.
- Select the Machine image - for example: `ubuntu-2204-gaudi2-1.19.1-*` with `Architecture: X86_64 (Baremetal only)`. Please note that minor version tags may change over time.
- Upload your public key and launch the instance
- Navigate to the `Instances` page and verify that the machine has reached its ready state, then click on "How to Connect via SSH" to configure your machine correctly for further installation.

> [!NOTE]
> If you don't see any of the options above, you can either
> - request access to the instances by selecting `Preview > Preview Catalog`
> - chat with an Intel® Tiber™ AI Cloud agent by selecting the question mark button in the top right corner.

## Hardware Prerequisites for Deployment using Xeon only
To deploy the solution on a platform using 4th or 5th generation Intel® Xeon® processors, you will need:
- access to any platform with Intel® Xeon® Scalable processors that meet the requirements below:
Expand Down
89 changes: 75 additions & 14 deletions deployment/README.md
Expand Up @@ -14,11 +14,14 @@ This document details the deployment of Intel® AI for Enterprise RAG. By defaul
6. [Verify Services](#verify-services)
7. [Available Pipelines](#available-pipelines)
8. [Interact with ChatQnA](#interact-with-chatqna)
1. [Test Deployment](#test-deployment)
2. [Access UI/Grafana](#access-the-uigrafana)
9. [Configure ChatQnA](#configure-chatqna)
10. [Clear Deployment](#clear-deployment)
11. [Additional features](#additional-features)
1. [Enabling Pod Security Admission (PSA)](#enabling-pod-security-admission-psa)
2. [Running Enterprise RAG with Intel® Trust Domain Extensions (Intel® TDX)](#running-enterprise-rag-with-intel-trust-domain-extensions-intel-tdx)
3. [Single Sign On Integration using Microsoft Entra ID (formerly Azure Active Directory)](#single-sign-on-integration-using-microsoft-entra-id-formerly-azure-active-directory)
---

## Verify System Status
Expand Down Expand Up @@ -109,9 +112,11 @@ The default resource allocations are defined for Xeon only deployment in [`resou
> [!NOTE]
It is possible to reduce the resources allocated to the model server if you encounter issues with node capacity, but this will likely result in a performance drop. Recommended Hardware parameters to run RAG pipeline are available [here](../README.md#hardware-prerequisites-for-deployment-using-xeon-only).

For Enhanced Dataprep Pipeline (EDP) configuration, refer to the separate Helm chart located in the `deployment/edp/helm` folder. It does not have a separate `resources*.yaml` definition. To change resources before deployment, edit the resource definitions of the relevant components in the [`values.yaml`](./edp/helm/values.yaml) file.

### Skipping Warm-up for vLLM Deployment
The `VLLM_SKIP_WARMUP` environment variable controls whether the model warm-up phase is skipped during initialization. To modify this setting, update the deployment configuration in:
- For vLLM running on Gaudi: [vllm/docker/.env.hpu](./../src/comps/llms/impl/model_server/vllm/docker/.env.hpu)
- For vLLM running on Gaudi: [vllm/docker/.env.hpu](./../src/comps/llms/impl/model_server/vllm/docker/.env.hpu)
- For vLLM running on CPU: [vllm/docker/.env.cpu](./../src/comps/llms/impl/model_server/vllm/docker/.env.cpu)

> [!NOTE]
Expand Down Expand Up @@ -278,13 +283,12 @@ chatqa torchserve-embedding-svc-deployment-54d498dd6f-btg2l 1/1
chatqa torchserve-embedding-svc-deployment-54d498dd6f-hwfz4 1/1 Running 0 21m
chatqa torchserve-embedding-svc-deployment-54d498dd6f-jqcfh 1/1 Running 0 21m
chatqa vllm-service-m-deployment-6d86b69fb-6xxr2 1/1 Running 0 21m
dataprep dataprep-svc-deployment-6c745cfb56-qphf2 1/1 Running 0 14m
dataprep embedding-svc-deployment-66fc547b67-fc7z2 1/1 Running 0 14m
dataprep ingestion-svc-deployment-8f96f77d-2526q 1/1 Running 0 14m
dataprep router-service-deployment-6f46d49c7d-2smtb 1/1 Running 0 14m
edp edp-backend-559948896d-f9xkq 1/1 Running 0 13m
edp edp-celery-7b999df6fb-p7j84 1/1 Running 1 (7m4s ago) 13m
edp edp-dataprep-76b895d445-wh629 1/1 Running 0 13m
edp edp-embedding-844f9c9c97-tq49m 1/1 Running 0 13m
edp edp-flower-554594dd4d-6z666 1/1 Running 0 13m
edp edp-ingestion-bc559885f-s7qsp 1/1 Running 0 13m
edp edp-minio-5948fbc87f-6d8lq 1/1 Running 0 13m
edp edp-minio-provisioning-7rx98 0/1 Completed 0 12m
edp edp-postgresql-0 1/1 Running 0 13m
Expand Down Expand Up @@ -365,22 +369,28 @@ data: [DONE]
Test finished succesfully
```

### Access the UI
### Access the UI/Grafana

To access the cluster, please update the `/etc/hosts` file on your machine to match the domain name with the externally
exposed IP address of the cluster.
To access the UI, do the following:
1. Forward the port from the ingress pod.
```bash
sudo -E kubectl port-forward --namespace ingress-nginx svc/ingress-nginx-controller 443:https
```
2. If you'd like to access the UI from another machine, tunnel the port from the host:
```bash
ssh -L 443:localhost:443 user@ip
```
3. Update `/etc/hosts` file on the machine where you'd like to access the UI to match the domain name with the externally exposed IP address of the cluster. On a Windows machine, this file is typically located at `C:\Windows\System32\drivers\etc\hosts`.

For example, the updated file content should resemble the following:
For example, the updated file content should resemble the following:

```
<Ingress external IP> erag.com grafana.erag.com auth.erag.com s3.erag.com minio.erag.com
```bash
127.0.0.1 erag.com grafana.erag.com auth.erag.com s3.erag.com minio.erag.com
```

> [!NOTE]
> `127.0.0.1` is the IPv4 loopback address of the local machine.

On a Windows machine, this file is typically located at `C:\Windows\System32\drivers\etc\hosts`.

Once the update is complete, you can access the Enterprise RAG UI by typing the following URL in your web browser:
`https://erag.com`
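
As a quick sanity check for the hosts file step, the following minimal Python sketch builds the expected entry and reports which domains are still unmapped. The helper names `hosts_entry` and `missing_domains` are illustrative assumptions and not part of this repository; the domain list is taken from the example entry above.

```python
# Illustrative helpers (not part of the repository) for checking a hosts file.
ERAG_DOMAINS = [
    "erag.com",
    "grafana.erag.com",
    "auth.erag.com",
    "s3.erag.com",
    "minio.erag.com",
]


def hosts_entry(ip, domains=ERAG_DOMAINS):
    """Return the single hosts-file line that maps every domain to `ip`."""
    return f"{ip} {' '.join(domains)}"


def missing_domains(hosts_text, ip, domains=ERAG_DOMAINS):
    """Return the domains that `hosts_text` does not yet map to `ip`."""
    mapped = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == ip:
            mapped.update(fields[1:])
    return [d for d in domains if d not in mapped]


if __name__ == "__main__":
    # Assumes a Unix-like host; on Windows the path differs as noted above.
    with open("/etc/hosts", encoding="utf-8") as f:
        missing = missing_domains(f.read(), "127.0.0.1")
    if missing:
        print("Add this line to the hosts file:")
        print(hosts_entry("127.0.0.1"))
    else:
        print("All Enterprise RAG domains are mapped.")
```

The script only reads the hosts file; editing it still requires administrator privileges.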

Expand All @@ -396,12 +406,15 @@ MinIO Console can be accessed via:
S3 API is exposed at:
`https://s3.erag.com`

> [!CAUTION]
> Before ingesting data, open `https://s3.erag.com` in your browser and accept the self-signed certificate.

### UI credentials for the first login

Once deployment is complete, a `default_credentials.txt` file is created in the `deployment` folder with one-time passwords for the application admin and user. After you log in with a one-time password, you will be asked to change the default password.

> [!CAUTION]
> Please remove file `default_credentials.txt` after the first succesfull login.
> Please remove file `default_credentials.txt` after the first successful login.

### Credentials for Grafana and Keycloak

Expand Down Expand Up @@ -463,3 +476,51 @@ For deploying ChatQnA components with Intel® Trust Domain Extensions (Intel® T

> [!NOTE]
> Intel TDX feature in Enterprise RAG is experimental.

### Single Sign On Integration using Microsoft Entra ID (formerly Azure Active Directory)

#### Prerequisites

1. Configured and working Microsoft Entra ID
- preconfigured and working SSO for other applications
- two new groups - one for `erag-admins`, one for `erag-users` - save `Object ID` for those entities
- defined some user accounts that can be later added to either `erag-admins` or `erag-users` groups
2. Registered a new Azure `App registration`
- configured with Redirect URI `https://auth.erag.com/realms/EnterpriseRAG/broker/oidc/endpoint`
- in App registration -> Overview - save the `Application (client) ID` value
- in App registration -> Overview -> Endpoints - save `OpenID Connect metadata document` value
- in App registration -> Manage -> Certificates & secrets -> New client secret - create and save `Client secret` value
3. Add users to newly created groups, either `erag-admins` or `erag-users` in Microsoft Entra ID

#### Keycloak configuration

To configure Enterprise RAG SSO using Azure Single Sign On use the following steps:

1. Log in as `admin` user into Keycloak and select `EnterpriseRAG` realm.
2. Choose `Identity providers` from the left menu.
3. Add a new `OpenID Connect Identity Provider` and configure:
- Field `Alias` - enter your SSO alias, for example `enterprise-sso`
- Field `Display name` - enter your link display name to redirect to external SSO, for example `Enterprise SSO`
- Field `Discovery endpoint` - enter your `OpenID Connect metadata document`. Configuration fields should autopopulate
4. Choose `Groups` in left menu. Then create the following groups:
1. `erag-admin-group` should consist of the following groups from Keycloak:
- `(EnterpriseRAG-oidc) ERAG-admin`
- `(EnterpriseRAG-oidc-backend) ERAG-admin`
- `(EnterpriseRAG-oidc-minio) consoleAdmin` # if using internal MinIO
2. `erag-user-group` should consist of the following groups from Keycloak:
- `(EnterpriseRAG-oidc) ERAG-user`
- `(EnterpriseRAG-oidc-backend) ERAG-user`
- `(EnterpriseRAG-oidc-minio) readonly` # if using internal MinIO
5. Configure two `Identity mappers` in `Mappers` under created `Identity provider`
1. Add Identity Provider Mapper - for group `erag-admin-group`
- Field `Name` - this is the `Object ID` from `erag-admins` from Microsoft Entra ID
- Field `Mapper type` - enter `Hardcoded Group`
- Field `Group` - select `erag-admin-group`
2. Add Identity Provider Mapper - for group `erag-user-group`
- Field `Name` - this is the `Object ID` from `erag-users` from Microsoft Entra ID
- Field `Mapper type` - enter `Hardcoded Group`
- Field `Group` - select `erag-user-group`

After this configuration, the Keycloak login page should show an additional link at the bottom of the login form, named `Enterprise SSO`. This link should redirect you to the Azure login page.

Depending on users' group membership in Microsoft Entra ID (either `erag-admins` or `erag-users`), users will have the appropriate permissions mapped. For example, `erag-admins` will have access to the admin panel.
16 changes: 6 additions & 10 deletions deployment/auth/apisix-routes/values.yaml
Expand Up @@ -61,6 +61,12 @@ endpoints:
- /api/file/$1/retry
- ^/api/v1/edp/file/([0-9a-fA-F]{8}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{12})/task$
- /api/file/$1/task
- ^/api/v1/edp/list_buckets$
- /api/list_buckets
- ^/api/v1/edp/file/([0-9a-fA-F]{8}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{12})/extract$
- /api/file/$1/extract
- ^/api/v1/edp/retrieve$
- /api/retrieve
- ^/api/v1/edp/presignedUrl$
- /api/presignedUrl
permissions:
Expand All @@ -77,16 +83,6 @@ endpoints_api:
service_path: /apis/gmc.opea.io/v1alpha3/namespaces/chatqa/gmconnectors/chatqa/status
permissions:
- "admin#admin-access"
- name: k8s-api-watcher-dataprep
namespace: default
path: /api/v1/dataprep/status
backend_service: kubernetes
rate_limit_count: 30
service_port: 443
scheme: https
service_path: /apis/gmc.opea.io/v1alpha3/namespaces/dataprep/gmconnectors/dataprep/status
permissions:
- "admin#admin-access"

upstreams:
- name: kubernetes
Expand Down
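
The entries added under `endpoints` in `deployment/auth/apisix-routes/values.yaml` appear to be pattern/replacement pairs: a regular expression matched against the public path, followed by the backend path with `$1` standing for the captured file UUID. The Python sketch below illustrates that reading for the three new pairs. It is not the gateway's implementation; `$1` is written as `\g<1>` because that is the Python `re` syntax, and the `rewrite` helper is an assumption made for this example.

```python
import re

# UUID pattern copied from the values.yaml entries in this diff.
UUID = r"[0-9a-fA-F]{8}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{12}"

# (public path pattern, backend path template) pairs added by this change.
REWRITES = [
    (r"^/api/v1/edp/list_buckets$", "/api/list_buckets"),
    (rf"^/api/v1/edp/file/({UUID})/extract$", r"/api/file/\g<1>/extract"),
    (r"^/api/v1/edp/retrieve$", "/api/retrieve"),
]


def rewrite(path):
    """Return the backend path for `path`, or None if no pattern matches."""
    for pattern, template in REWRITES:
        match = re.match(pattern, path)
        if match:
            # expand() substitutes captured groups into the template.
            return match.expand(template)
    return None


if __name__ == "__main__":
    file_id = "123e4567-e89b-12d3-a456-426614174000"
    print(rewrite(f"/api/v1/edp/file/{file_id}/extract"))
    print(rewrite("/api/v1/edp/file/not-a-uuid/extract"))
```

Because every pattern is anchored with `^` and `$`, a path with a malformed file ID matches none of the pairs and is left unrewritten.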
3 changes: 2 additions & 1 deletion deployment/client_test/client-test.yaml
Expand Up @@ -29,7 +29,8 @@ spec:
containers:
- image: curlimages/curl
name: curl
command: ["sleep", "infinity"]
command: ["/bin/sh", "-c"]
args: ["trap exit TERM; sleep 300 & wait"]
securityContext:
capabilities:
drop:
Expand Down
87 changes: 87 additions & 0 deletions deployment/edp/helm/templates/_helpers.tpl
Expand Up @@ -26,6 +26,30 @@ Expand the name of the chart.
{{- default .Chart.Name .Values.flower.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{- define "helm-edp.dataprep.name" -}}
{{- default .Chart.Name .Values.dataprep.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{- define "helm-edp.dpguard.name" -}}
{{- default .Chart.Name .Values.dpguard.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{- define "helm-edp.embedding.name" -}}
{{- default .Chart.Name .Values.embedding.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{- define "helm-edp.ingestion.name" -}}
{{- default .Chart.Name .Values.ingestion.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{- define "helm-edp.noProxyWithContainers" -}}
{{- printf "%s,edp-backend,edp-celery,edp-dataprep,edp-dpguard,edp-embedding,edp-flower,edp-ingestion,edp-minio,edp-postgresql-0,edp-redis-master-0" .Values.proxy.noProxy }}
{{- end }}

{{- define "helm-edp.awsSqs.name" -}}
{{- default .Chart.Name .Values.awsSqs.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
Expand Down Expand Up @@ -95,6 +119,47 @@ app.kubernetes.io/name: {{ include "helm-edp.backend.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
dpguard labels
*/}}
{{- define "helm-edp.dpguard.selectorLabels" -}}
app.kubernetes.io/name: {{ include "helm-edp.dpguard.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
embedding labels
*/}}
{{- define "helm-edp.embedding.selectorLabels" -}}
app.kubernetes.io/name: {{ include "helm-edp.embedding.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
ingestion labels
*/}}
{{- define "helm-edp.ingestion.selectorLabels" -}}
app.kubernetes.io/name: {{ include "helm-edp.ingestion.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
dataprep labels
*/}}
{{- define "helm-edp.dataprep.selectorLabels" -}}
app.kubernetes.io/name: {{ include "helm-edp.dataprep.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
awsSqs labels
*/}}
{{- define "helm-edp.awsSqs.selectorLabels" -}}
app.kubernetes.io/name: {{ include "helm-edp.awsSqs.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}


{{/*
Create the name of the service account to use
*/}}
Expand All @@ -105,3 +170,25 @@ Create the name of the service account to use
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}

{{- /*
Retrieves resource values based on the provided filename and values.
*/ -}}
{{- define "manifest.getResource" -}}
{{- $filename := index . 0 -}}
{{- $defaultValues := fromYaml (index . 1) -}}
{{- $values := index . 2 -}}

{{- if and ($values.services) (index $values "services" $filename) (index $values "services" $filename "resources") }}
{{- $defaultValues = index $values "services" $filename "resources" }}
{{- end -}}

{{- $isTDXEnabled := hasKey $values "tdx" -}}
{{- $isGaudiService := regexMatch "(?i)gaudi" $filename -}}

{{- if and $isTDXEnabled (not $isGaudiService) }}
{{- include "manifest.tdx.getResourceValues" (dict "defaultValues" $defaultValues "filename" $filename "values" $values) }}
{{- else }}
{{- $defaultValues | toYaml }}
{{- end -}}
{{- end -}}
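
The decision logic of the `manifest.getResource` helper above can be paraphrased in Python as follows. This is a sketch for readers, not code from the chart: `tdx_resource_values` is a hypothetical stand-in for the `manifest.tdx.getResourceValues` template, which is not shown in this diff, and the service name used in the usage lines is only an example.

```python
import re


def tdx_resource_values(default_values, filename, values):
    """Hypothetical placeholder for `manifest.tdx.getResourceValues`.

    The real template is not part of this diff, so this stub simply
    returns the resources unchanged.
    """
    return default_values


def get_resource(filename, default_values, values):
    """Mirror the branching of the `manifest.getResource` helper."""
    # A per-service override under values.services.<filename>.resources
    # replaces the defaults when it is present and non-empty.
    services = values.get("services") or {}
    override = (services.get(filename) or {}).get("resources")
    if override:
        default_values = override

    # TDX handling applies when a `tdx` key exists and the service name
    # does not contain "gaudi" (case-insensitive, unanchored match).
    is_tdx_enabled = "tdx" in values
    is_gaudi_service = re.search("gaudi", filename, re.IGNORECASE) is not None

    if is_tdx_enabled and not is_gaudi_service:
        return tdx_resource_values(default_values, filename, values)
    return default_values


if __name__ == "__main__":
    defaults = {"requests": {"cpu": "1"}}
    overrides = {"services": {"edp-backend": {"resources": {"requests": {"cpu": "4"}}}}}
    print(get_resource("edp-backend", defaults, {}))
    print(get_resource("edp-backend", defaults, overrides))
```

Note that the template checks only for the presence of the `tdx` key (`hasKey`), so an empty `tdx` section would still take the TDX branch for non-Gaudi services.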
13 changes: 13 additions & 0 deletions deployment/edp/helm/templates/awsSqs/aws-access-secrets.yaml
@@ -0,0 +1,13 @@
{{- if .Values.awsSqs.enabled -}}
apiVersion: v1
kind: Secret
metadata:
name: edp-aws-access-secrets
type: Opaque
stringData:
AWS_DEFAULT_REGION: "us-west-2"
AWS_ACCESS_KEY_ID: {{ .Values.edpAccessKey | quote }}
AWS_SECRET_ACCESS_KEY: {{ .Values.edpSecretKey | quote }}
AWS_SQS_EVENT_QUEUE_URL: {{ .Values.edpSqsEventQueueUrl | quote }}
EDP_BACKEND_ENDPOINT: {{ "http://edp-backend:5000/minio_event" }}
{{- end }}