Commit fb01fe3

Merge pull request #6461 from EnterpriseDB/release-2025-02-03a
Release 2025-02-03a
2 parents 30939a1 + 2bc9fdf

38 files changed: +688 −413 lines

advocacy_docs/edb-postgres-ai/ai-accelerator/installing/complete.mdx

Lines changed: 2 additions & 2 deletions
@@ -34,8 +34,8 @@ __OUTPUT__
List of installed extensions
Name | Version | Schema | Description
------------------+---------+------------+------------------------------------------------------------
-aidb | 1.0.7 | aidb | aidb: makes it easy to build AI applications with postgres
-pgfs | 1.0.4 | pgfs | pgfs: enables access to filesystem-like storage locations
+aidb | 2.1.1 | aidb | aidb: makes it easy to build AI applications with postgres
+pgfs | 1.0.6 | pgfs | pgfs: enables access to filesystem-like storage locations
vector | 0.8.0 | public | vector data type and ivfflat and hnsw access methods
```

advocacy_docs/edb-postgres-ai/ai-accelerator/installing/index.mdx

Lines changed: 0 additions & 1 deletion
@@ -12,4 +12,3 @@ Pipelines is delivered as a set of extensions. Depending on how you are deployin
- [Manually installing pipelines packages](packages)

Once the packages are installed, you can [complete the installation](complete) by activating the extensions within Postgres.
-

advocacy_docs/edb-postgres-ai/ai-accelerator/limitations.mdx

Lines changed: 11 additions & 0 deletions
@@ -32,3 +32,14 @@ The impact of this depends on what type of embedding is being performed.
### Data Formats

* Pipelines currently only supports Text and Image formats. Other formats, including structured data, video, and audio, are not currently supported.
+
+### Upgrading
+
+When upgrading the aidb and pgfs extensions, there is currently no support for Postgres extension upgrades. You must therefore drop and recreate the extensions when upgrading to a new version.
+
+```sql
+DROP EXTENSION aidb CASCADE;
+DROP EXTENSION pgfs CASCADE;
+CREATE EXTENSION aidb CASCADE;
+CREATE EXTENSION pgfs CASCADE;
+```

advocacy_docs/edb-postgres-ai/ai-accelerator/models/openai-api-compatibility.mdx

Lines changed: 9 additions & 9 deletions
@@ -1,10 +1,10 @@
---
title: "Using an OpenAI compatible API with Pipelines"
-navTitle: "OpenAI Compatible Models"
+navTitle: "OpenAI compatible Models"
description: "Using an OpenAI compatible API with Pipelines by setting options and credentials."
---

-To make use of an OpenAI compliant API, you can use the openai_embeddings or openai_completions model providers. Note that a retriever will need to encode first so you can only use the embeddings model provider with a retriever.
+To make use of an OpenAI compliant API, you can use the embeddings or completions model providers. Note that a retriever needs to encode first, so you can only use the embeddings model provider with a retriever.

## Why use an OpenAI compatible API?

@@ -21,23 +21,23 @@ The starting point for this process is creating a model. When you create a model
```sql
select aidb.create_model(
    'my_local_ollama',
-    'openai_embeddings',
-    '{"model":"llama3.3", "url":"http://llama.local:11434/v1/embeddings", "dimensions":8192}'::JSONB,
+    'embeddings',
+    '{"model":"llama3.1", "url":"http://llama.local:11434/v1/embeddings", "dimensions":2000}'::JSONB,
    '{"api_key":""}'::JSONB);
```

### Model name and model provider

The model name is the first parameter and is set to "my_local_ollama", which we will use later.

-We specify the model provider as "openai_embeddings", which is the provider that defaults to using OpenAI servers but can be overridden by the configuration (the next parameter) to talk to any compliant server.
+We specify the model provider as "embeddings", which is the provider that defaults to using OpenAI servers but can be overridden by the configuration (the next parameter) to talk to any compliant server.

### Configuration

The next parameter is the configuration. This is a JSON string which, when expanded, has three parameters: the model, the url, and the dimensions.

```json
-'{"model":"llama3.3", "url":"http://llama.local:11434/v1/embeddings", "dimensions":8192}'::JSONB
+'{"model":"llama3.1", "url":"http://llama.local:11434/v1/embeddings", "dimensions":2000}'::JSONB
```

In this case, we are setting the model to ["llama3.3"](https://ollama.com/library/llama3.3), a relatively new and powerful model. Remember to run `ollama run llama3.3` to pull and start the model on the server.
@@ -48,15 +48,15 @@ The next json setting is the important one, overriding the endpoint that the aid
* It has port 11434 (the default port for Ollama) open to service requests over HTTP (not HTTPS in this case).
* The path to the endpoint on the server is `/v1/embeddings`, the same as OpenAI.

Putting those components together, we get `http://llama.local:11434/v1/embeddings` as our endpoint.
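
For reference, the request body an OpenAI compatible endpoint expects at that path is a JSON POST with `model` and `input` fields, per the OpenAI embeddings API (the URL and model name below are the ones from the configuration above):

```json
{
    "model": "llama3.1",
    "input": "The quick brown fox jumps over the lazy dog."
}
```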

-The last JSON parameter in this example is "dimensions" which is a hint to the system about how many vector values to expect from the model. If we [look up llama3.3's properties](https://ollama.com/library/llama3.3/blobs/4824460d29f2) we can see the `llama.embedding_length` value is 8192. The provider defaults to 1536 (with some hard-wired exceptions depending on model) but it doesn't know about llama3.3, so we have to pass the dimension value of 8192 in the configuration.
+The last JSON parameter in this example is "dimensions", which is a hint to the system about how many vector values to expect from the model. If we [look up llama3.3's properties](https://ollama.com/library/llama3.3/blobs/4824460d29f2), we can see the `llama.embedding_length` value is 8192. The provider defaults to 1536 (with some hard-wired exceptions depending on the model), but it doesn't know about llama3.3's maximum. Another factor is that [pgvector is limited to 2000 dimensions](https://github.com/pgvector/pgvector?tab=readme-ov-file#what-if-i-want-to-index-vectors-with-more-than-2000-dimensions). So we pass a dimension value of 2000 in the configuration, to get the maximum dimensions available with pgvector.
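
On the Postgres side, a 2000-dimension embedding column and index can be declared to match (a sketch; the table and column names are illustrative):

```sql
-- Hypothetical table storing 2000-dimension embeddings.
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    body text,
    embedding vector(2000)
);

-- HNSW index; pgvector supports indexing vectors of up to 2000 dimensions.
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
```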

That completes the configuration parameter.

### Credentials

-The last parameter is the credentials parameter, which is another JSON string. It's usually used for carrying the `api_key` for the OpenAI service and any other necessary credential information. It is not part of the configuration and by being separate, it can be securely hidden from users with lesser permissions. For our ollama connection, we don't need an api_key, but the model provider currently requires that one is specified. We can specify an empty string for the api_key to satisfy this requirement.
+The last parameter is the credentials parameter, which is another JSON string. It's usually used to carry the `api_key` for the OpenAI service and any other necessary credential information. It isn't part of the configuration; by being separate, it can be securely hidden from users with lesser permissions. For our ollama connection, we don't need an `api_key`, but the model provider currently requires that one is specified. We can specify an empty string for the `api_key` to satisfy this requirement.

## Using the model
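
As a sketch of what follows, the registered model can be exercised directly with aidb's encode primitive (this assumes an `aidb.encode_text(model, text)` form analogous to the decode and rerank primitives shown elsewhere in these docs; the input string is illustrative):

```sql
-- Ask the Ollama-backed model for an embedding of a single string.
SELECT aidb.encode_text('my_local_ollama', 'The quick brown fox jumps over the lazy dog.');
```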

advocacy_docs/edb-postgres-ai/ai-accelerator/models/primitives.mdx

Lines changed: 14 additions & 0 deletions
@@ -65,3 +65,17 @@ select aidb.decode_text_batch('my_bert_model', ARRAY[
    'summarize: The missile knows where it is at all times. It knows this because it knows where it isn''t. By subtracting where it is from where it isn''t, or where it isn''t from where it is (whichever is greater), it obtains a difference, or deviation. The guidance subsystem uses deviations to generate corrective commands to drive the missile from a position where it is to a position where it isn''t, and arriving at a position where it wasn''t, it now is.'
]);
```
+
+## Rerank Text
+
+Call `aidb.rerank_text` to get text reranking logits.
+
+```sql
+SELECT aidb.rerank_text('my_reranking_model',
+    'What is the best open source database?',
+    ARRAY[
+        'PostgreSQL',
+        'The quick brown fox jumps over the lazy dog.',
+        'Hercule Poirot'
+    ]);
+```
Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
---
title: "Completions"
navTitle: "Completions"
description: "Completions is a text completion model that enables use of any OpenAI API compatible text generation model."
---

Model name: `completions`

Model aliases:

* `openai_completions`
* `nim_completions`

## About Completions

Completions enables the use of any OpenAI API compatible text generation model.

It is suitable for chat/text transforms, text completion, and other text generation tasks.

Depending on the name of the model, the model provider sets defaults accordingly.

When invoked as `completions` or `openai_completions`, the model provider defaults to using the OpenAI API.

When invoked as `nim_completions`, the model provider defaults to using the NVIDIA NIM API.

## Supported aidb operations

* decode_text
* decode_text_batch

## Supported models

* Any text generation model that is supported by the provider.

## Supported OpenAI models

See a list of supported OpenAI models [here](https://platform.openai.com/docs/models#models-overview).

## Supported NIM models

* [ibm/granite-guardian-3.0-8b](https://build.nvidia.com/ibm/granite-guardian-3_0-8b)
* [ibm/granite-3.0-8b-instruct](https://build.nvidia.com/ibm/granite-3_0-8b-instruct)
* [ibm/granite-3.0-3b-a800m-instruct](https://build.nvidia.com/ibm/granite-3_0-3b-a800m-instruct)
* [meta/llama-3.3-70b-instruct](https://build.nvidia.com/meta/llama-3_3-70b-instruct)
* [meta/llama-3.2-3b-instruct](https://build.nvidia.com/meta/llama-3.2-3b-instruct)
* [meta/llama-3.2-1b-instruct](https://build.nvidia.com/meta/llama-3.2-1b-instruct)
* [meta/llama-3.1-405b-instruct](https://build.nvidia.com/meta/llama-3_1-405b-instruct)
* [meta/llama-3.1-70b-instruct](https://build.nvidia.com/meta/llama-3_1-70b-instruct)
* [meta/llama-3.1-8b-instruct](https://build.nvidia.com/meta/llama-3_1-8b-instruct)
* [meta/llama3-70b-instruct](https://build.nvidia.com/meta/llama3-70b)
* [meta/llama3-8b-instruct](https://build.nvidia.com/meta/llama3-8b)
* [nvidia/llama-3.1-nemotron-70b-instruct](https://build.nvidia.com/nvidia/llama-3_1-nemotron-70b-instruct)
* [nvidia/llama-3.1-nemotron-51b-instruct](https://build.nvidia.com/nvidia/llama-3_1-nemotron-51b-instruct)
* [nvidia/nemotron-mini-4b-instruct](https://build.nvidia.com/nvidia/nemotron-mini-4b-instruct)
* [nvidia/nemotron-4-340b-instruct](https://build.nvidia.com/nvidia/nemotron-4-340b-instruct)
* [google/shieldgemma-9b](https://build.nvidia.com/google/shieldgemma-9b)
* [google/gemma-7b](https://build.nvidia.com/google/gemma-7b)
* [google/codegemma-7b](https://build.nvidia.com/google/codegemma-7b)

## Creating the default model

There is no default model for Completions. You can create any supported model using the `aidb.create_model` function.

## Creating an OpenAI model

You can create any supported OpenAI model using the `aidb.create_model` function.

In this example, we are creating a GPT-4o model with the name `my_openai_model`:

```sql
SELECT aidb.create_model(
    'my_openai_model',
    'openai_completions',
    '{"model": "gpt-4o"}'::JSONB,
    '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
);
```

## Creating a NIM model

```sql
SELECT aidb.create_model(
    'my_nim_completions',
    'nim_completions',
    '{"model": "meta/llama-3.2-1b-instruct"}'::JSONB,
    credentials=>'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB);
```

## Model configuration settings

The following configuration settings are available for OpenAI models:

* `model` - The model to use.
* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL.
    * If `openai_completions` (or `completions`) is the model, `url` defaults to `https://api.openai.com/v1/chat/completions`.
    * If `nim_completions` is the model, `url` defaults to `https://integrate.api.nvidia.com/v1/chat/completions`.
* `max_concurrent_requests` - The maximum number of concurrent requests to make to the OpenAI model. Defaults to `25`.

## Model credentials

The following credentials are required for these models:

* `api_key` - The API key to use for authentication.
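
Once created, a completions model can be exercised through the decode operations listed above (a sketch; this assumes an `aidb.decode_text(model, text)` form analogous to the `aidb.decode_text_batch` example in the primitives docs, and the prompt is illustrative):

```sql
-- Generate text with the NIM-backed completions model created above.
SELECT aidb.decode_text('my_nim_completions', 'Tell me about PostgreSQL in one sentence.');
```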

advocacy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/openai-embeddings.mdx renamed to advocacy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/embeddings.mdx

Lines changed: 29 additions & 24 deletions
@@ -1,16 +1,25 @@
---
-title: "OpenAI Embeddings"
-navTitle: "OpenAI Embeddings"
-description: "OpenAI Embeddings is a text embedding model that enables use of any OpenAI text embedding model."
+title: "Embeddings"
+navTitle: "Embeddings"
+description: "Embeddings is a text embedding model that enables use of any OpenAI API compatible text embedding model."
---

-Model name: `openai_embeddings`
+Model name: `embeddings`

-## About OpenAI Embeddings
+Model aliases:

-OpenAI Embeddings is a text embedding model that enables use of any supported OpenAI text embedding model. It is suitable for text classification, clustering, and other text embedding tasks.
+* `openai_embeddings`
+* `nim_embeddings`

-See a list of supported OpenAI models [here](https://platform.openai.com/docs/guides/embeddings#embedding-models).
+## About Embeddings
+
+Embeddings is a text embedding model that enables use of any OpenAI API compatible text embedding model. It is suitable for text classification, clustering, and other text embedding tasks.
+
+Depending on the name of the model, the model provider sets defaults accordingly.
+
+When invoked as `embeddings` or `openai_embeddings`, the model provider defaults to using the OpenAI API.
+
+When invoked as `nim_embeddings`, the model provider defaults to using the NVIDIA NIM API.

## Supported aidb operations

@@ -19,10 +28,18 @@
## Supported models

-* Any text embedding model that is supported by OpenAI. This includes `text-embedding-3-small`, `text-embedding-3-large`, and `text-embedding-ada-002`.
+* Any text embedding model that is supported by the provider.
+
+### Supported OpenAI models
+
+* Any text embedding model that is supported by OpenAI. This includes `text-embedding-3-small`, `text-embedding-3-large`, and `text-embedding-ada-002`. See a list of supported OpenAI models [here](https://platform.openai.com/docs/guides/embeddings#embedding-models).
* Defaults to `text-embedding-3-small`.

-## Creating the default model
+### Supported NIM models
+
+* [nvidia/nv-embedqa-e5-v5](https://build.nvidia.com/nvidia/nv-embedqa-e5-v5) (default)
+
+## Creating the default model with OpenAI

```sql
SELECT aidb.create_model('my_openai_embeddings',
@@ -52,23 +69,11 @@ Because we are passing the configuration options and the credentials, unlike the
The following configuration settings are available for OpenAI models:

* `model` - The OpenAI model to use.
-* `url` - The URL of the OpenAI model to use. This is optional and can be used to specify a custom model URL. Defaults to `https://api.openai.com/v1/chat/completions`.
+* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL.
+    * If `openai_embeddings` (or `embeddings`) is the model, `url` defaults to `https://api.openai.com/v1/embeddings`.
+    * If `nim_embeddings` is the model, `url` defaults to `https://integrate.api.nvidia.com/v1/embeddings`.
* `max_concurrent_requests` - The maximum number of concurrent requests to make to the OpenAI model. Defaults to `25`.

-## Available OpenAI Embeddings models
-
-* sentence-transformers/all-MiniLM-L6-v2 (default)
-* sentence-transformers/all-MiniLM-L6-v1
-* sentence-transformers/all-MiniLM-L12-v1
-* sentence-transformers/msmarco-bert-base-dot-v5
-* sentence-transformers/multi-qa-MiniLM-L6-dot-v1
-* sentence-transformers/paraphrase-TinyBERT-L6-v2
-* sentence-transformers/all-distilroberta-v1
-* sentence-transformers/all-MiniLM-L6-v2
-* sentence-transformers/multi-qa-MiniLM-L6-cos-v1
-* sentence-transformers/paraphrase-multilingual-mpnet-base-v2
-* sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
-
## Model credentials

The following credentials are required for OpenAI models:
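
For the NIM route, the creation call mirrors the NIM example in the Completions docs (a sketch; the model name `my_nim_embeddings` and the placeholder key are illustrative):

```sql
SELECT aidb.create_model(
    'my_nim_embeddings',
    'nim_embeddings',
    '{"model": "nvidia/nv-embedqa-e5-v5"}'::JSONB,
    credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB);
```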

advocacy_docs/edb-postgres-ai/ai-accelerator/models/supported-models/index.mdx

Lines changed: 8 additions & 5 deletions
@@ -12,8 +12,11 @@ navigation:

This section provides details of the supported models in EDB Postgres AI - AI Accelerator - Pipelines and their capabilities.

-* [T5](t5)
-* [OpenAI Embeddings](openai-embeddings)
-* [OpenAI Completions](openai-completions)
-* [BERT](bert)
-* [CLIP](clip)
+* [T5](t5)
+* [Embeddings](embeddings), including openai_embeddings and nim_embeddings.
+* [Completions](completions), including openai_completions and nim_completions.
+* [BERT](bert)
+* [CLIP](clip)
+* [NIM_CLIP](nim_clip)
+* [NIM_RERANKING](nim_reranking)
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
---
title: "CLIP"
navTitle: "CLIP"
description: "CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision."
---

Model name: `nim_clip`

## About CLIP

CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision. It is a zero-shot learning model that can be used for a wide range of vision and language tasks.

This specific model runs on NVIDIA NIM. More information about CLIP on NIM can be found [here](https://build.nvidia.com/nvidia/nvclip).

## Supported aidb operations

* encode_text
* encode_text_batch
* encode_image
* encode_image_batch

## Supported models

### NVIDIA NGC

* nvidia/nvclip (default)

## Creating the default model

```sql
SELECT aidb.create_model(
    'my_nim_clip_model',
    'nim_clip',
    credentials=>'{"api_key": "<API_KEY_HERE>"}'::JSONB
);
```

There is only one model, the default `nvidia/nvclip`, so we do not need to specify the model in the configuration.

## Model configuration settings

The following configuration settings are available for CLIP models:

* `model` - The NIM model to use. The default is `nvidia/nvclip`, and it is the only model available.
* `url` - The URL of the model to use. This is optional and can be used to specify a custom model URL. Defaults to `https://integrate.api.nvidia.com/v1/embeddings`.
* `dimensions` - Model output vector size. Defaults to `1024`.

## Model credentials

The following credentials are required if executing inside NVIDIA NGC:

* `api_key` - The NVIDIA Cloud API key to use for authentication.
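
With the model registered, the encode operations listed above embed text (and images) into CLIP's shared vector space. A minimal sketch, assuming an `aidb.encode_text(model, text)` form analogous to the other aidb primitives, with an illustrative input string:

```sql
-- Embed a text label with the CLIP model created above.
SELECT aidb.encode_text('my_nim_clip_model', 'a photo of a cat');
```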
