+- Data with `sv` field (ste_vec) - JSONB containment uses GIN indexes instead (see [GIN Indexes](#gin-indexes-for-jsonb-containment))
 - Data without any index terms

 ### 2. Index Creation Timing
@@ -326,6 +327,89 @@ DROP INDEX IF EXISTS idx_users_encrypted_email;

 ---

+## GIN Indexes for JSONB Containment
+
+While B-tree indexes don't support `ste_vec` (JSONB containment), you can use PostgreSQL GIN indexes for efficient containment queries on encrypted JSONB columns.
+
+### When to Use GIN Indexes
+
+Use GIN indexes when:
+- You need to perform JSONB containment queries (`@>`, `<@`)
+- The table has a significant number of rows (500+ recommended)
+- Query performance on containment operations is important
+
+### Creating a GIN Index
+
+Create a GIN index using the `jsonb_array()` function, which extracts the encrypted JSONB as a native `jsonb[]` array:
+
+```sql
+CREATE INDEX idx_encrypted_jsonb_gin
+ON table_name USING GIN (eql_v2.jsonb_array(encrypted_column));
+
+ANALYZE table_name;
+```
+
+**Important:** Always run `ANALYZE` after creating the index so PostgreSQL's query planner has accurate statistics.
+
+### Query Patterns for GIN Indexes
+
+There are two approaches to write containment queries that use GIN indexes:
+
+#### Approach 1: Using jsonb_array() Function
+
+Convert both sides to `jsonb[]` and use the native containment operator:
+
+```sql
+SELECT * FROM table_name
+WHERE eql_v2.jsonb_array(encrypted_column) @>
+      eql_v2.jsonb_array($1::eql_v2_encrypted);
+```
+
+#### Approach 2: Using Helper Function
+
+Use the convenience function which handles the conversion internally:
 | `cast_as` | The PostgreSQL type decrypted data will be cast to | Optional. Defaults to `text` |
 | `opts` | Index options | Optional for `match` indexes, required for `ste_vec` indexes (see below) |
+| `migrating` | Skip auto-migration if true | Optional. Defaults to `false`. Set to `true` for batch operations |

 #### Option (`cast_as`)

@@ -60,33 +61,33 @@ The default match index options are:
   "tokenizer": {
     "kind": "ngram",
     "token_length": 3
-  }
-  "token_filters": {
-    "kind": "downcase"
-  }
+  },
+  "token_filters": [
+    {"kind": "downcase"}
+  ]
 }
 ```

-- `tokenFilters`: a list of filters to apply to normalize tokens before indexing.
+- `token_filters`: a list of filters to apply to normalize tokens before indexing.
 - `tokenizer`: determines how input text is split into tokens.
-- `m`: The size of the backing [bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) in bits. Defaults to `2048`.
+- `bf`: The size of the backing [bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) in bits. Defaults to `2048`.
 - `k`: The maximum number of bits set in the bloom filter per term. Defaults to `6`.

 **Token filters**

-There are currently only two token filters available: `downcase` and `upcase`. These are used to normalise the text before indexing and are also applied to query terms. An empty array can also be passed to `tokenFilters` if no normalisation of terms is required.
+The `downcase` token filter is available to normalise text before indexing and is also applied to query terms. An empty array can also be passed to `token_filters` if no normalisation of terms is required.

 **Tokenizer**

 There are two `tokenizer`s provided: `standard` and `ngram`.
 `standard` simply splits text into tokens using this regular expression: `/[ ,;:!]/`.
 `ngram` splits the text into n-grams and accepts a configuration object that allows you to specify the `tokenLength`.
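
To make the two tokenizers and the `downcase` filter concrete, here is a minimal Python sketch. It is illustrative only and is not EQL's implementation: the function names are hypothetical, and dropping the empty strings produced by adjacent delimiters is an assumption.

```python
import re


def standard_tokenize(text: str) -> list[str]:
    """Split on the delimiter set quoted in the docs: space , ; : !"""
    # Assumption: empty strings from adjacent delimiters are discarded.
    return [token for token in re.split(r"[ ,;:!]", text) if token]


def ngram_tokenize(text: str, token_length: int = 3) -> list[str]:
    """Return every contiguous substring of exactly token_length characters."""
    return [text[i:i + token_length] for i in range(len(text) - token_length + 1)]


def downcase(tokens: list[str]) -> list[str]:
    """Token filter: lower-case every token (applied to data and query terms alike)."""
    return [token.lower() for token in tokens]


print(downcase(standard_tokenize("Hello, World!")))  # ['hello', 'world']
print(downcase(ngram_tokenize("Hello")))             # ['hel', 'ell', 'llo']
print(ngram_tokenize("ab"))                          # []
```

The last call shows why a search string shorter than the token length cannot match anything under the n-gram scheme: it produces no tokens at all.
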

-**m** and **k**
+**bf** and **k**

-`k` and `m` are optional fields for configuring [bloom filters](https://en.wikipedia.org/wiki/Bloom_filter) that back full text search.
+`k` and `bf` are optional fields for configuring [bloom filters](https://en.wikipedia.org/wiki/Bloom_filter) that back full text search.

-`m` is the size of the bloom filter in bits. `filterSize` must be a power of 2 between `32` and `65536` and defaults to `2048`.
+`bf` is the size of the bloom filter in bits. It must be a power of 2 between `32` and `65536` and defaults to `2048`.

 `k` is the number of hash functions to use per term.
 This determines the maximum number of bits that will be set in the bloom filter per term.
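
The roles of `bf` and `k` can be sketched in a few lines of Python. This is a conceptual model under stated assumptions, not the hashing EQL actually uses: SHA-256 over an index-prefixed term stands in for the real keyed hash, and all function names are hypothetical.

```python
import hashlib


def bloom_bits(term: str, bf: int = 2048, k: int = 6) -> set[int]:
    """Bit positions set for one term: at most k, because hashes may collide."""
    if bf < 32 or bf > 65536 or (bf & (bf - 1)) != 0:
        raise ValueError("bf must be a power of 2 between 32 and 65536")
    positions = set()
    for i in range(k):
        # Assumption: SHA-256 of "i:term" is a stand-in for the i-th hash function.
        digest = hashlib.sha256(f"{i}:{term}".encode()).digest()
        positions.add(int.from_bytes(digest[:8], "big") % bf)
    return positions


def bloom_filter(terms, bf: int = 2048, k: int = 6) -> set[int]:
    """Union of the bit positions of every term."""
    bits = set()
    for term in terms:
        bits |= bloom_bits(term, bf, k)
    return bits


def may_match(index_bits: set[int], query_terms, bf: int = 2048, k: int = 6) -> bool:
    """True if every bit of the query filter is set in the indexed filter."""
    return bloom_filter(query_terms, bf, k) <= index_bits


indexed = bloom_filter(["hel", "ell", "llo"])
print(may_match(indexed, ["ell"]))   # True
print(len(bloom_bits("hel")) <= 6)   # True
```

A term that was indexed always matches, while a term that was not indexed may still match by chance. A larger `bf` lowers that false-positive rate at the cost of a larger filter.
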
@@ -103,7 +104,9 @@ Try to ensure that the string you search for is at least as long as the `tokenLe

 #### Options for ste_vec indexes (`opts`)

-An ste_vec index on a encrypted JSONB column enables the use of PostgreSQL's `@>` and `<@` [containment operators](https://www.postgresql.org/docs/16/functions-json.html#FUNCTIONS-JSONB-OP-TABLE).
+An ste_vec index on an encrypted JSONB column enables the use of PostgreSQL's `@>` and `<@` [containment operators](https://www.postgresql.org/docs/16/functions-json.html#FUNCTIONS-JSONB-OP-TABLE).
+
+> **Note:** The `@>` and `<@` operators work directly on `eql_v2_encrypted` types, allowing simple query syntax like `encrypted_col @> search_term`.

 An ste_vec index requires one piece of configuration: the `prefix` (a string) which is passed as an info string to a MAC (Message Authentication Code).
 This ensures that all of the encrypted values are unique to that prefix.
@@ -204,7 +207,7 @@ A query prior to encrypting and indexing looks like a structurally similar subse
 }
 ```

-The expression `cs_ste_vec_v2(encrypted_account) @> cs_ste_vec_v2($query)` would match all records where the `encrypted_account` column contains a JSONB object with an "account" key containing an object with an "email" key where the value is the string "[email protected]".
+The expression `encrypted_account @> $query` would match all records where the `encrypted_account` column contains a JSONB object with an "account" key containing an object with an "email" key where the value is the string "[email protected]".

 When reduced to a prefix list, it would look like this:

@@ -224,9 +227,26 @@ When reduced to a prefix list, it would look like this:

 Which is then turned into an ste_vec of hashes which can be directly queried against the index.
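
The idea of reducing a document to a prefix list, hashing each entry, and testing containment as a subset check can be sketched in Python. This is a rough model under stated assumptions, not the actual ste_vec encoding: only leaf values are flattened, HMAC-SHA256 stands in for the MAC, and the key, the prefix strings, the sample email address, and the function names are all hypothetical.

```python
import hashlib
import hmac
import json


def prefix_list(value, path=()):
    """Flatten nested JSON objects into (path, leaf) pairs, one per leaf value."""
    if isinstance(value, dict):
        terms = []
        for key, child in value.items():
            terms.extend(prefix_list(child, path + (key,)))
        return terms
    return [(path, value)]


def ste_vec_sketch(doc, key: bytes, prefix: str) -> set[str]:
    """MAC every flattened term; mixing in the prefix makes tags unique to it."""
    tags = set()
    for path, leaf in prefix_list(doc):
        message = json.dumps([prefix, list(path), leaf]).encode()
        tags.add(hmac.new(key, message, hashlib.sha256).hexdigest())
    return tags


key = b"demo-key"  # hypothetical key, for illustration only
record = {"account": {"email": "alice@example.com", "name": "Alice"}}
query = {"account": {"email": "alice@example.com"}}

stored = ste_vec_sketch(record, key, "users/encrypted_account")
wanted = ste_vec_sketch(query, key, "users/encrypted_account")

print(wanted <= stored)                                       # True
print(wanted <= ste_vec_sketch(record, key, "other/prefix"))  # False
```

The second check fails because the same data hashed under a different prefix yields different tags, which is the property the `prefix` option is described as providing.
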

+#### GIN indexing for ste_vec
+
+For efficient containment queries on large tables, you can create a GIN index using the `eql_v2.jsonb_array()` function:
+
+```sql
+-- Create GIN index for containment queries
+CREATE INDEX idx_encrypted_jsonb ON mytable USING GIN (eql_v2.jsonb_array(encrypted_col));
+
+-- Query using containment (will use the GIN index)
+SELECT * FROM mytable WHERE encrypted_col @> $1::eql_v2_encrypted;
+```
+
+The following helper functions are available for GIN-indexed containment queries:
+- `eql_v2.jsonb_array(val)` - Extracts encrypted JSONB as an array for GIN indexing
+-- Extract array element by index (0-based, returns eql_v2_encrypted)
+SELECT encrypted_array->0 FROM examples;
 ```

+**Note:** The `->` operator supports integer array indexing (e.g., `encrypted_array->0`), but the `->>` operator does not. Use `->` to access array elements by index.
+
 ### Array operations

 EQL supports array operations on encrypted JSONB arrays:
@@ -200,6 +223,9 @@ GROUP BY eql_v2.jsonb_path_query_first(encrypted_json, 'color_selector');
 - Extracts the selector hash from an encrypted value

+### GIN-Indexable Functions
+
+These functions enable efficient GIN-indexed containment queries. See [GIN Indexes for JSONB Containment](./database-indexes.md#gin-indexes-for-jsonb-containment) for index setup.