Commit 0fc435f

Merge branch 'main' into edgarrmondragon/chore/refactors-typos-cleanup
2 parents a2ff536 + 557c9da commit 0fc435f

10 files changed: +273 -71 lines changed

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ repos:
1818
- id: trailing-whitespace
1919

2020
- repo: https://github.com/astral-sh/ruff-pre-commit
21-
rev: v0.1.14
21+
rev: v0.2.0
2222
hooks:
2323
- id: ruff
2424
args: [--fix]

README.md

Lines changed: 60 additions & 1 deletion
@@ -110,6 +110,17 @@ pre-commit install
110110

111111
### Create and Run Tests
112112

113+
Set the permissions on the SSL key files:
114+
115+
```bash
116+
chmod 0600 .ssl/*.key
117+
```
118+
119+
Start the test databases using Docker Compose:
120+
```bash
121+
docker-compose up -d
122+
```
123+
113124
Create tests within the `target_postgres/tests` subfolder and
114125
then run:
115126

@@ -163,7 +174,7 @@ The below table shows how this tap will map between jsonschema datatypes and Pos
163174
| UNSUPPORTED | bit varying [ (n) ] |
164175
| boolean | boolean |
165176
| UNSUPPORTED | box |
166-
| UNSUPPORTED | bytea |
177+
| string with contentEncoding="base16" ([opt-in feature](#content-encoding-support)) | bytea |
167178
| UNSUPPORTED | character [ (n) ] |
168179
| UNSUPPORTED | character varying [ (n) ] |
169180
| UNSUPPORTED | cidr |
@@ -204,6 +215,7 @@ The below table shows how this tap will map between jsonschema datatypes and Pos
204215
Note that while object types are mapped directly to jsonb, array types are mapped to a jsonb array.
205216

206217
If a column has multiple jsonschema types, the following order is used to choose the Postgres type, from highest priority to lowest priority.
218+
- BYTEA
207219
- ARRAY(JSONB)
208220
- JSONB
209221
- TEXT
@@ -216,3 +228,50 @@ If a column has multiple jsonschema types, the following order is using to order
216228
- INTEGER
217229
- BOOLEAN
218230
- NOTYPE
231+
232+
## Content Encoding Support
233+
234+
JSON Schema supports the [`contentEncoding` keyword](https://json-schema.org/draft/2020-12/draft-bhutton-json-schema-validation-00#rfc.section.8.3), which can be used to specify the encoding of input string types.
235+
236+
This target can use content encoding hints in the schema to store the data in Postgres more efficiently.
237+
238+
Content encoding interpretation is disabled by default. This is because the default config is meant to be as permissive as possible, and does not make any assumptions about the data that could lead to data loss.
239+
240+
However, if you know that your data respects the advertised content encoding, you can enable this feature to get better performance and storage efficiency.
241+
242+
To enable it, set the `interpret_content_encoding` option to `True`.
243+
244+
### base16
245+
246+
The string is encoded using the base16 encoding, as defined in [RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648#section-8).
248+
249+
Example schema:
250+
```json
251+
{
252+
"type": "object",
253+
"properties": {
254+
"my_hex": {
255+
"type": "string",
256+
"contentEncoding": "base16"
257+
}
258+
}
259+
}
260+
```
261+
262+
Data will be stored as a `bytea` in the database.
263+
264+
Example data:
265+
```json
266+
# valid data
267+
{ "my_hex": "01AF" }
268+
{ "my_hex": "01af" }
269+
{ "my_hex": "1af" }
270+
{ "my_hex": "0x1234" }
271+
272+
# invalid data
273+
{ "my_hex": " 0x1234 " }
274+
{ "my_hex": "House" }
275+
```
276+
277+
For convenience, data prefixed with `0x` or containing an odd number of characters is also supported, although this is not part of the standard.
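
The decoding rules described in this section can be illustrated with a minimal Python sketch. The helper name `decode_base16` is hypothetical and this is not necessarily how the target implements the feature. It assumes that an odd-length value is padded with a leading zero, which the text above does not state explicitly.

```python
import string

# Characters accepted as base16 digits (both upper and lower case).
_HEX_DIGITS = frozenset(string.hexdigits)


def decode_base16(value: str) -> bytes:
    """Decode a base16 string into bytes (hypothetical helper, for illustration only)."""
    # Accept an optional "0x" prefix, which is not part of RFC 4648.
    digits = value[2:] if value.startswith("0x") else value
    # Reject anything that is not a hex digit, including surrounding whitespace.
    if not all(ch in _HEX_DIGITS for ch in digits):
        raise ValueError(f"not a valid base16 string: {value!r}")
    # Assumption: odd-length input is left-padded with a zero before decoding.
    if len(digits) % 2:
        digits = "0" + digits
    return bytes.fromhex(digits)
```

With these assumptions, the valid examples above decode to bytes suitable for a `bytea` column, while `" 0x1234 "` and `"House"` raise a `ValueError`.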

poetry.lock

Lines changed: 32 additions & 53 deletions
Some generated files are not rendered by default.

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ sqlalchemy = "~=2.0"
3737
sshtunnel = "0.4.0"
3838

3939
[tool.poetry.dependencies.singer-sdk]
40-
version = "~=0.34.0"
40+
version = "~=0.35.0"
4141

4242
[tool.poetry.group.dev.dependencies]
4343
pytest = ">=7.4.2"
