Document USE_LOGIC_TYPE format option for PARQUET and AVRO files #2490

Closed
wants to merge 3 commits into from
21 changes: 20 additions & 1 deletion docs/cn/sql-reference/00-sql-reference/50-file-format-options.md
@@ -36,6 +36,7 @@ formatTypeOptions ::=
ESCAPE = '<character>'
NAN_DISPLAY = '<string>'
ROW_TAG = '<string>'
USE_LOGIC_TYPE = TRUE | FALSE
COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | XZ | NONE
```

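@@ -229,6 +230,15 @@ Databend is subject to the following conditions when processing TSV files:
| `ERROR` (Default) | Generates an error if a missing field is encountered. |
| `FIELD_DEFAULT` | Uses the default value of the field for missing fields. |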

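### USE_LOGIC_TYPE (Load Only)

Controls how temporal data types (date and timestamp) are interpreted during loading.

| Available Values | Description |
|------------------|-----------------------------------------------------------------------------------------------------|
| `TRUE` (Default) | Date and timestamp values are loaded as their logical data types (DATE and TIMESTAMP). |
| `FALSE` | Date and timestamp values are loaded as raw integer values (INT32 for dates, INT64 for timestamps). |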

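### COMPRESSION (Unload Only)

Specifies the compression algorithm, which is applied to the internal blocks of the file rather than to the whole file, so the output remains in Parquet format.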
@@ -258,4 +268,13 @@ Databend is subject to the following conditions when processing TSV files:
| Available Values | Description |
|------------------|---------------------------------------------------------|
| `ERROR` (Default) | Generates an error if a missing field is encountered. |
| `FIELD_DEFAULT` | Uses the default value of the field for missing fields. |
| `FIELD_DEFAULT` | Uses the default value of the field for missing fields. |

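### USE_LOGIC_TYPE (Load Only)

Controls how temporal data types (date and timestamp) are interpreted during loading.

| Available Values | Description |
|------------------|-----------------------------------------------------------------------------------------------------|
| `TRUE` (Default) | Date and timestamp values are loaded as their logical data types (DATE and TIMESTAMP). |
| `FALSE` | Date and timestamp values are loaded as raw integer values (INT32 for dates, INT64 for timestamps). |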
21 changes: 20 additions & 1 deletion docs/en/sql-reference/00-sql-reference/50-file-format-options.md
@@ -36,6 +36,7 @@ formatTypeOptions ::=
ESCAPE = '<character>'
NAN_DISPLAY = '<string>'
ROW_TAG = '<string>'
USE_LOGIC_TYPE = TRUE | FALSE
COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | XZ | NONE
```

@@ -231,6 +232,15 @@ Determines the behavior when encountering missing fields during data loading. Re
| `ERROR` (Default)| Generates an error if a missing field is encountered. |
| `FIELD_DEFAULT` | Uses the default value of the field for missing fields. |

### USE_LOGIC_TYPE (Load Only)

Controls how temporal data types (date and timestamp) are interpreted during loading.

| Available Values | Description |
|------------------|----------------------------------------------------------------------------------------------------------------------------|
| `TRUE` (Default) | Date and timestamp values are loaded as their logical data types (DATE and TIMESTAMP). |
| `FALSE` | Date and timestamp values are loaded as raw integer values (INT32 for dates, INT64 for timestamps). |
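
To make the `FALSE` case concrete: Parquet and Avro store a date logical type as a 32-bit count of days since 1970-01-01, and a timestamp as a 64-bit count of time units since the Unix epoch. The Python sketch below converts such raw integers back to calendar values by hand. It is an illustration only, not Databend code: the function names are hypothetical, and the microsecond unit for timestamps is an assumption (a given file may use milliseconds or nanoseconds instead).

```python
from datetime import date, datetime, timedelta, timezone

# Reference points for the raw integer encodings.
EPOCH_DATE = date(1970, 1, 1)
EPOCH_TS = datetime(1970, 1, 1, tzinfo=timezone.utc)


def raw_date_to_date(days: int) -> date:
    """Interpret a raw INT32 as days since the Unix epoch (hypothetical helper)."""
    return EPOCH_DATE + timedelta(days=days)


def raw_timestamp_to_datetime(micros: int) -> datetime:
    """Interpret a raw INT64 as microseconds since the Unix epoch.

    ASSUMPTION: microsecond unit; adjust if the file uses another unit.
    """
    return EPOCH_TS + timedelta(microseconds=micros)


if __name__ == "__main__":
    # Values as they might appear when loaded with USE_LOGIC_TYPE = FALSE.
    print(raw_date_to_date(19723))
    print(raw_timestamp_to_datetime(1_704_067_200_000_000))
```

With `USE_LOGIC_TYPE = TRUE` (the default), this conversion is unnecessary because the values arrive as DATE and TIMESTAMP directly.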

### COMPRESSION (Unload Only)

Specifies the compression algorithm, which is used for compressing internal blocks of the file rather than the entire file, so the output remains in Parquet format.
@@ -262,4 +272,13 @@ Determines the behavior when encountering missing fields during data loading. Re
| Available Values | Description |
|------------------|-----------------------------------------------------------------------------------------------|
| `ERROR` (Default)| Generates an error if a missing field is encountered. |
| `FIELD_DEFAULT` | Uses the default value of the field for missing fields. |
| `FIELD_DEFAULT` | Uses the default value of the field for missing fields. |

### USE_LOGIC_TYPE (Load Only)

Controls how temporal data types (date and timestamp) are interpreted during loading.

| Available Values | Description |
|------------------|----------------------------------------------------------------------------------------------------------------------------|
| `TRUE` (Default) | Date and timestamp values are loaded as their logical data types (DATE and TIMESTAMP). |
| `FALSE` | Date and timestamp values are loaded as raw integer values (INT32 for dates, INT64 for timestamps). |