Schema Types
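When creating a Parquet file using parquet-java, you specify a schema type for each data field. The schema type sets the field's repetition (required, optional, or repeated) and whether the field is a single value or a group of nested fields. The available schema types are: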
- requiredGroup
- repeatedGroup
- required
- repeated
- optionalGroup
- optional
Below is a brief explanation of each schema type.
A requiredGroup is a group of fields (a nested schema) that must be present in every record and cannot be null.
The fields within this group can have their own repetition levels (required, optional, or repeated).
A repeatedGroup represents a group that can occur zero or more times, effectively modeling a list or array of nested records.
A required field must be present in every record and cannot be null, so it always contains a value.
A repeated field can have zero or more values, modeling a list or array of values of the same type.
An optionalGroup is a group of fields that may or may not be present in a record. The entire group can be null.
An optional field may or may not be present in a record. If the field is not present, it is considered null.
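To make the six schema types concrete, here is a minimal sketch of a schema written in Parquet's textual message-type notation, with one field per schema type. The message name and all field names (id, name, scores, location, contact, events) are hypothetical and chosen only for illustration; they are not taken from this project.

```
message hypothetical_example {
  required int64 id;
  optional binary name;
  repeated int32 scores;
  required group location {
    required double latitude;
    optional double altitude;
  }
  optional group contact {
    required binary email;
  }
  repeated group events {
    required int64 timestamp;
    optional binary label;
  }
}
```

In this sketch, id must appear in every record, name may be null, and scores may hold zero or more values. The location group is always present, contact may be null as a whole, and events may occur zero or more times. Fields inside a group keep their own repetition, so altitude can be null even though location itself is required.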
Developed and maintained by the Altinity team.