You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
### What changes were proposed in this pull request?
Support `DESCRIBE TABLE ... [AS JSON]` to optionally display table metadata in JSON format.
**SQL Ref Spec:**
{ DESC | DESCRIBE } [ TABLE ] [ EXTENDED | FORMATTED ] table_name { [ PARTITION clause ] | [ column_name ] } **[ AS JSON ]**
Output:
json_metadata: String
### Why are the changes needed?
The Spark SQL command `DESCRIBE TABLE` displays table metadata in a DataFrame format geared toward human consumption. This format causes parsing challenges, e.g. if fields contain special characters or the format changes as new features are added.
The new `AS JSON` option would return the table metadata as a JSON string that supports parsing via machine, while being extensible with a minimized risk of breaking changes. It is not meant to be human-readable.
### Does this PR introduce _any_ user-facing change?
Yes, this provides a new option to display DESCRIBE TABLE metadata in JSON format. See below (and updated golden files) for the JSON output schema:
```
{
"table_name": "<table_name>",
"catalog_name": "<catalog_name>",
"schema_name": "<innermost_schema_name>",
"namespace": ["<innermost_schema_name>"],
"type": "<table_type>",
"provider": "<provider>",
"columns": [
{
"name": "<name>",
"type": <type_json>,
"comment": "<comment>",
"nullable": <boolean>,
"default": "<default_val>"
}
],
"partition_values": {
"<col_name>": "<val>"
},
"location": "<path>",
"view_text": "<view_text>",
"view_original_text": "<view_original_text>",
"view_schema_mode": "<view_schema_mode>",
"view_catalog_and_namespace": "<view_catalog_and_namespace>",
"view_query_output_columns": ["col1", "col2"],
"owner": "<owner>",
"comment": "<comment>",
"table_properties": {
"property1": "<property1>",
"property2": "<property2>"
},
"storage_properties": {
"property1": "<property1>",
"property2": "<property2>"
},
"serde_library": "<serde_library>",
"input_format": "<input_format>",
"output_format": "<output_format>",
"num_buckets": <num_buckets>,
"bucket_columns": ["<col_name>"],
"sort_columns": ["<col_name>"],
"created_time": "<timestamp_ISO-8601>",
"last_access": "<timestamp_ISO-8601>",
"partition_provider": "<partition_provider>"
}
```
### How was this patch tested?
- Updated golden files for `describe.sql`
- Added tests in `DescribeTableParserSuite.scala`, `DescribeTableSuite.scala`, `PlanResolutionSuite.scala`
### Was this patch authored or co-authored using generative AI tooling?
Closes#49139 from asl3/asl3/describetableasjson.
Authored-by: Amanda Liu <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
0 commit comments