
Add a way to show the contents of the FileStatisticsCache in datafusion-cli #18953

@alamb

Is your feature request related to a problem or challenge?

I am trying to understand if and when the FileStatisticsCache is used.

However, there is currently no way to inspect the contents of that cache.

@nuno-faria made a really nice feature to view the contents of the metadata cache: metadata_cache()

For example:

> select * from metadata_cache();
+---------------------------------------------------+---------------------+-----------------+--------------------------------------+---------+---------------------+------+------------------+
| path                                              | file_modified       | file_size_bytes | e_tag                                | version | metadata_size_bytes | hits | extra            |
+---------------------------------------------------+---------------------+-----------------+--------------------------------------+---------+---------------------+------+------------------+
| hits_compatible/athena_partitioned/hits_1.parquet | 2022-07-03T15:33:57 | 174965044       | "1f5da68e097309811a675c849491ac48-9" | NULL    | 165128              | 0    | page_index=false |
+---------------------------------------------------+---------------------+-----------------+--------------------------------------+---------+---------------------+------+------------------+
1 row(s) fetched.
Elapsed 0.005 seconds.

Describe the solution you'd like

I would like a table function similar to metadata_cache() for the statistics cache.

Something like:

select * from statistics_cache();
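+----------+---------------------+-----------------+-------+-----------------------+
| path     | file_modified       | file_size_bytes | e_tag | statistics_size_bytes |
+----------+---------------------+-----------------+-------+-----------------------+
| /foo/bar | 2022-07-03T15:33:57 | 1234            | ...   | 132                   |
| /foo/baz | 2022-07-03T15:33:57 | 5678            | ...   | 3112                  |
| ...      | ...                 | ...             | ...   | ...                   |
+----------+---------------------+-----------------+-------+-----------------------+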

Where statistics_size_bytes shows the size of the cached statistics for that file, in bytes.
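
To make the proposed output shape concrete, here is a rough, std-only sketch of the rows such a function might return. Everything in it is hypothetical: `ToyStatisticsCache`, `CachedEntry`, and `StatisticsCacheRow` are made-up names for illustration and are not the DataFusion FileStatisticsCache API, and the `e_tag` values are left as `None` because the example above elides them.

```rust
use std::collections::BTreeMap;

/// Hypothetical row of the proposed `statistics_cache()` output.
#[derive(Debug, Clone, PartialEq)]
struct StatisticsCacheRow {
    path: String,
    file_modified: String,
    file_size_bytes: u64,
    e_tag: Option<String>,
    statistics_size_bytes: usize,
}

/// Hypothetical cached entry: file metadata plus the size of its statistics.
struct CachedEntry {
    file_modified: String,
    file_size_bytes: u64,
    e_tag: Option<String>,
    statistics_size_bytes: usize,
}

/// Toy stand-in for the cache, keyed by file path (not the real implementation).
#[derive(Default)]
struct ToyStatisticsCache {
    entries: BTreeMap<String, CachedEntry>,
}

impl ToyStatisticsCache {
    fn insert(&mut self, path: &str, entry: CachedEntry) {
        self.entries.insert(path.to_string(), entry);
    }

    /// One row per cached file; BTreeMap iteration yields rows sorted by path.
    fn rows(&self) -> Vec<StatisticsCacheRow> {
        self.entries
            .iter()
            .map(|(path, e)| StatisticsCacheRow {
                path: path.clone(),
                file_modified: e.file_modified.clone(),
                file_size_bytes: e.file_size_bytes,
                e_tag: e.e_tag.clone(),
                statistics_size_bytes: e.statistics_size_bytes,
            })
            .collect()
    }

    /// Sum of statistics_size_bytes across all entries.
    fn total_statistics_size_bytes(&self) -> usize {
        self.entries.values().map(|e| e.statistics_size_bytes).sum()
    }
}

fn main() {
    let mut cache = ToyStatisticsCache::default();
    for (path, size, stats) in [("/foo/baz", 5678u64, 3112usize), ("/foo/bar", 1234, 132)] {
        cache.insert(path, CachedEntry {
            file_modified: "2022-07-03T15:33:57".to_string(),
            file_size_bytes: size,
            e_tag: None, // elided in the example above, so left unset here
            statistics_size_bytes: stats,
        });
    }
    for row in cache.rows() {
        println!(
            "{} {} {} {} {}",
            row.path,
            row.file_modified,
            row.file_size_bytes,
            row.e_tag.as_deref().unwrap_or("NULL"),
            row.statistics_size_bytes
        );
    }
    println!("total statistics bytes: {}", cache.total_statistics_size_bytes());
}
```

In the real feature these rows would presumably be exposed through a table function in datafusion-cli, the same way metadata_cache() is; the sketch only pins down the columns and what statistics_size_bytes would mean per entry.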

Describe alternatives you've considered

No response

Additional context

No response


Labels: enhancement (New feature or request)
