Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve coverage of tests for array expressions #1269

Open
8 tasks
andygrove opened this issue Jan 12, 2025 · 4 comments
Open
8 tasks

Improve coverage of tests for array expressions #1269

andygrove opened this issue Jan 12, 2025 · 4 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@andygrove
Copy link
Member

andygrove commented Jan 12, 2025

What is the problem the feature request solves?

I would like to start implementing some shared test code for array expressions to make it easier to test for edge cases such as null or empty arrays.

The following functions need improved tests:

  • array_remove
  • array_contains
  • array_append
  • array_intersect
  • array_join
  • array_repeat
  • array_compact
  • arrays_overlap

Describe the potential solution

No response

Additional context

No response

@andygrove
Copy link
Member Author

We should also start testing with all supported data types for the array expressions.

@andygrove
Copy link
Member Author

Here is an example of an issue that we should have been able to catch using fuzz testing:

#1307

@NoeB
Copy link
Contributor

NoeB commented Jan 19, 2025

Would it help to extend makeParquetFileAllTypes or create a new method which creates a parquet file that contains more complex structures like lists (with all datatypes and structs because the current tests usually test the implementation with arrays only containing 1-3 elements) and structs to improve the tests with additional test cases?

However I think that requires complex type support for the comet native parquet reader.

@andygrove
Copy link
Member Author

Would it help to extend makeParquetFileAllTypes or create a new method which creates a parquet file that contains more complex structures like lists (with all datatypes and structs because the current tests usually test the implementation with arrays only containing 1-3 elements) and structs to improve the tests with additional test cases?

However I think that requires complex type support for the comet native parquet reader.

Yes, exactly. I have started down this path in #1308

As you said, until we have complex type support we cannot generate Parquet files containing complex types. For now I had to work around this by generating a Parquet file with primitive types and then use the array expression in SQL to create arrays to pass into array_remove.

Complex type support for Parquet is coming very soon though.

I did try disabling native Parquet scans and use the COMET_CONVERT_FROM_PARQUET_ENABLED feature, but this failed with an error that I need to investigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants