Refactor error handling to use boxed errors for DataFusionError variants #16672

Merged: 8 commits merged into apache:main on Jul 4, 2025

Conversation

kosiew commented Jul 3, 2025

Which issue does this PR close?

Rationale for this change

This change standardizes the internal representation of several DataFusionError variants by wrapping them in Box<T>. This:

  • Reduces the size of the DataFusionError enum, and with it the size of every Result<T, DataFusionError> that is passed around.
  • Enables more consistent and idiomatic error handling.
  • Complies with Clippy recommendations for large error enums.
  • Simplifies pattern matching and variant propagation logic.

What changes are included in this PR?

  • Refactored the DataFusionError enum to use Box<T> for:
    • ArrowError
    • ParquetError
    • AvroError
    • object_store::Error
    • ParserError
    • SchemaError
    • JoinError
  • Updated all relevant match arms and constructors to handle boxed errors.
  • Refactored error-related macros (arrow_datafusion_err!, sql_datafusion_err!, etc.) to use Box<T>.
  • Adjusted test cases and error assertions for boxed variants.
  • Updated the upgrade guide to explain the required changes and the rationale.
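
To illustrate the size effect of boxing a variant, here is a minimal sketch. It does not use DataFusion's real types: `BigError`, `InlineError`, and `BoxedError` are hypothetical stand-ins, with `BigError` playing the role of a large third-party error such as `ArrowError`.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a large external error type.
#[derive(Debug)]
pub struct BigError {
    pub message: String,
    /// Extra payload that only exists to make the type large.
    pub context: [u64; 16],
}

/// Before: the payload is stored inline, so the whole enum is at least
/// as large as its largest variant.
#[derive(Debug)]
pub enum InlineError {
    Big(BigError, Option<String>),
    Plan(String),
}

/// After: the large payload lives on the heap behind a pointer.
#[derive(Debug)]
pub enum BoxedError {
    Big(Box<BigError>, Option<String>),
    Plan(String),
}

/// A `From` impl keeps `?` working at call sites after the variant is boxed.
impl From<BigError> for BoxedError {
    fn from(e: BigError) -> Self {
        BoxedError::Big(Box::new(e), None)
    }
}

/// Matching on a boxed variant: field access auto-dereferences the `Box`.
pub fn describe(e: &BoxedError) -> String {
    match e {
        BoxedError::Big(inner, _) => format!("big: {}", inner.message),
        BoxedError::Plan(msg) => format!("plan: {msg}"),
    }
}

/// Returns (size of the inline enum, size of the boxed enum) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<InlineError>(), size_of::<BoxedError>())
}
```

The trade-off is one heap allocation on the error path in exchange for a smaller `Result` on the success path, which is usually the path that matters.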

Are these changes tested?

Yes. These changes are covered by existing unit and integration tests, which validate correct error handling and propagation. Additional assertions were added where necessary to handle boxed variants.
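
An assertion against a boxed variant has to look through the `Box` to reach the inner error. The sketch below shows one way to do that; `InnerError`, `OuterError`, `divide`, and `is_divide_by_zero` are hypothetical names for illustration, not code from this PR.

```rust
/// Hypothetical inner error, standing in for something like `ArrowError`.
#[derive(Debug, PartialEq)]
pub enum InnerError {
    DivideByZero,
    Parse(String),
}

/// Hypothetical outer error whose large variant is boxed.
#[derive(Debug)]
pub enum OuterError {
    Inner(Box<InnerError>),
    Plan(String),
}

impl From<InnerError> for OuterError {
    fn from(e: InnerError) -> Self {
        OuterError::Inner(Box::new(e))
    }
}

/// Fails with a boxed `InnerError::DivideByZero` when `b` is zero.
pub fn divide(a: i64, b: i64) -> Result<i64, OuterError> {
    if b == 0 {
        return Err(InnerError::DivideByZero.into());
    }
    Ok(a / b)
}

/// Checks the inner error by dereferencing through the reference and the `Box`.
pub fn is_divide_by_zero(err: &OuterError) -> bool {
    match err {
        // `inner` is `&Box<InnerError>`, so `**inner` is the `InnerError` itself.
        OuterError::Inner(inner) => matches!(**inner, InnerError::DivideByZero),
        _ => false,
    }
}
```

In a test, a check like `assert!(is_divide_by_zero(&divide(1, 0).unwrap_err()))` would replace an assertion that previously matched the inner error directly.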

Are there any user-facing changes?

Error messages remain unchanged from the user's perspective. However, the affected variants now hold a Box<T>, so downstream code that constructs or pattern-matches on them directly must be updated; the upgrade guide describes the required changes.

github-actions bot added the sql, logical-expr, core, catalog, common, and datasource labels on Jul 3, 2025

comphead commented Jul 3, 2025

cc @crepererum
There is another PR pending for SchemaError #16653

alamb left a comment

This is amazing -- thank you @kosiew

I recommend:

  1. Adding a test to ensure the size of DataFusionError does not grow (similar to the one added by @crepererum in #16653)
  2. Adding a note to the upgrade guide (in #16673)

github-actions bot added the documentation label on Jul 4, 2025

kosiew commented Jul 4, 2025

@alamb

> Adding a test to ensure the size of DataFusionError does not get larger (similar to the one added by @crepererum in #16653 )
> Add a note in the upgrade guide (in #16673)

Added test and upgrade note.
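
A size guard of the kind discussed above could look roughly like the following sketch. `SketchError` and the `MAX_ERROR_SIZE` budget are hypothetical and are not the actual test or limit added in this PR; the budget assumes a 64-bit target.

```rust
use std::mem::size_of;

/// Hypothetical error enum in which every large payload is boxed.
#[derive(Debug)]
pub enum SketchError {
    External(Box<dyn std::error::Error + Send + Sync>),
    Io(Box<std::io::Error>),
    Plan(String),
}

/// Hypothetical size budget in bytes (assumes 64-bit pointers).
pub const MAX_ERROR_SIZE: usize = 32;

/// Returns true while the enum stays within the budget. Adding a large
/// unboxed variant would make this return false.
pub fn error_size_within_budget() -> bool {
    size_of::<SketchError>() <= MAX_ERROR_SIZE
}
```

Inside a crate this check would normally sit in a `#[test]` function, so that an oversized new variant fails CI instead of silently growing every `Result`.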

kosiew merged commit acf0bbe into apache:main on Jul 4, 2025
29 checks passed
alamb added the api change label on Jul 4, 2025