Skip to content

Conversation

@AryanBagade
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

As per the Issue #19039, many examples of verbose CreateExternalTable initialization were found throughout the codebase. The current approach requires specifying all 14 fields even when most use default values, making it:

  1. Verbose (most fields are defaults)
  2. Hard to see which fields are actually overridden vs defaults

This PR implements a builder API to address these issues.

What changes are included in this PR?

  • Added CreateExternalTableBuilder with required parameters enforced at compile-time
  • Constructor takes required fields: builder(name, location, file_type, schema)
  • Optional fields can be set via .with_*() methods (following DataFusion's builder conventions)
  • Updated all usages across 5 files to use the new builder API
  • Reduced typical usage from 15 lines to 4-6 lines

Example transformation:
Before

// Before (15 lines)
CreateExternalTable {
    name, location, file_type: "parquet".to_string(),
    schema: Arc::new(DFSchema::empty()),
    table_partition_cols: vec![], if_not_exists: false,
    or_replace: false, temporary: false, definition: None,
    order_exprs: vec![], unbounded: false,
    options: HashMap::new(), constraints: Default::default(),
    column_defaults: HashMap::new(),
}

After

  // After (4 lines)
  CreateExternalTable::builder(name, location, "parquet",
  schema)
      .build()

Are these changes tested?

Yup, all existing tests pass:

  • cargo test -p datafusion-expr - DDL tests pass
  • cargo test -p datafusion-catalog - Catalog tests pass
  • cargo test -p datafusion - Core tests pass
  • All modified usage sites are in existing test files

Are there any user-facing changes?

No breaking changes.

The builder API is new one and its for internal code cleanup and reducing redundancy and verboseness

Implements apache#19039

This PR adds a builder pattern for CreateExternalTable that reduces code verbosity and improves readability.
@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions core Core DataFusion crate catalog Related to the catalog crate proto Related to proto crate labels Dec 3, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @AryanBagade -- this looks really nice to me 🙏 ❤️

@AryanBagade AryanBagade requested a review from alamb December 3, 2025 23:54
@AryanBagade
Copy link
Contributor Author

@alamb The CI failure was due to the merge from main bringing in new tests (lines 502, 541) that still used the old CreateExternalTable { ... } pattern. I've updated those to use the builder as well.

@alamb
Copy link
Contributor

alamb commented Dec 4, 2025

CI failure is due to

Should be fixed soon

@alamb
Copy link
Contributor

alamb commented Dec 5, 2025

@alamb alamb added this pull request to the merge queue Dec 5, 2025
@alamb
Copy link
Contributor

alamb commented Dec 5, 2025

Thank you @AryanBagade 🙏

Merged via the queue into apache:main with commit 2b05b09 Dec 5, 2025
30 of 31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

catalog Related to the catalog crate core Core DataFusion crate logical-expr Logical plan and expressions proto Related to proto crate sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add builder API for CreateExternalTable to reduce redundancy and make the code easier to understand

2 participants