Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Capture nullability in Values node planning #14472

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

rkrishn7
Copy link
Contributor

@rkrishn7 rkrishn7 commented Feb 4, 2025

Which issue does this PR close?

What changes are included in this PR?

Includes field nullability from table schema when planning a Values node during INSERT INTO ... VALUES

Are these changes tested?

Existing tests have been updated, but it would be great to figure out additional tests to add.

Are there any user-facing changes?

N/A

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt) labels Feb 4, 2025
@alamb alamb marked this pull request as draft February 4, 2025 18:46
@alamb
Copy link
Contributor

alamb commented Feb 4, 2025

Thank you @rkrishn7 -- I marked this PR as a draft as it seems a CI test is failing. Thank you!

@alamb
Copy link
Contributor

alamb commented Feb 4, 2025

FYI @gatesn

@rkrishn7
Copy link
Contributor Author

rkrishn7 commented Feb 4, 2025

Thanks @alamb. Will fix it up when I'm back from work!

@rkrishn7
Copy link
Contributor Author

rkrishn7 commented Feb 5, 2025

Rust / build with wasm-pack (pull_request) is currently failing, but looks like this is due to the recent upstream release of uuid. Should be fixed in #14494

@rkrishn7 rkrishn7 marked this pull request as ready for review February 5, 2025 08:00
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @rkrishn7 -- this looks good to me ❤️
i will restart the failed CI check to get a clean run for this PR

}

fn infer_inner(
mut values: Vec<Vec<Expr>>,
field_types: &[DataType],
field_nullabilities: &[bool],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way that you could do this is to pass in the fields

like

    fn infer_inner(
        mut values: Vec<Vec<Expr>>,
        field_types: &[&Field],
        schema: &DFSchema
    ) -> Result<Self> {

And then instead of making a new field

            .map(|(j, (data_type, nullable))| {
                // naming is following convention https://www.postgresql.org/docs/current/queries-values.html
                let name = &format!("column{}", j + 1);
                Field::new(name, data_type.clone(), *nullable)
            })

Do something like

            .map(|(j, field)| {
                // naming is following convention https://www.postgresql.org/docs/current/queries-values.html
                field.clone().with_name(format!("column{}", j + 1))
            })

Copy link
Contributor Author

@rkrishn7 rkrishn7 Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good call out!

We end up cloning the data type to construct the field, so I just created a small container which encapsulates the naming logic as well. Think it cleans things up a bit!

@@ -255,7 +255,7 @@ insert into table_without_values(id, id) values(3, 3);
statement error Arrow error: Cast error: Cannot cast string 'zoo' to value of Int64 type
insert into table_without_values(name, id) values(4, 'zoo');

statement error Error during planning: Column count doesn't match insert query!
statement error Error during planning: Inconsistent data length across values list: got 2 values in row 0 but expected 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem significantly better errors ❤️

@alamb
Copy link
Contributor

alamb commented Feb 5, 2025

I merged up from main to get a clean CI run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DataSink::write_all given invalid RecordBatchStream
2 participants