HIVE-29181: SELECT query on VIEW with IS NOT NULL operator producing unexpected result #6060

mdayakar · 2025-09-04T16:31:43Z

HIVE-29181: SELECT query on VIEW with IS NOT NULL operator producing unexpected result

What changes were proposed in this pull request?

When data is selected from a view with the filter condition ((1008753865)/(0)) IS NOT NULL, which evaluates to FALSE, the query should return no rows, but it actually returns rows. The same filter condition on a table works correctly and returns no rows.

This happens because HiveFilterProjectTransposeRule, the first rule in the list, is applied to the view query and the filter condition ends up being removed, so rows are returned. For the table query, ReduceExpressionsRule.FilterReduceExpressionsRule is applied and the plan is changed to HiveValues(tuples=[[]]), so no rows are fetched.

To fix the issue, ReduceExpressionsRule.FilterReduceExpressionsRule can be added before HiveFilterProjectTransposeRule so that it is also applied to the view query and produces the correct output.
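
As a rough illustration of why the reduced plan is empty, the following toy Python sketch mimics the constant reduction described above. It is not Hive or Calcite code: the function names and the plan strings are hypothetical, and it assumes that division by zero yields NULL, which is consistent with the statement above that the condition evaluates to FALSE.

```python
# Toy model of reducing a constant filter condition; all names are illustrative.

def divide(a, b):
    # Assumption: division by zero (or a NULL operand) yields NULL, modeled as None.
    if a is None or b is None or b == 0:
        return None
    return a / b

def is_not_null(x):
    # IS NOT NULL never returns NULL itself: it is always TRUE or FALSE.
    return x is not None

def reduce_filter(condition_value, child_plan):
    # A filter whose condition is constant FALSE or NULL passes no rows,
    # so the whole subtree can be replaced by an empty set of tuples.
    if condition_value is not True:
        return "Values(tuples=[])"
    # A constant TRUE condition means the filter is redundant.
    return child_plan

condition = is_not_null(divide(1008753865, 0))
plan = reduce_filter(condition, "TableScan(t0)")
```

In this sketch `condition` is `False`, so `plan` is the empty-values placeholder, mirroring the HiveValues(tuples=[[]]) plan reported for the table query.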

Why are the changes needed?

To fix the issue mentioned in HIVE-29181

Does this PR introduce any user-facing change?

No

How was this patch tested?

Using q file tests
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=view_with_where_exp.q -pl itests/qtest -Pitests

github-actions · 2025-09-15T08:27:10Z

@check-spelling-bot Report

🔴 Please review

See the files view or the action log for details.

Unrecognized words (3)

bucketedtables
languagemanual
teradatabinaryserde

Previously acknowledged words that are now absent

aarry bytecode cwiki HIVEFETCHOUTPUTSERDE timestamplocal yyyy

To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the git@github.com:mdayakar/hive.git repository
on the HIVE-29181_SelectViewIssue branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spelling/expect.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
-H "Content-Type: application/json" \
"https://api.github.com/repos/apache/hive/issues/comments/3291011158" > "$comment_json"
comment_body=$(mktemp)
jq -r ".body // empty" "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")

patch_add=$(perl -e '$/=undef; $_=<>; if (m{Unrecognized words[^<]*</summary>\n*```\n*([^<]*)```\n*</details>$}m) { print "$1" } elsif (m{Unrecognized words[^<]*\n\n((?:\w.*\n)+)\n}m) { print "$1" };' < "$comment_body")

update_files
rm $comment_body
git add -u

If the flagged items do not appear to be text

If items relate to a ...

well-formed pattern.

If you can write a pattern that would match it,
try adding it to the patterns.txt file.

Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

Note that patterns can't match multiline strings.

binary file.

Please add a file path to the excludes.txt file matching the containing file.

File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

sonarqubecloud · 2025-09-15T11:09:09Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

thomasrebele · 2025-09-17T08:39:18Z

ql/src/test/queries/clientpositive/view_with_where_exp.q

Could we simplify the test case, please? E.g., by removing unnecessary columns, using simpler constants. The following test case reproduces the issue:

CREATE TABLE IF NOT EXISTS t0(col DOUBLE);
CREATE VIEW v0 AS (SELECT ALL (t0.col) AS col FROM t0);
INSERT INTO t0(col) VALUES(0.1);
explain cbo SELECT t0.col from t0 WHERE ((1000)/(0)) IS NOT NULL;
SELECT t0.col from t0 WHERE ((1000)/(0)) IS NOT NULL;
explain cbo SELECT v0.col FROM v0 WHERE ((1000)/(0)) IS NOT NULL;
SELECT v0.col FROM v0 WHERE ((1000)/(0)) IS NOT NULL;

I am also curious to know if the datatype (i.e., DOUBLE) is important for reproducing the problem. If not, I would probably pick something more straightforward, like INT or STRING.

thomasrebele · 2025-09-17T08:50:30Z

ql/src/test/results/clientpositive/llap/explainuser_1.q.out

      Reducer 3 llap
      File Output Operator [FS_16]
-        Select Operator [SEL_15] (rows=24 width=100)
+        Select Operator [SEL_15] (rows=21 width=100)


I guess the row counts are just estimates, not the real row count of the executed query?

zabetak · 2025-09-17T12:54:32Z

The description mentions that the filter condition is removed at some point. Who does the removal and why? It feels wrong to remove a filter that is always false so I want to understand a bit better what happens.

zabetak · 2025-09-17T12:56:44Z

ql/src/test/queries/clientpositive/view_with_where_exp.q

I am also curious to know if the datatype (i.e., DOUBLE) is important for reproducing the problem. If not, I would probably pick something more straightforward, like INT or STRING.

zabetak · 2025-09-17T13:04:00Z

ql/src/test/results/clientpositive/llap/ppd_gby_join.q.out

                TableScan
                  alias: src
-                  filterExpr: (((value < 'val_50') or (key > '2')) and (((key > '20') and (key < '4')) or ((key > '4') and (key < '400')))) (type: boolean)
+                  filterExpr: (((value < 'val_50') or key is not null) and (((key > '20') and (key < '4')) or ((key > '4') and (key < '400')))) (type: boolean)


Is the simplification of key > '2' to key is not null valid?
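
One way to probe this question is a brute-force check over sample values. The Python sketch below is only an aid for reasoning, not Hive code: it models SQL three-valued logic with None as NULL, assumes string comparison is lexicographic (as Python's is for these ASCII values), and uses sample keys and values chosen purely for illustration. It compares the two complete filterExpr predicates from the diff as filters (a row passes only when the predicate is TRUE) and separately checks the rewrite taken out of context.

```python
from itertools import product

# Three-valued logic helpers; None stands for SQL NULL.
def lt(a, b):
    return None if a is None or b is None else a < b

def gt(a, b):
    return None if a is None or b is None else a > b

def and3(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def key_range(key):
    # Second conjunct, identical in both versions of the predicate.
    return or3(and3(gt(key, '20'), lt(key, '4')),
               and3(gt(key, '4'), lt(key, '400')))

def before(key, value):
    return and3(or3(lt(value, 'val_50'), gt(key, '2')), key_range(key))

def after(key, value):
    return and3(or3(lt(value, 'val_50'), key is not None), key_range(key))

# Illustrative sample domain (an assumption, not the real test data).
keys = [None] + [str(i) for i in range(500)]
values = [None, 'val_0', 'val_50', 'val_9']

# Rows on which the two full predicates disagree as filters.
mismatches = [(k, v) for k, v in product(keys, values)
              if (before(k, v) is True) != (after(k, v) is True)]

# Keys on which the isolated rewrite (key > '2' vs. key is not null) disagrees.
standalone = [k for k in keys if (gt(k, '2') is True) != (k is not None)]
```

On this sample, `mismatches` is empty while `standalone` is not (it contains '1', for example). That suggests the rewrite is not valid in isolation but may be acceptable here because the second conjunct already forces key > '20' or key > '4', both of which imply key > '2' under lexicographic comparison. The sketch says nothing about how the planner actually derived the simplification.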

zabetak · 2025-09-17T13:09:03Z

ql/src/test/results/clientpositive/llap/subquery_notin.q.out

+                      Filter Operator
+                        predicate: UDFToDouble(_col0) is not null (type: boolean)


Rather minor but it seems that now some filter operators cannot be merged together.

HIVE-29181: SELECT query on VIEW with IS NOT NULL operator producing unexpected result

a123c4c

asf-ci-hive added tests pending tests unstable and removed tests pending labels Sep 4, 2025

Fixed test failures

aa1fc78

asf-ci-hive added tests pending tests unstable and removed tests unstable tests pending labels Sep 11, 2025

Fixed test failures

fc84c7b

asf-ci-hive added tests pending and removed tests unstable labels Sep 15, 2025

asf-ci-hive added tests passed and removed tests pending labels Sep 15, 2025

thomasrebele reviewed Sep 17, 2025

zabetak reviewed Sep 17, 2025

4 participants
