ESQL - Add K mandatory param for KNN function #129763

Merged
merged 21 commits into from
Jul 2, 2025
d25776a
Add k as a param
carlosdelest Jun 20, 2025
bdfc1c2
Add testing
carlosdelest Jun 20, 2025
ab1943e
Add CSV tests
carlosdelest Jun 20, 2025
5d401fc
Remove knnSearchWithKOption as it makes no sense now
carlosdelest Jun 20, 2025
3522afb
Check k is a constant
carlosdelest Jun 20, 2025
3dac389
Fix ITs
carlosdelest Jun 20, 2025
0e9f2e6
Spotless
carlosdelest Jun 20, 2025
6a3d6f7
Fix tests
carlosdelest Jun 20, 2025
2d454ca
Fix capability for multi cluster tests
carlosdelest Jun 20, 2025
de8c564
Merge remote-tracking branch 'elasticsearch/main' into non-issue/knn-…
ioanatia Jun 26, 2025
d54ce01
muted-tests merge conflicts
ioanatia Jun 26, 2025
c659629
Merge branch 'main' into non-issue/knn-k-param
ioanatia Jun 26, 2025
123f2e4
Merge branch 'main' into non-issue/knn-k-param
ioanatia Jun 27, 2025
a6faf49
Removed k param from serialization to avoid TransportVersions change
carlosdelest Jul 1, 2025
94f1524
Merge remote-tracking branch 'origin/main' into non-issue/knn-k-param
carlosdelest Jul 1, 2025
2c864e9
Merge remote-tracking branch 'carlosdelest/non-issue/knn-k-param' int…
carlosdelest Jul 1, 2025
f4356a2
Fix merge
carlosdelest Jul 1, 2025
4c94b4a
Fix error for multi cluster tests, where k can already have been rewr…
carlosdelest Jul 1, 2025
28ef9c3
Merge branch 'main' into non-issue/knn-k-param
carlosdelest Jul 2, 2025
9f3ff2f
Merge remote-tracking branch 'origin/main' into non-issue/knn-k-param
carlosdelest Jul 2, 2025
dfa0c3c
Merge branch 'main' into non-issue/knn-k-param
carlosdelest Jul 2, 2025
41 changes: 35 additions & 6 deletions muted-tests.yml
@@ -236,6 +236,9 @@ tests:
- class: org.elasticsearch.packaging.test.DockerTests
method: test012SecurityCanBeDisabled
issue: https://github.com/elastic/elasticsearch/issues/116636
- class: org.elasticsearch.index.shard.StoreRecoveryTests
Review comment (Contributor): Bad merge? Same as the other ones that are added?

Reply (Member Author): 🤦 Ouch. Sorry about that. Opened #130523 to fix.

method: testAddIndices
issue: https://github.com/elastic/elasticsearch/issues/124104
- class: org.elasticsearch.smoketest.MlWithSecurityIT
method: test {yaml=ml/data_frame_analytics_crud/Test get stats on newly created config}
issue: https://github.com/elastic/elasticsearch/issues/121726
@@ -455,6 +458,12 @@ tests:
- class: org.elasticsearch.packaging.test.DockerTests
method: test073RunEsAsDifferentUserAndGroupWithoutBindMounting
issue: https://github.com/elastic/elasticsearch/issues/128996
- class: org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT
method: test {p0=upgraded_cluster/70_ilm/Test Lifecycle Still There And Indices Are Still Managed}
issue: https://github.com/elastic/elasticsearch/issues/129097
- class: org.elasticsearch.upgrades.UpgradeClusterClientYamlTestSuiteIT
method: test {p0=upgraded_cluster/90_ml_data_frame_analytics_crud/Get mixed cluster outlier_detection job}
issue: https://github.com/elastic/elasticsearch/issues/129098
- class: org.elasticsearch.packaging.test.DockerTests
method: test081SymlinksAreFollowedWithEnvironmentVariableFiles
issue: https://github.com/elastic/elasticsearch/issues/128867
@@ -473,21 +482,27 @@ tests:
- class: org.elasticsearch.entitlement.runtime.policy.FileAccessTreeTests
method: testWindowsAbsolutPathAccess
issue: https://github.com/elastic/elasticsearch/issues/129168
- class: org.elasticsearch.xpack.esql.qa.multi_node.EsqlSpecIT
Review comment (Member Author): This test is removed in this PR, as it's already covered by all the other tests.

method: test {knn-function.KnnSearchWithKOption ASYNC}
issue: https://github.com/elastic/elasticsearch/issues/129447
- class: org.elasticsearch.xpack.ml.integration.ClassificationIT
method: testWithDatastreams
issue: https://github.com/elastic/elasticsearch/issues/129457
- class: org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceDiskSpaceTests
method: testMergeTasksAreUnblockedWhenMoreDiskSpaceBecomesAvailable
issue: https://github.com/elastic/elasticsearch/issues/129296
- class: org.elasticsearch.xpack.security.PermissionsIT
method: testCanManageIndexWithNoPermissions
issue: https://github.com/elastic/elasticsearch/issues/129471
- class: org.elasticsearch.xpack.security.PermissionsIT
method: testCanManageIndexAndPolicyDifferentUsers
issue: https://github.com/elastic/elasticsearch/issues/129479
- class: org.elasticsearch.xpack.security.PermissionsIT
method: testCanViewExplainOnUnmanagedIndex
issue: https://github.com/elastic/elasticsearch/issues/129480
- class: org.elasticsearch.xpack.profiling.action.GetStatusActionIT
method: testWaitsUntilResourcesAreCreated
issue: https://github.com/elastic/elasticsearch/issues/129486
- class: org.elasticsearch.xpack.esql.qa.multi_node.EsqlSpecIT
method: test {knn-function.KnnSearchWithKOption SYNC}
issue: https://github.com/elastic/elasticsearch/issues/129512
- class: org.elasticsearch.xpack.security.PermissionsIT
method: testWhenUserLimitedByOnlyAliasOfIndexCanWriteToIndexWhichWasRolledoverByILMPolicy
issue: https://github.com/elastic/elasticsearch/issues/129481
- class: org.elasticsearch.index.engine.ThreadPoolMergeExecutorServiceTests
method: testIORateIsAdjustedForAllRunningMergeTasks
issue: https://github.com/elastic/elasticsearch/issues/129531
@@ -503,15 +518,24 @@ tests:
- class: org.elasticsearch.search.query.VectorIT
method: testFilteredQueryStrategy
issue: https://github.com/elastic/elasticsearch/issues/129517
- class: org.elasticsearch.snapshots.SnapshotShutdownIT
method: testSnapshotShutdownProgressTracker
issue: https://github.com/elastic/elasticsearch/issues/129752
- class: org.elasticsearch.xpack.security.SecurityRolesMultiProjectIT
method: testUpdatingFileBasedRoleAffectsAllProjects
issue: https://github.com/elastic/elasticsearch/issues/129775
- class: org.elasticsearch.qa.verify_version_constants.VerifyVersionConstantsIT
method: testLuceneVersionConstant
issue: https://github.com/elastic/elasticsearch/issues/125638
- class: org.elasticsearch.index.store.FsDirectoryFactoryTests
method: testPreload
issue: https://github.com/elastic/elasticsearch/issues/129852
- class: org.elasticsearch.xpack.rank.rrf.RRFRankClientYamlTestSuiteIT
method: test {yaml=rrf/950_pinned_interaction/rrf with pinned retriever as a sub-retriever}
issue: https://github.com/elastic/elasticsearch/issues/129845
- class: org.elasticsearch.xpack.test.rest.XPackRestIT
method: test {p0=esql/60_usage/Basic ESQL usage output (telemetry) non-snapshot version}
issue: https://github.com/elastic/elasticsearch/issues/129888
- class: org.elasticsearch.gradle.internal.InternalDistributionBwcSetupPluginFuncTest
method: "builds distribution from branches via archives extractedAssemble [bwcDistVersion: 8.2.1, bwcProject: bugfix, expectedAssembleTaskName:
extractedAssemble, #2]"
@@ -525,9 +549,14 @@ tests:
- class: org.elasticsearch.xpack.esql.qa.multi_node.GenerativeIT
method: test
issue: https://github.com/elastic/elasticsearch/issues/130067
- class: geoip.GeoIpMultiProjectIT
issue: https://github.com/elastic/elasticsearch/issues/130073
- class: org.elasticsearch.xpack.esql.qa.single_node.GenerativeIT
method: test
issue: https://github.com/elastic/elasticsearch/issues/130067
- class: org.elasticsearch.xpack.esql.action.EnrichIT
method: testTopN
issue: https://github.com/elastic/elasticsearch/issues/130122
- class: org.elasticsearch.action.support.ThreadedActionListenerTests
method: testRejectionHandling
issue: https://github.com/elastic/elasticsearch/issues/130129

@@ -3,11 +3,11 @@
# top-n query at the shard level

knnSearch
-required_capability: knn_function
+required_capability: knn_function_v2

// tag::knn-function[]
from colors metadata _score
-| where knn(rgb_vector, [0, 120, 0])
+| where knn(rgb_vector, [0, 120, 0], 10)
| sort _score desc, color asc
// end::knn-function[]
| keep color, rgb_vector
@@ -29,31 +29,12 @@ chartreuse | [127.0, 255.0, 0.0]
// end::knn-function-result[]
;

-knnSearchWithKOption
Review comment (Member Author): Removed test - k is already added to all other tests
-required_capability: knn_function
-
-// tag::knn-function-options[]
-from colors metadata _score
-| where knn(rgb_vector, [0,255,255], {"k": 4})
-| sort _score desc, color asc
-// end::knn-function-options[]
-| keep color, rgb_vector
-| limit 4
-;
-
-color:text | rgb_vector:dense_vector
-cyan | [0.0, 255.0, 255.0]
-turquoise | [64.0, 224.0, 208.0]
-aqua marine | [127.0, 255.0, 212.0]
-teal | [0.0, 128.0, 128.0]
-;
-
-# https://github.com/elastic/elasticsearch/issues/129550
+# https://github.com/elastic/elasticsearch/issues/129550 - Add as an example to knn function documentation
knnSearchWithSimilarityOption-Ignore
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
-| where knn(rgb_vector, [255,192,203], {"k": 140, "similarity": 40})
+| where knn(rgb_vector, [255,192,203], 140, {"similarity": 40})
| sort _score desc, color asc
| keep color, rgb_vector
;
@@ -63,14 +44,13 @@ pink | [255.0, 192.0, 203.0]
peach puff | [255.0, 218.0, 185.0]
bisque | [255.0, 228.0, 196.0]
wheat | [245.0, 222.0, 179.0]

;

knnHybridSearch
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
-| where match(color, "blue") or knn(rgb_vector, [65,105,225], {"k": 140})
+| where match(color, "blue") or knn(rgb_vector, [65,105,225], 140)
| where primary == true
| sort _score desc, color asc
| keep color, rgb_vector
@@ -90,10 +70,10 @@ yellow | [255.0, 255.0, 0.0]
;

knnWithMultipleFunctions
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
-| where knn(rgb_vector, [128,128,0], {"k": 140}) and match(color, "olive")
+| where knn(rgb_vector, [128,128,0], 140) and match(color, "olive")
| sort _score desc, color asc
| keep color, rgb_vector
;
@@ -103,11 +83,11 @@ olive | [128.0, 128.0, 0.0]
;

knnAfterKeep
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
| keep rgb_vector, color, _score
-| where knn(rgb_vector, [128,255,0], {"k": 140})
+| where knn(rgb_vector, [128,255,0], 140)
| sort _score desc, color asc
| keep rgb_vector
| limit 5
@@ -122,11 +102,11 @@ rgb_vector:dense_vector
;

knnAfterDrop
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
| drop primary
-| where knn(rgb_vector, [128,250,0], {"k": 140})
+| where knn(rgb_vector, [128,250,0], 140)
| sort _score desc, color asc
| keep color, rgb_vector
| limit 5
@@ -141,11 +121,11 @@ lime | [0.0, 255.0, 0.0]
;

knnAfterEval
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
| eval composed_name = locate(color, " ") > 0
-| where knn(rgb_vector, [128,128,0], {"k": 140})
+| where knn(rgb_vector, [128,128,0], 140)
| sort _score desc, color asc
| keep color, composed_name
| limit 5
@@ -160,11 +140,11 @@ golden rod | true
;

knnWithConjunction
-required_capability: knn_function
+required_capability: knn_function_v2

# TODO We need kNN prefiltering here so we get more candidates that pass the filter
from colors metadata _score
-| where knn(rgb_vector, [255,255,238], {"k": 140}) and hex_code like "#FFF*"
+| where knn(rgb_vector, [255,255,238], 140) and hex_code like "#FFF*"
| sort _score desc, color asc
| keep color, hex_code, rgb_vector
| limit 10
@@ -181,11 +161,11 @@ yellow | #FFFF00 | [255.0, 255.0, 0.0]
;

knnWithDisjunctionAndFiltersConjunction
-required_capability: knn_function
+required_capability: knn_function_v2

# TODO We need kNN prefiltering here so we get more candidates that pass the filter
from colors metadata _score
-| where (knn(rgb_vector, [0,255,255], {"k": 140}) or knn(rgb_vector, [128, 0, 255], {"k": 140})) and primary == true
+| where (knn(rgb_vector, [0,255,255], 140) or knn(rgb_vector, [128, 0, 255], 140)) and primary == true
| keep color, rgb_vector, _score
| sort _score desc, color asc
| drop _score
@@ -205,11 +185,11 @@ yellow | [255.0, 255.0, 0.0]
;

knnWithNonPushableConjunction
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
| eval composed_name = locate(color, " ") > 0
-| where knn(rgb_vector, [128,128,0], {"k": 140}) and composed_name == false
+| where knn(rgb_vector, [128,128,0], 140) and composed_name == false
| sort _score desc, color asc
| keep color, composed_name
| limit 10
@@ -230,10 +210,10 @@ maroon | false

# https://github.com/elastic/elasticsearch/issues/129550
testKnnWithNonPushableDisjunctions-Ignore
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
-| where knn(rgb_vector, [128,128,0], {"k": 140, "similarity": 30}) or length(color) > 10
+| where knn(rgb_vector, [128,128,0], 140, {"similarity": 30}) or length(color) > 10
| sort _score desc, color asc
| keep color
;
@@ -247,10 +227,10 @@ papaya whip

# https://github.com/elastic/elasticsearch/issues/129550
testKnnWithNonPushableDisjunctionsOnComplexExpressions-Ignore
-required_capability: knn_function
+required_capability: knn_function_v2

from colors metadata _score
-| where (knn(rgb_vector, [128,128,0], {"k": 140, "similarity": 70}) and length(color) < 10) or (knn(rgb_vector, [128,0,128], {"k": 140, "similarity": 60}) and primary == false)
+| where (knn(rgb_vector, [128,128,0], 140, {"similarity": 70}) and length(color) < 10) or (knn(rgb_vector, [128,0,128], 140, {"similarity": 60}) and primary == false)
| sort _score desc, color asc
| keep color, primary
;
@@ -262,24 +242,24 @@ indigo | false
;

testKnnInStatsNonPushable
-required_capability: knn_function
+required_capability: knn_function_v2

from colors
| where length(color) < 10
-| stats c = count(*) where knn(rgb_vector, [128,128,255], {"k": 140})
+| stats c = count(*) where knn(rgb_vector, [128,128,255], 140)
;

c: long
50
;

testKnnInStatsWithGrouping
-required_capability: knn_function
+required_capability: knn_function_v2
required_capability: full_text_functions_in_stats_where

from colors
| where length(color) < 10
-| stats c = count(*) where knn(rgb_vector, [128,128,255], {"k": 140}) by primary
+| stats c = count(*) where knn(rgb_vector, [128,128,255], 140) by primary
;

c: long | primary: boolean
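
The spec changes above all follow one pattern: `k` moves out of the options map and becomes the third positional argument, while remaining options such as `similarity` are passed as an optional fourth argument. Below is a minimal sketch of a helper that renders both call shapes, in the style of the `String.format` calls used by the Java tests in this PR. `KnnCallSketch` and its methods are hypothetical names, not part of Elasticsearch, and only numeric option values are handled.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; it is not part of Elasticsearch.
final class KnnCallSketch {
    private KnnCallSketch() {}

    // Renders knn(field, vector, k): k is the third positional argument.
    static String knn(String field, int[] queryVector, int k) {
        return knn(field, queryVector, k, Map.<String, Number>of());
    }

    // Renders knn(field, vector, k, {options}); the options map is appended only when non-empty.
    // Options are emitted in the map's iteration order.
    static String knn(String field, int[] queryVector, int k, Map<String, ? extends Number> options) {
        String call = String.format(Locale.ROOT, "knn(%s, %s, %d", field, Arrays.toString(queryVector), k);
        if (options.isEmpty()) {
            return call + ")";
        }
        String renderedOptions = options.entrySet()
            .stream()
            .map(e -> "\"" + e.getKey() + "\": " + e.getValue())
            .collect(Collectors.joining(", ", "{", "}"));
        return call + ", " + renderedOptions + ")";
    }
}
```

For example, `KnnCallSketch.knn("rgb_vector", new int[] {255, 192, 203}, 140, Map.of("similarity", 40))` renders `knn(rgb_vector, [255, 192, 203], 140, {"similarity": 40})`, matching the updated `knnSearchWithSimilarityOption-Ignore` query.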

@@ -39,7 +39,7 @@ public void testKnnDefaults() {

var query = String.format(Locale.ROOT, """
FROM test METADATA _score
-| WHERE knn(vector, %s)
+| WHERE knn(vector, %s, 10)
| KEEP id, floats, _score, vector
| SORT _score DESC
""", Arrays.toString(queryVector));
@@ -73,7 +73,7 @@ public void testKnnOptions() {

var query = String.format(Locale.ROOT, """
FROM test METADATA _score
-| WHERE knn(vector, %s, {"k": 5})
+| WHERE knn(vector, %s, 5)
| KEEP id, floats, _score, vector
| SORT _score DESC
""", Arrays.toString(queryVector));
@@ -94,7 +94,7 @@ public void testKnnNonPushedDown() {
// TODO we need to decide what to do when / if user uses k for limit, as no more than k results will be returned from knn query
var query = String.format(Locale.ROOT, """
FROM test METADATA _score
-| WHERE knn(vector, %s, {"k": 5}) OR id > 10
+| WHERE knn(vector, %s, 5) OR id > 10
| KEEP id, floats, _score, vector
| SORT _score DESC
""", Arrays.toString(queryVector));
@@ -111,7 +111,7 @@ public void testKnnNonPushedDown() {

@Before
public void setup() throws IOException {
-assumeTrue("Needs KNN support", EsqlCapabilities.Cap.KNN_FUNCTION.isEnabled());
+assumeTrue("Needs KNN support", EsqlCapabilities.Cap.KNN_FUNCTION_V2.isEnabled());

var indexName = "test";
var client = client().admin().indices();
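
The TODO in `testKnnNonPushedDown` notes that a knn query returns no more than `k` results. As a rough illustration of what `k` bounds, here is a brute-force nearest-neighbour sketch in plain Java. It is not how Elasticsearch executes `knn`: the class and method names are hypothetical, and squared Euclidean distance is an assumption chosen for simplicity. It only shows `k` acting as a cap on the number of matches.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical brute-force illustration; Elasticsearch does not execute knn this way.
final class BruteForceKnnSketch {
    private BruteForceKnnSketch() {}

    // Squared Euclidean distance between two vectors of equal length (assumed metric).
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Returns the indices of at most k vectors, ordered from nearest to farthest from the query.
    static List<Integer> nearest(double[][] vectors, double[] query, int k) {
        return IntStream.range(0, vectors.length)
            .boxed()
            .sorted(Comparator.comparingDouble((Integer i) -> squaredDistance(vectors[i], query)))
            .limit(k)
            .collect(Collectors.toList());
    }
}
```

With four stored vectors, asking for `k = 2` yields two indices, and asking for `k = 10` yields all four: `k` is an upper bound, not a guaranteed count.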

@@ -1195,7 +1195,7 @@ public enum Cap {
/**
* Support knn function
*/
-KNN_FUNCTION(Build.current().isSnapshot()),
+KNN_FUNCTION_V2(Build.current().isSnapshot()),

LIKE_WITH_LIST_OF_PATTERNS,


@@ -259,7 +259,7 @@ private static List<NamedWriteableRegistry.Entry> fullText() {
}

private static List<NamedWriteableRegistry.Entry> vector() {
-if (EsqlCapabilities.Cap.KNN_FUNCTION.isEnabled()) {
+if (EsqlCapabilities.Cap.KNN_FUNCTION_V2.isEnabled()) {
return List.of(Knn.ENTRY);
}
return List.of();
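
The rename to `KNN_FUNCTION_V2` acts as a gate in two places in this diff: spec tests declare `required_capability: knn_function_v2`, and `vector()` registers `Knn.ENTRY` only when the capability is enabled. A simplified, self-contained sketch of that pattern follows. `CapabilityGateSketch`, the `SNAPSHOT` constant (standing in for `Build.current().isSnapshot()`), the string entry (standing in for `Knn.ENTRY`), and the lowercase name mapping are assumptions for illustration, not the Elasticsearch implementation.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified stand-in for the capability gating shown in the diff.
final class CapabilityGateSketch {
    // Stand-in for Build.current().isSnapshot(); hard-coded so the sketch is self-contained.
    static final boolean SNAPSHOT = true;

    enum Cap {
        // Enabled only on snapshot builds, like the renamed capability in the diff.
        KNN_FUNCTION_V2(SNAPSHOT),
        // Capabilities declared without an argument are always enabled in this sketch.
        LIKE_WITH_LIST_OF_PATTERNS;

        private final boolean enabled;

        Cap() {
            this(true);
        }

        Cap(boolean enabled) {
            this.enabled = enabled;
        }

        boolean isEnabled() {
            return enabled;
        }

        // Assumed mapping from the enum constant to the name used in required_capability lines.
        String capabilityName() {
            return name().toLowerCase(Locale.ROOT);
        }
    }

    private CapabilityGateSketch() {}

    // Mirrors the shape of vector(): register the entry only when the capability is enabled.
    static List<String> vectorEntries() {
        if (Cap.KNN_FUNCTION_V2.isEnabled()) {
            return List.of("Knn");
        }
        return List.of();
    }
}
```

Renaming the capability rather than reusing `knn_function` means a test that requires `knn_function_v2` is skipped on nodes that only advertise the old name, which matters for the multi-cluster tests mentioned in the commit list.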