-
Notifications
You must be signed in to change notification settings - Fork 3.6k
[feature](search) add another two params for search #57312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for default field and default operator parameters to the search DSL functionality, enabling simplified query syntax where field names and boolean operators can be implied rather than explicitly specified.
Key Changes:
- Extended the
search()function to accept 1-3 parameters (previously only 1) - Added DSL expansion logic to automatically prefix field names when a default field is provided
- Implemented default operator support ("and"/"or") for multi-term queries
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| Search.java | Updated function signatures to support optional default_field and default_operator parameters; added getter methods for new parameters |
| SearchDslParser.java | Implemented DSL expansion logic including field prefix injection, operator normalization, and term tokenization |
| SearchDslParserTest.java | Added comprehensive test coverage for default field/operator scenarios including edge cases and validation |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| private static String normalizeDefaultOperator(String operator) { | ||
| if (operator == null || operator.trim().isEmpty()) { | ||
| return "or"; // Default to OR | ||
| } | ||
| String normalized = operator.trim().toLowerCase(); | ||
| if ("and".equals(normalized) || "or".equals(normalized)) { | ||
| return normalized; | ||
| } | ||
| throw new IllegalArgumentException("Invalid default operator: " + operator | ||
| + ". Must be 'and' or 'or'"); | ||
| } |
Copilot
AI
Oct 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The null/empty check and trimming logic is duplicated. Consider extracting the trimming and null handling: String trimmed = (operator == null) ? \"\" : operator.trim(); if (trimmed.isEmpty()) return \"or\"; This reduces redundancy and makes the intent clearer.
| if (upperRemaining.startsWith("AND ") || upperRemaining.startsWith("AND\t") | ||
| || (upperRemaining.equals("AND") && i + 3 >= dsl.length())) { |
Copilot
AI
Oct 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The operator matching logic for AND, OR, and NOT (lines 269-317) contains duplicated patterns. Consider extracting a helper method matchOperator(String remaining, String operator, int length) to reduce code duplication and improve maintainability.
| return upper.matches(".*\\s+(AND|OR)\\s+.*") | ||
| || upper.matches("^NOT\\s+.*") | ||
| || upper.matches(".*\\s+NOT\\s+.*"); |
Copilot
AI
Oct 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple regex matches on the same string can be inefficient. Consider using a single compiled Pattern or combining these patterns into one regex: Pattern.matches(\"^NOT\\s+.*|.*\\s+(AND|OR|NOT)\\s+.*\", upper) to improve performance for frequently called code.
| // for incomplete DSL like "foo" (no field specified) | ||
| String dsl = "foo"; | ||
|
|
||
| // This should throw an exception because "foo" alone is not valid DSL |
Copilot
AI
Oct 24, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is misleading. The test expects an exception not because \"foo\" is invalid DSL in general, but because with an empty default field, the expansion fails. Clarify that the exception occurs due to the empty default field preventing proper DSL expansion.
| // This should throw an exception because "foo" alone is not valid DSL | |
| // This should throw an exception because with an empty default field, the parser cannot expand "foo" into a valid fielded query |
TPC-DS: Total hot run time: 189640 ms |
ClickBench: Total hot run time: 27.7 s |
FE Regression Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
### What problem does this PR solve?
Problem Summary:
* Extend `search()` to accept **1–3 arguments**:
1. `query`: the original search DSL string (unchanged).
2. `default_field` *(optional)*: field name applied to unqualified terms
in `query`.
3. `default_operator` *(optional)*: boolean operator for multi-term
queries; accepts `"and"` or `"or"` (case-insensitive).
* If omitted or empty, the operator defaults to **`or`**.
* Parser updates:
* When `default_field` is provided, unqualified terms in `query` are
automatically rewritten to `default_field:term`.
* `default_operator` is validated and normalized; invalid values produce
a clear error.
### Example
**Before** (must qualify every term or rely on engine defaulting rules):
```sql
SELECT *
FROM t
WHERE search('title:payment title:timeout')
```
**After** (use `default_field = "title"` and `default_operator =
"and"`):
```sql
SELECT *
FROM t
WHERE search('payment timeout', 'title', 'and');
```
This evaluates as if the query were `title:payment AND title:timeout`.
#57324) Cherry-picked from #57312 Co-authored-by: Jack <[email protected]>
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Extend
search()to accept 1–3 arguments:query: the original search DSL string (unchanged).default_field(optional): field name applied to unqualified terms inquery.default_operator(optional): boolean operator for multi-term queries; accepts"and"or"or"(case-insensitive).or.Parser updates:
default_fieldis provided, unqualified terms inqueryare automatically rewritten todefault_field:term.default_operatoris validated and normalized; invalid values produce a clear error.Example
Before (must qualify every term or rely on engine defaulting rules):
After (use
default_field = "title"anddefault_operator = "and"):This evaluates as if the query were
title:payment AND title:timeout.Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)