Don't normalize fields of type text when the index mode is LogsDB or TSDB #131317

Open: wants to merge 8 commits into base: main

@Kubik42 Kubik42 commented Jul 15, 2025

This changes the default behavior for norms on text fields in logsdb and tsdb indices. Prior to this change, norms were enabled by default, with the option to disable them through manual configuration. After this change, norms are disabled by default. Note that because we don't support enabling norms once they are disabled (code), users will not be able to enable norms on text fields in logsdb and tsdb indices unless they enable them explicitly from the start.
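To make the new default and the one-way update rule concrete, here is a minimal Python sketch. It is a model of the behavior described above, not Elasticsearch code: the function names and the `message` field are hypothetical, while `index.mode` (with the values `logsdb` and `time_series`) and the `norms` mapping parameter are the real settings involved.

```python
# Hypothetical model of the behavior described in this PR; not Elasticsearch code.

# index.mode values for LogsDB and TSDB indices.
NORMS_OFF_BY_DEFAULT_MODES = {"logsdb", "time_series"}


def default_norms(index_mode: str) -> bool:
    """Default for `norms` on a text field after this change (assumed model)."""
    return index_mode not in NORMS_OFF_BY_DEFAULT_MODES


def resolve_norms(index_mode: str, field_mapping: dict) -> bool:
    """An explicit `norms` value in the field mapping wins over the default."""
    return field_mapping.get("norms", default_norms(index_mode))


def can_update_norms(current: bool, requested: bool) -> bool:
    """Norms may be disabled later, but not re-enabled once disabled."""
    return current or not requested


def create_index_body(index_mode: str, enable_norms: bool) -> dict:
    """Create-index request body that sets `norms` explicitly at creation time.

    The field name `message` is only an illustration.
    """
    return {
        "settings": {"index": {"mode": index_mode}},
        "mappings": {
            "properties": {
                "message": {"type": "text", "norms": enable_norms},
            }
        },
    }


body = create_index_body("logsdb", enable_norms=True)
field = body["mappings"]["properties"]["message"]
print(resolve_norms("logsdb", field))               # explicit opt-in is honored
print(resolve_norms("logsdb", {"type": "text"}))    # falls back to the new default
```

Under this model, a user who wants norms on a text field in a logsdb or tsdb index has to set `norms: true` when the field is first mapped, because `can_update_norms(False, True)` is rejected.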

In the context of logsdb and tsdb, few fields are configured as type text, and since both index modes are geared towards storage reduction, it makes sense not to store norms for text fields.

Closes #129183

During benchmarking, there was a minor improvement in disk usage:

|         Metric |        Baseline    |        Contender |              Diff |   Unit |    Diff % |
|---------------:|-------------------:|-----------------:|------------------:|-------:|----------:|
|  Dataset size  |       91.4448      |     89.1111      |      -2.33367     |     GB |    -2.55% |
|    Store size  |       91.4448      |     89.1111      |      -2.33367     |     GB |    -2.55% |
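As a quick sanity check on the table, the percentage can be recomputed from the rounded sizes shown. This sketch uses only the figures above; the small difference from the reported -2.33367 GB is presumably because the benchmark tool worked from unrounded values, which is an assumption.

```python
# Recompute the diff from the rounded Baseline/Contender values in the table.
baseline_gb = 91.4448
contender_gb = 89.1111

diff_gb = contender_gb - baseline_gb
diff_pct = diff_gb / baseline_gb * 100

print(f"diff: {diff_gb:.4f} GB ({diff_pct:.2f}%)")
```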

More benchmarks and details available here.

With this change, we're also replacing the existing TextParams.norms with Parameter.normsParam, to be more in line with the existing functions in that class. This change is covered by the existing tests.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Collaborator

Hi @Kubik42, I've created a changelog YAML for you. Note that since this PR is labelled >breaking, you need to update the changelog YAML to fill out the extended information sections.

@elasticsearchmachine
Collaborator

Hi @Kubik42, I've updated the changelog YAML for you. Note that since this PR is labelled >breaking, you need to update the changelog YAML to fill out the extended information sections.

Development

Successfully merging this pull request may close these issues.

Disable norms by default on text field when logsdb/tsdb enabled
2 participants