Flattened fields have scalar/object mismatch if keys start with non-letter characters #130395

Open
@parkertimmins

Elasticsearch Version

main (9.1.0)

Installed Plugins

No response

Java Version

bundled

OS Version

Linux 6.8.0-60-generic #63~22.04.1-Ubuntu

Problem Description

#129600 fixed an issue where a flattened field produced incorrect synthetic source when a key had both a scalar and an object value.

Unfortunately, there are still situations in which this bug can occur.

This happens when, after a shared prefix, one key continues with a single . and another continues with a character that sorts lexicographically before ., such as #.
For example, consider the key/value pairs:

a.b.#|5
a.b..|10

Because # comes before ., this is the order in which the flattened key/value pairs are sorted.
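To make the ordering concrete, here is a small sketch in plain Java. It only illustrates the character ordering of # and . using natural String order; the class and method names are hypothetical, and the actual sort inside Elasticsearch operates on its own encoded key/value representation, which this does not reproduce.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of Elasticsearch.
final class KeyOrderDemo {

    // Returns the given keys sorted in natural String order.
    static List<String> sortedKeys(String... keys) {
        List<String> sorted = new ArrayList<>(List.of(keys));
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // '#' is code point 35 and '.' is code point 46.
        System.out.println((int) '#' + " < " + (int) '.'); // 35 < 46

        // The key ending in '#' therefore sorts before the key ending in '.'.
        System.out.println(sortedKeys("a.b..", "a.b.#")); // [a.b.#, a.b..]
    }
}
```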

When a.b.. is parsed into parts, the trailing dots are dropped, so the parsed path is [a, b]. This is because String.split does not produce empty strings between trailing delimiters.

But a.b.# is parsed as [a, b, #]. Thus we have a scalar/object mismatch: b has a scalar value of 10 and an object value of {#: 5}. Because the key/value pair with the object comes first, the existing logic does not handle this correctly.
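The split behavior described above can be checked in isolation. This is a minimal sketch, assuming the path is split with a plain String.split("\\."); parsePath is a hypothetical stand-in, not the actual Elasticsearch method.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Elasticsearch.
final class PathSplitDemo {

    // Stand-in for the path parsing step: split on literal dots with the
    // default limit of 0, which discards trailing empty strings.
    static String[] parsePath(String key) {
        return key.split("\\.");
    }

    public static void main(String[] args) {
        // The '#' segment is kept, so 'b' ends up holding an object.
        System.out.println(Arrays.toString(parsePath("a.b.#"))); // [a, b, #]

        // Both trailing empty segments are dropped, so 'b' ends up holding a scalar.
        System.out.println(Arrays.toString(parsePath("a.b.."))); // [a, b]
    }
}
```

Both keys resolve to a path through b, one as a parent of # and one as a leaf, which is the mismatch.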

One option would be to use a negative value for limit in String.split("\\.", limit) when parsing the path. This would cause trailing dots to be treated as separated by empty-string keys. Though this is likely not the intended behavior, it would be consistent with how dots at the beginning of a path are handled.
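A sketch of what the negative limit changes, again using a hypothetical helper rather than the real parsing code. It also shows the existing leading-dot behavior that this would be consistent with.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Elasticsearch.
final class NegativeLimitDemo {

    // Same split as before, but a negative limit keeps trailing empty strings.
    static String[] parsePathKeepingEmpties(String key) {
        return key.split("\\.", -1);
    }

    public static void main(String[] args) {
        // Trailing dots now yield empty-string segments instead of vanishing.
        System.out.println(Arrays.toString(parsePathKeepingEmpties("a.b.."))); // [a, b, , ]

        // Keys without trailing dots are unaffected.
        System.out.println(Arrays.toString(parsePathKeepingEmpties("a.b.#"))); // [a, b, #]

        // With the default limit, a leading dot already yields a leading empty segment.
        System.out.println(Arrays.toString(".a".split("\\."))); // [, a]
    }
}
```

With this change a.b.. would parse to a path with two empty-string keys under b, so b would hold an object for both keys rather than a scalar for one of them.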

Steps to Reproduce

The following reproduces the issue:


curl -X PUT "localhost:9200/my-index" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "mode": "logsdb",
    "index.mapping.source.mode": "synthetic"
  },
  "mappings": {
    "properties": {
        "test": {
            "type": "flattened"
        }
    }
  }
}
' | jq


curl -X POST "localhost:9200/my-index/_doc" -H 'Content-Type: application/json' -d'
{
  "@timestamp": "2025",
  "test": {
    "a.#": 5,
    "a..": 10
  } 
}
' | jq

curl localhost:9200/my-index/_search | jq

This returns the value:

"test": {
  "a": "10"
}

Logs (if relevant)

No response
