Description
Elasticsearch Version
main (9.1.0)
Installed Plugins
No response
Java Version
bundled
OS Version
Linux 6.8.0-60-generic #63~22.04.1-Ubuntu
Problem Description
#129600 fixed an issue where a flattened field produced incorrect synthetic source when a key had both a scalar and an object value.
Unfortunately, there are still situations in which this bug can occur: when one key part is a single `.` and another is a character that sorts lexicographically before `.`, such as `#`.
For example, consider the key/value pairs:
a.b.#|5
a.b..|10
Because `#` comes before `.`, this is the order in which the flattened key/value pairs will be sorted.
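The ordering can be checked with a minimal sketch. It assumes that plain `String` ordering is a fair stand-in for how the flattened key/value pairs are ordered; the class and method names are hypothetical and this is not the Elasticsearch code path.

```java
import java.util.List;
import java.util.TreeSet;

public class KeyOrderDemo {
    // Sorts keys by natural String order. For ASCII keys this compares
    // character codes, so '#' (0x23) sorts before '.' (0x2E).
    // Assumption: this approximates the ordering of the stored pairs.
    static List<String> sortedKeys(List<String> keys) {
        return List.copyOf(new TreeSet<>(keys));
    }

    public static void main(String[] args) {
        // prints [a.b.#, a.b..]
        System.out.println(sortedKeys(List.of("a.b..", "a.b.#")));
    }
}
```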
When `a.b..` is parsed into parts, the trailing dots are removed, so the parsed path is `[a, b]` (because `String.split` does not produce empty strings for trailing delimiters). But `a.b.#` is parsed as `[a, b, #]`. Thus we have a scalar/object mismatch: `b` has a scalar value of 10 and an object value of `{#: 5}`. But because the key/value pair with the object comes first, the existing logic does not handle this correctly.
One option would be to use a negative value for `limit` in `String.split("\\.", limit)` when parsing the path. This would cause trailing dots to be treated as separators between empty-string keys. Though this is likely not the intended behavior, it would be consistent with how dots at the beginning of a path are handled.
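The difference between the two `String.split` forms can be seen in a small standalone sketch. The class and method names are hypothetical and only illustrate the JDK behavior; they are not taken from the Elasticsearch source.

```java
import java.util.Arrays;

public class PathSplitDemo {
    // Default split: trailing empty strings are dropped from the result.
    static String[] splitDefault(String path) {
        return path.split("\\.");
    }

    // Negative limit: trailing empty strings are kept.
    static String[] splitKeepTrailing(String path) {
        return path.split("\\.", -1);
    }

    public static void main(String[] args) {
        // prints [a, b]
        System.out.println(Arrays.toString(splitDefault("a.b..")));
        // prints [a, b, #]
        System.out.println(Arrays.toString(splitDefault("a.b.#")));
        // prints [a, b, , ]
        System.out.println(Arrays.toString(splitKeepTrailing("a.b..")));
        // Leading dots already yield an empty-string part: prints [, a, b]
        System.out.println(Arrays.toString(splitDefault(".a.b")));
    }
}
```

With a negative limit, `a.b..` and `a.b.#` no longer collapse to a shared `[a, b]` prefix with conflicting scalar and object values, at the cost of introducing empty-string keys.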
Steps to Reproduce
The following reproduces the issue:
curl -X PUT "localhost:9200/my-index" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "mode": "logsdb",
    "index.mapping.source.mode": "synthetic"
  },
  "mappings": {
    "properties": {
      "test": {
        "type": "flattened"
      }
    }
  }
}
' | jq

curl -X POST "localhost:9200/my-index/_doc" -H 'Content-Type: application/json' -d'
{
  "@timestamp": "2025",
  "test": {
    "a.#": 5,
    "a..": 10
  }
}
' | jq

curl localhost:9200/my-index/_search | jq
This returns the following value, in which the entry for `a.#` is missing:
"test": {
  "a": "10"
}
Logs (if relevant)
No response