Synthetic _source can't be retrieved with overlapping keys in flattened field #122936

Open
flash1293 opened this issue Feb 19, 2025 · 5 comments
Labels
>bug :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine

Comments

@flash1293
Contributor

flash1293 commented Feb 19, 2025

Elasticsearch Version

main

Installed Plugins

No response

Java Version

bundled

OS Version

macOS 15.3

Problem Description

When a flattened field is used in a logsdb index, synthetic source is used to reconstruct the value of that field. If the same key path resolves to a scalar in one place and to an object in another (for example, a dotted key "nested.doublynested" next to a nested object at nested.doublynested), the search request fails because the _source can't be constructed.

Steps to Reproduce

PUT my-index
{
  "settings": {
    "mode": "logsdb"
  },
  "mappings": {
    "properties": {
        "test": {
            "type": "flattened"
        }
    }
  }
}

POST my-index/_doc
{
  "@timestamp": "2025",
  "test": {
      "nested.doublynested": 123,
      "nested": {
        "doublynested": {
          "gotcha": true
        }
      }
    }
}

GET my-index/_search // returns 500

Logs (if relevant)

failure encoding chunk com.fasterxml.jackson.core.JsonParseException: Duplicate field 'doublynested'
   │       at [Source: (org.elasticsearch.common.io.stream.ByteBufferStreamInput); line: 1, column: 112]
   │      	at com.fasterxml.jackson.core.json.JsonReadContext._checkDup(JsonReadContext.java:250)
   │      	at com.fasterxml.jackson.core.json.JsonReadContext.setCurrentName(JsonReadContext.java:244)
   │      	at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:758)
   │      	at com.fasterxml.jackson.core.JsonGenerator._copyCurrentContents(JsonGenerator.java:2657)
   │      	at com.fasterxml.jackson.core.JsonGenerator.copyCurrentStructure(JsonGenerator.java:2638)
   │      	at org.elasticsearch.xcontent.provider.json.JsonXContentGenerator.copyCurrentStructure(JsonXContentGenerator.java:540)
   │      	at org.elasticsearch.xcontent.provider.json.JsonXContentGenerator.writeRawField(JsonXContentGenerator.java:475)
   │      	at org.elasticsearch.xcontent.provider.json.JsonXContentGenerator.writeRawField(JsonXContentGenerator.java:466)
   │      	at org.elasticsearch.xcontent.XContentBuilder.rawField(XContentBuilder.java:1205)
   │      	at org.elasticsearch.common.xcontent.XContentHelper.writeRawField(XContentHelper.java:578)
   │      	at org.elasticsearch.search.SearchHit.toInnerXContent(SearchHit.java:856)
   │      	at org.elasticsearch.search.SearchHit.toXContent(SearchHit.java:801)
   │      	at org.elasticsearch.rest.ChunkedRestResponseBodyPart$1.encodeChunk(ChunkedRestResponseBodyPart.java:161)
   │      	at org.elasticsearch.rest.RestController$EncodedLengthTrackingChunkedRestResponseBodyPart.encodeChunk(RestController.java:1002)
   │      	at org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.writeChunk(Netty4HttpPipeliningHandler.java:440)
   │      	at org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.doWriteChunkedResponse(Netty4HttpPipeliningHandler.java:267)
   │      	at org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.doWrite(Netty4HttpPipeliningHandler.java:235)
   │      	at org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.write(Netty4HttpPipeliningHandler.java:188)
   │      	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:891)
   │      	at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:956)
   │      	at io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1263)
   │      	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
   │      	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
   │      	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
   │      	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
   │      	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
   │      	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
   │      	at java.base/java.lang.Thread.run(Thread.java:1575)
@flash1293 flash1293 added >bug needs:triage Requires assignment of a team area label labels Feb 19, 2025
@kkrik-es kkrik-es added Team:StorageEngine :StorageEngine/Mapping The storage related side of mappings labels Feb 19, 2025
@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Feb 19, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@lkts lkts changed the title Logsdb _source can't be retrieved with overlapping keys in flattened field Synthetic _source can't be retrieved with overlapping keys in flattened field Feb 19, 2025
@lkts
Contributor

lkts commented Feb 19, 2025

When I test this locally, I get the following _source instead of an error, for some reason.

"test": {
    "nested": {
        "doublynested": {
            "gotcha": "true"
        }
    }
}

@lkts
Contributor

lkts commented Feb 19, 2025

What is the commit you are testing with?

@lkts
Contributor

lkts commented Feb 19, 2025

We don't know whether a field inside a flattened object was originally "flat" (with dots in its name) or not. We only see the full key name and have to assume it came from an object structure.
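
To illustrate the point above, here is a minimal Python sketch of the information loss. It is a simplified model, not the Elasticsearch implementation: the `flatten` helper is hypothetical, and it assumes that a flattened field keeps only dotted key paths with string leaf values.

```python
def flatten(value, prefix=""):
    """Yield (dotted_path, leaf_as_string) pairs.

    Hypothetical stand-in for how a flattened field is assumed to keep
    only key paths and string values.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from flatten(child, path)
    else:
        # Booleans are lowercased so they look like JSON literals.
        leaf = str(value).lower() if isinstance(value, bool) else str(value)
        yield prefix, leaf


dotted = {"nested.doublynested": 123}
nested = {"nested": {"doublynested": 123}}

# Both inputs collapse to the same key path, so the original shape is lost.
print(list(flatten(dotted)))
print(list(flatten(nested)))
```

Under this model the two documents are indistinguishable after indexing, which is why reconstruction has to pick one shape.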

This is what is produced as synthetic source before it fails or gets transformed somewhere later. It does not really make sense, because "doublynested" appears twice under "nested".

{
    "@timestamp": "2025-01-01T00:00:00.000Z",
    "test": {
        "nested": {
            "doublynested": "123",
            "doublynested": {
                "gotcha": "true"
            }
        }
    }
}
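
As a rough illustration of why the output above is rejected, the sketch below parses the same JSON in Python with a strict duplicate-key check. This only mirrors the idea behind the `Duplicate field 'doublynested'` error in the logs; `reject_duplicates` is a hypothetical helper and not how Jackson or Elasticsearch implement the check.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that fails on repeated keys within one object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"Duplicate field '{key}'")
        seen[key] = value
    return seen


synthetic_source = """
{
    "@timestamp": "2025-01-01T00:00:00.000Z",
    "test": {
        "nested": {
            "doublynested": "123",
            "doublynested": {
                "gotcha": "true"
            }
        }
    }
}
"""

# Strict parsing: the repeated key is treated as an error.
try:
    json.loads(synthetic_source, object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)

# Lenient parsing: Python's default decoder keeps the last occurrence.
print(json.loads(synthetic_source)["test"]["nested"])
```

Whether a reader fails or silently keeps one of the values depends on how strict the parser is, so the same bytes can behave differently in different code paths.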

Is this something you need to rely on? IMO the best we can do now is to document this as not supported. I don't see an easy way to fix this and it is not obvious (at least to me) if this needs to be supported at all.

@flash1293
Contributor Author

This is the version I'm using:

{
  "name": "Joes-MacBook-Pro.local",
  "cluster_name": "elasticsearch",
  "cluster_uuid": "P1VQxqR9TIGGumh3wwYTPQ",
  "version": {
    "number": "9.1.0-SNAPSHOT",
    "build_flavor": "default",
    "build_type": "tar",
    "build_hash": "03271bdfe8b0e469a87fd9be9f8ca1053d7a33d4",
    "build_date": "2025-02-07T03:05:27.318109504Z",
    "build_snapshot": true,
    "lucene_version": "10.1.0",
    "minimum_wire_compatibility_version": "8.19.0",
    "minimum_index_compatibility_version": "8.0.0"
  },
  "tagline": "You Know, for Search"
}

I'm running it as part of the Kibana dev setup (not sure whether that matters)

> Is this something you need to rely on? IMO the best we can do now is to document this as not supported. I don't see an easy way to fix this and it is not obvious (at least to me) if this needs to be supported at all.

I don't have a use case for this right now, so it's OK for me if it's not supported. IMHO we shouldn't throw on this, but maybe that already works and my local version is just a little stale.

I could imagine fixing this by detecting the case and flattening fields out only if necessary. But as mentioned, probably not worth it.
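
A minimal sketch of that idea in Python, under stated assumptions: the input is a mapping of unique dotted key paths to string values, and `rebuild` is a hypothetical helper, not Elasticsearch code. It nests keys into objects as usual and falls back to a dotted key only where nesting would collide with an existing scalar.

```python
def rebuild(leaves):
    """Nest dotted paths into objects; keep the rest of a path flat when
    descending would require replacing a scalar with an object."""
    root = {}
    # Sorting by segments guarantees a path is handled before any longer
    # path that extends it.
    for path, value in sorted(leaves.items(), key=lambda kv: kv[0].split(".")):
        segments = path.split(".")
        node = root
        for i, segment in enumerate(segments[:-1]):
            if segment in node and not isinstance(node[segment], dict):
                # A scalar already lives here: emit the remainder as a dotted key.
                node[".".join(segments[i:])] = value
                break
            node = node.setdefault(segment, {})
        else:
            node[segments[-1]] = value
    return root


leaves = {
    "nested.doublynested": "123",
    "nested.doublynested.gotcha": "true",
}
print(rebuild(leaves))
```

For the document in this issue, the sketch yields `doublynested` as a scalar plus a flat `doublynested.gotcha` key under `nested`, so no key is emitted twice. It does not restore the original shape, which is not recoverable.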

4 participants