box_events: fix handling of large cursor offsets #14319
When a cursor stream offset is large (at least 1e6), the template renders the value in e-notation. This is a consequence of the cursor being stored as JSON and so being contaminated by JS number semantics. Another threshold exists at 0x1p53 (9.0e15), above which we lose exact integer representation. We do see values as large as 3.0e16, so we are beyond this value and cannot rely on numeric value representation at all. This is exacerbated by the fact that the input converts from string to integer values via float64. To resolve this, explicitly convert the offset to an integer when rendering the value into the parameter, and accept that we may either recollect or miss documents from the API.
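To make the two thresholds concrete, here is a minimal Go sketch. It is not the integration's template code: `renderFloat` and `renderInt` are hypothetical helpers that only illustrate how a float64 prints by default versus after an explicit integer conversion, and how adjacent integers collapse above 2^53.

```go
package main

import (
	"fmt"
	"strconv"
)

// renderFloat (hypothetical helper) shows the default formatting of a
// float64, which is what you get when a JSON-decoded number is printed
// without an explicit conversion.
func renderFloat(f float64) string {
	return fmt.Sprint(f)
}

// renderInt (hypothetical helper) converts to int64 first, so the value
// is printed as plain decimal digits.
func renderInt(f float64) string {
	return strconv.FormatInt(int64(f), 10)
}

func main() {
	fmt.Println(renderFloat(999999))  // 999999
	fmt.Println(renderFloat(1000000)) // 1e+06
	fmt.Println(renderInt(1000000))   // 1000000

	// Above 2^53, neighbouring integers map onto the same float64.
	var n int64 = 1 << 53
	fmt.Println(float64(n) == float64(n+1)) // true
}
```

The integer conversion fixes the rendering, but it cannot restore digits that were already lost once the value passed through float64.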
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
@@ -9,7 +9,8 @@ vars:
# correspond to data_stream
data_stream:
  vars:
    interval: 10s
    stream_type: 'all'
    enable_request_tracer: true
Can you please move enable_request_tracer to be a child of vars instead of data_stream.vars? Currently it's not being honored.
Why is this not identified by elastic-package? ISTM it is something that could (and probably does) happen regularly without mechanical support.
Relates: elastic/elastic-package#2186
"type": "event"
}
],
"next_stream_position": 2152922976252290800
In the request tracer logs I see ?stream_position=2152922976252290816, so I think we have lost precision.
{"log.level":"debug","@timestamp":"2025-06-27T18:08:05.810Z","message":"HTTP request","transaction.id":"HN21F8SIV161G-5","url.original":"http://svc-box-http:8080/2.0/events?stream_position=2152922976252290816&stream_type=all","url.scheme":"http","url.path":"/2.0/events","url.domain":"svc-box-http","url.port":"8080","url.query":"stream_position=2152922976252290816&stream_type=all","http.request.method":"GET","http.request.header":{"Accept":["application/json"],"Authorization":["Bearer c3FIOG9vSGV4VHo4QzAyg5T1JvNnJoZ3ExaVNyQWw6WjRsanRKZG5lQk9qUE1BVQ"],"User-Agent":["Elastic-Filebeat/8.18.2 (linux; arm64; 2651640ff23044732e551dd9139a298e0f833ac1; 2025-05-22 17:09:10 +0000 UTC)"]},"user_agent.original":"Elastic-Filebeat/8.18.2 (linux; arm64; 2651640ff23044732e551dd9139a298e0f833ac1; 2025-05-22 17:09:10 +0000 UTC)","http.request.body.content":"","http.request.body.truncated":false,"http.request.body.bytes":0,"http.request.mime_type":"","ecs.version":"1.6.0"}
This cursor on disk has:
{"k":"httpjson::httpjson-box_events.events-20eb7aed-40ef-4cca-bccb-d27053fcd2dc::http://svc-box-http:8080/2.0/events","v":{"ttl":1800000000000,"updated":[809677759,1751046234],"cursor":{"next_stream_position":"2.1529229762522908e+18"}}}
So I assume that httpjson is not unmarshaling with json.UseNumber. Without using json.Number and avoiding the number -> float64 -> int64 conversion, I'm not sure we can fix this with configuration only.
Yes, the fix here is a reasonably non-invasive fix to something that is the consequence of some quite unfortunate decisions that are spread throughout the agent, the JSON serialisation spec and the data source. This is all discussed in the issue.
It's better than it was.
I think an input change will be necessary to avoid the mandatory conversion to float64 so that we can pass through the next_stream_position as the literal text of the number.
💚 Build Succeeded
cc @efd6
Package box_events - 2.14.1 containing this change is available at https://epr.elastic.co/package/box_events/2.14.1/