Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] dev from redpanda-data:dev #585

Merged
merged 5 commits into from
Feb 11, 2025
Merged

[pull] dev from redpanda-data:dev #585

merged 5 commits into from
Feb 11, 2025

Conversation

pull[bot]
Copy link

@pull pull bot commented Feb 11, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

nvartolomei and others added 5 commits February 7, 2025 22:01
`parse_rest_error_response` method tries to read fields from xml response
and constructs `rest_error_response`. If a field is not found or is
empty then it defaults to empty string.

https://github.com/redpanda-data/redpanda/blob/d3c2f00c4071c2cbce1e1babdfc2291e3c9898ba/src/v/cloud_storage_clients/s3_client.cc#L411

Google Cloud Storage gives one of these replies which we parse but the
Error.Code path is not present in the response and we trip with a bad
lexical cast which results in an error log line.

With this commit we'll default to unknown error code in that case. We
already do the same for codes we don't recognize in the operator>>.
lexical_cast does not call the operator>> at all for empty strings.
The Java implementation expects snapshot removal updates to be
serialized one at a time[1]. This meant that with Java REST catalogs, we
could trigger catalog-side errors like:

2025-02-10T19:09:53.865 WARN  [org.eclipse.jetty.server.HttpChannel] - /v1/namespaces/redpanda/tables/test
java.lang.IllegalArgumentException: Invalid set of snapshot ids to remove.
Expected one value but received: [2205058756266803285, 6389287107599228031, 837858603806954013, 8429544376017231169]
    at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:220)
    at org.apache.iceberg.MetadataUpdateParser.readRemoveSnapshots(MetadataUpdateParser.java:530)
    at org.apache.iceberg.MetadataUpdateParser.fromJson(MetadataUpdateParser.java:300)
    at org.apache.iceberg.rest.requests.UpdateTableRequestParser.lambda$fromJson$2(UpdateTableRequestParser.java:105)
    at java.base/java.lang.Iterable.forEach(Iterable.java:75)
    at org.apache.iceberg.rest.requests.UpdateTableRequestParser.fromJson(UpdateTableRequestParser.java:105)
    at org.apache.iceberg.rest.RESTSerializers$UpdateTableRequestDeserializer.deserialize(RESTSerializers.java:354)
    at org.apache.iceberg.rest.RESTSerializers$UpdateTableRequestDeserializer.deserialize(RESTSerializers.java:349)
    at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4825)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3785)
    at org.apache.iceberg.rest.RESTCatalogServlet$ServletRequestContext.from(RESTCatalogServlet.java:179)
    at org.apache.iceberg.rest.RESTCatalogServlet.doPost(RESTCatalogServlet.java:78)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:554)

This updates the snapshot removal action to return its removed snapshots
individually. Note that this doesn't mean there are multiple calls to
the catalog for removal, only that we serialize multiple lists each with
a single removal instead of a single list with multiple removals.

[1] https://github.com/apache/iceberg/blob/3e6da2e5437ffb3f643275927e5580cb9620256b/core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java#L550-L553
Parameterizes test_remove_expired_snapshots on catalog type. This would
have caught an incompatibility in which we serialized multiple snapshots
per removal update, when the Java impl expected multiple removal
updates, each with single snapshot[1].

This required a change to the Spark call to see snapshots, since Spark's
REST catalog integration doesn't appear to support the
"{table_name}.snapshot" system table.

[1] https://github.com/apache/iceberg/blob/3e6da2e5437ffb3f643275927e5580cb9620256b/core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java#L550-L553
iceberg: serialize snapshot removals individually
cloud_storage_clients: skip lexical_cast on empty strings
@pull pull bot added the ⤵️ pull label Feb 11, 2025
@pull pull bot merged commit eebad47 into kokizzu:dev Feb 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants