[pull] dev from redpanda-data:dev #39
Open
pull wants to merge 10,000 commits into Mu-L:dev from redpanda-data:dev
Conversation
Current version fails to build on Fedora 41. Co-authored-by: Noah Watkins <[email protected]>
cloud_storage: Move s3_imposter to cloud_io
CORE-8937 admin: swagger docs for patch cluster config body
bazel: Update krb5
Check that with redpanda.iceberg.delete=false old table data remains available even before we recreate the topic.
And switch back to normal admin after disruptions are over.
add log lines, fix typos
If we unmount the topic before this, the table may lack metadata.
Introduce "offline mode" that cuts all ties to the topic in Redpanda cluster. It carries on querying the query engine and verifying results using info cached before going into offline mode.
so that the functionality is tested while the topic is being actively used
Make it possible to configure the number of messages produced by the stream.
Add scenarios: 1) On unmount, all messages that made their way into the topic eventually become available via the query engine. 2) Upon remount and further produce, both old and new messages are present in the topic and in the table.
to prevent archiver shutdown while waiting
This is mostly to preserve iceberg properties, but also to make sure any newly introduced topic properties are preserved by default.
Allows it to be used for subscriptions where feedback from the called function is necessary, such as a future or an error code. All functions are supposed to return the same type.
Make offset_monitor more universal so that it can be used for different data types.
Also create and subscribe one such action: flushing data to the cloud.
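To make the described generalization concrete, here is a minimal, self-contained sketch; the class, its `subscribe`/`notify` names, and the `errc` result type are assumptions for illustration, not Redpanda's actual `offset_monitor` API. Subscribers register actions keyed by offset, every action returns the same result type, and the monitor collects that feedback once an offset is reached.

```cpp
// Hypothetical illustration of an offset monitor whose subscribers return a
// value (e.g. an error code) instead of void. Names and types are assumed.
#include <cstdint>
#include <functional>
#include <iostream>
#include <map>
#include <vector>

enum class errc { success, shutting_down };

template <typename Result>
class offset_monitor {
public:
    using callback = std::function<Result(int64_t)>;

    // Register an action to run once 'offset' has been reached.
    void subscribe(int64_t offset, callback cb) {
        _waiters.emplace(offset, std::move(cb));
    }

    // Notify that 'offset' is now reached; run and collect every callback
    // registered at or below it so the caller can inspect the feedback.
    std::vector<Result> notify(int64_t offset) {
        std::vector<Result> results;
        auto end = _waiters.upper_bound(offset);
        for (auto it = _waiters.begin(); it != end; ++it) {
            results.push_back(it->second(offset));
        }
        _waiters.erase(_waiters.begin(), end);
        return results;
    }

private:
    std::multimap<int64_t, callback> _waiters;
};

int main() {
    offset_monitor<errc> mon;
    // One such subscribed action: flush data to the cloud.
    mon.subscribe(10, [](int64_t o) {
        std::cout << "flushing up to offset " << o << " to cloud\n";
        return errc::success;
    });
    auto results = mon.notify(15);
    std::cout << "collected " << results.size() << " result(s)\n";
}
```

In practice the returned value could equally be a future that the caller awaits; the point is only that the subscription now carries feedback back to the subscriber.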
Adds the book-keeping variables `_dirty/closed_segment_bytes` to `disk_log_impl`, as well as some getter/setter functions. These functions will be used throughout `disk_log_impl` where required (segment rolling, compaction, segment eviction) to track the bytes contained in dirty and closed segments.
Uses the added functions `update_dirty/closed_segment_bytes()` in the required locations within `disk_log_impl` in order to bookkeep the dirty ratio. Bytes can be either removed or added by rolling new segments, compaction, and retention enforcement.
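As a rough illustration of this bookkeeping, the sketch below tracks the two counters and derives a dirty ratio from them. The stand-in class and the exact formula (dirty bytes over closed-segment bytes) are assumptions; the real logic lives in `disk_log_impl`.

```cpp
// Stand-in for the dirty/closed segment byte bookkeeping; not disk_log_impl.
#include <cstdint>
#include <iostream>

class segment_byte_tracker {
public:
    // Called when bytes become dirty (e.g. a segment rolls and has not been
    // compacted yet) or are cleaned/removed (compaction, retention).
    void update_dirty_segment_bytes(int64_t delta) { _dirty_segment_bytes += delta; }
    void update_closed_segment_bytes(int64_t delta) { _closed_segment_bytes += delta; }

    int64_t dirty_segment_bytes() const { return _dirty_segment_bytes; }
    int64_t closed_segment_bytes() const { return _closed_segment_bytes; }

    double dirty_ratio() const {
        return _closed_segment_bytes == 0
                 ? 0.0
                 : static_cast<double>(_dirty_segment_bytes) / _closed_segment_bytes;
    }

private:
    int64_t _dirty_segment_bytes = 0;
    int64_t _closed_segment_bytes = 0;
};

int main() {
    segment_byte_tracker t;
    t.update_closed_segment_bytes(1000); // a segment rolled and closed
    t.update_dirty_segment_bytes(1000);  // all of it is still uncompacted
    t.update_dirty_segment_bytes(-400);  // compaction cleaned some bytes
    std::cout << "dirty ratio: " << t.dirty_ratio() << "\n"; // 0.6
}
```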
We were missing a type check in `_make_java_properties()` that would cause CDT to fail for other cloud storage providers.
pandaproxy/sr: Relax restrictions for bundled json schemas
Update and enable large_messages_test (LMT)
This seems to already happen and isn't needed.
Sort of amazing, but I found this and it helps improve caching because debug symbols are now relative to the redpanda repo instead of using absolute paths. I've been using this for a few days and it's been great.
The `parse_rest_error_response` method tries to read fields from the XML response and constructs a `rest_error_response`. If a field is not found or is empty, it defaults to an empty string. https://github.com/redpanda-data/redpanda/blob/d3c2f00c4071c2cbce1e1babdfc2291e3c9898ba/src/v/cloud_storage_clients/s3_client.cc#L411 Google Cloud Storage gives one of these replies, which we parse, but the Error.Code path is not present in the response, so we trip over a bad lexical cast, which results in an error log line. With this commit we default to an unknown error code in that case. We already do the same for codes we don't recognize in the operator>>. lexical_cast does not call operator>> at all for empty strings.
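The shape of the fix can be sketched as follows; the enum, `parse_error_code`, and the surrounding types are illustrative stand-ins rather than the real `s3_client.cc` code, but the guard mirrors the described behavior: an empty Error.Code field short-circuits to an unknown code instead of reaching `boost::lexical_cast`.

```cpp
// Illustrative sketch: skip boost::lexical_cast entirely when the XML field
// is empty and fall back to an "unknown" error code. Names are assumptions.
#include <boost/lexical_cast.hpp>
#include <iostream>
#include <istream>
#include <string>

enum class s3_error_code { unknown, no_such_key, slow_down };

// operator>> is only invoked by lexical_cast for non-empty input, so an empty
// Error.Code field has to be handled before the cast.
std::istream& operator>>(std::istream& in, s3_error_code& code) {
    std::string token;
    in >> token;
    if (token == "NoSuchKey") {
        code = s3_error_code::no_such_key;
    } else if (token == "SlowDown") {
        code = s3_error_code::slow_down;
    } else {
        code = s3_error_code::unknown; // already the behavior for unrecognized codes
    }
    return in;
}

s3_error_code parse_error_code(const std::string& field) {
    if (field.empty()) {
        // GCS responses may omit Error.Code; lexical_cast would throw
        // bad_lexical_cast here because operator>> is never called.
        return s3_error_code::unknown;
    }
    return boost::lexical_cast<s3_error_code>(field);
}

int main() {
    std::cout << static_cast<int>(parse_error_code("")) << "\n";          // unknown
    std::cout << static_cast<int>(parse_error_code("NoSuchKey")) << "\n"; // no_such_key
}
```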
In CI this test was pretty close to the timeout before the switch to debug-mode Seastar; now it has become flaky. Increase the timeout.
bazel: use relative paths for debug symbols
…est-bump-timeout
`storage`: bookkeep `dirty_ratio` in `disk_log_impl`
rpk/chore: Bump Go dependencies.
The Java implementation expects snapshot removal updates to be serialized one at a time[1]. This meant that with Java REST catalogs, we could trigger catalog-side errors like:

```
2025-02-10T19:09:53.865 WARN [org.eclipse.jetty.server.HttpChannel] - /v1/namespaces/redpanda/tables/test
java.lang.IllegalArgumentException: Invalid set of snapshot ids to remove. Expected one value but received: [2205058756266803285, 6389287107599228031, 837858603806954013, 8429544376017231169]
    at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:220)
    at org.apache.iceberg.MetadataUpdateParser.readRemoveSnapshots(MetadataUpdateParser.java:530)
    at org.apache.iceberg.MetadataUpdateParser.fromJson(MetadataUpdateParser.java:300)
    at org.apache.iceberg.rest.requests.UpdateTableRequestParser.lambda$fromJson$2(UpdateTableRequestParser.java:105)
    at java.base/java.lang.Iterable.forEach(Iterable.java:75)
    at org.apache.iceberg.rest.requests.UpdateTableRequestParser.fromJson(UpdateTableRequestParser.java:105)
    at org.apache.iceberg.rest.RESTSerializers$UpdateTableRequestDeserializer.deserialize(RESTSerializers.java:354)
    at org.apache.iceberg.rest.RESTSerializers$UpdateTableRequestDeserializer.deserialize(RESTSerializers.java:349)
    at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4825)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3785)
    at org.apache.iceberg.rest.RESTCatalogServlet$ServletRequestContext.from(RESTCatalogServlet.java:179)
    at org.apache.iceberg.rest.RESTCatalogServlet.doPost(RESTCatalogServlet.java:78)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:554)
```

This updates the snapshot removal action to return its removed snapshots individually. Note that this doesn't mean there are multiple calls to the catalog for removal, only that we serialize multiple lists each with a single removal instead of a single list with multiple removals.

[1] https://github.com/apache/iceberg/blob/3e6da2e5437ffb3f643275927e5580cb9620256b/core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java#L550-L553
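A hedged sketch of the serialization change: rather than one remove-snapshots update carrying every snapshot id, emit one update per id. The JSON layout below follows the Iceberg REST `remove-snapshots` metadata update, but the helper functions are illustrative, not Redpanda's actual serializer.

```cpp
// Illustrative only: emit one "remove-snapshots" table update per snapshot id,
// rather than a single update listing every id, so Java REST catalogs (which
// expect exactly one id per update) accept the request.
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

std::string make_remove_snapshot_update(int64_t snapshot_id) {
    return R"({"action":"remove-snapshots","snapshot-ids":[)"
           + std::to_string(snapshot_id) + "]}";
}

// Before: one update with N ids. After: N updates with one id each.
std::vector<std::string> make_removal_updates(const std::vector<int64_t>& ids) {
    std::vector<std::string> updates;
    updates.reserve(ids.size());
    for (auto id : ids) {
        updates.push_back(make_remove_snapshot_update(id));
    }
    return updates;
}

int main() {
    for (const auto& u :
         make_removal_updates({2205058756266803285, 6389287107599228031})) {
        std::cout << u << "\n";
    }
}
```

There is still a single catalog request; it simply carries several single-snapshot updates instead of one multi-snapshot update.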
Parameterizes test_remove_expired_snapshots on catalog type. This would have caught an incompatibility in which we serialized multiple snapshots per removal update, when the Java impl expected multiple removal updates, each with a single snapshot[1]. This required a change to the Spark call used to view snapshots, since Spark's REST catalog integration doesn't appear to support the "{table_name}.snapshot" system table. [1] https://github.com/apache/iceberg/blob/3e6da2e5437ffb3f643275927e5580cb9620256b/core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java#L550-L553
…ing-conventions Updating search filter based on new naming conventions
We added a batch_size metric, but it has a latency_metric label, which doesn't make sense for this metric. Remove it.
iceberg: serialize snapshot removals individually
cloud_storage_clients: skip lexical_cast on empty strings
Fixing output parsing in check azure instances
kafka-probe: remove latency label on batch_size
Because we now schedule adjacent segment compaction after sliding window compaction, this test was having trouble trying to reach the desired number of segments while producing. Increase the timeout as well as the `log_compaction_interval_ms` to allow the test to reach the desired number of segments.
`rptest`: add credentials type check in `NessieService`
The `datalake_staging` folder was recently moved under the redpanda data directory. It should not be considered a namespace in `compute_size()`.
…_fix [CORE-8848] `rptest`: adjust compaction settings in `datalake/compaction_test`
See Commits and Changes for more details.
Created by pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )