I've been using the latest edge and am starting to see errors on the indexers.
I'm starting to see a lot of
quickwit-indexer-4 quickwit 2025-01-20T19:26:37.105Z WARN quickwit_ingest::ingest_v2::ingester: failed to persist records to ingester `quickwit-indexer-4`: write-ahead log memory buffer is full: capacity: 6.4 GB, usage: 6.4 GB, requested: 10.2 MB
which seem to trigger these
quickwit-indexer-4 quickwit 2025-01-20T19:26:49.270Z WARN quickwit_ingest::ingest_v2::router: failed to persist records on ingester `quickwit-indexer-2`: too many requests
quickwit-indexer-4 quickwit 2025-01-20T19:26:49.303Z ERROR quickwit_ingest::ingest_v2::router: failed to persist records on ingester `quickwit-indexer-1`: too many requests
quickwit-indexer-4 quickwit 2025-01-20T19:26:49.738Z WARN quickwit_ingest::ingest_v2::router: failed to persist records on ingester `quickwit-indexer-1`: too many requests
quickwit-indexer-4 quickwit 2025-01-20T19:26:49.794Z WARN quickwit_ingest::ingest_v2::router: failed to persist records on ingester `quickwit-indexer-2`: too many requests
I also see these with the indexes
quickwit-indexer-3 quickwit 2025-01-20T20:16:54.033Z WARN quickwit_indexing::actors::merge_planner: Rebuilding the known split ids set ended up not halving its size. Please report. This is likely a bug, please report. known_split_ids_len_after=286 known_split_ids_len_before=355
My configmap is
I do have a bunch of indexes, which are all basically the same with dynamic mapping.
doc mapping
index settings
There are 6 indexes that get a decent amount of traffic: around 761 MB/s and 203619 docs/s.
Are things too overloaded?
On my disks, nothing jumps out at me suggesting the EBS disks are IO-bound or hitting max IOPS. I've tried giving them more throughput and IOPS, but it hasn't helped.
I'm using
@mzupan if the WAL is full, it likely means that indexing cannot keep up with your ingestion rate. I don't think you mentioned the resource spec of your indexers (except that there are 6 of them), but 761MB/s is a pretty high throughput 😅. As a rule of thumb, we usually estimate the indexing rate at around 7.5MB/s/core (see the sizing docs).
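Purely as a back-of-the-envelope check against that rule of thumb (the 7.5MB/s/core figure is an estimate, so treat this as an order-of-magnitude number, not a sizing recommendation):

$$\frac{761\ \text{MB/s}}{7.5\ \text{MB/s per core}} \approx 101\ \text{cores} \quad\Rightarrow\quad \frac{101\ \text{cores}}{6\ \text{indexers}} \approx 17\ \text{cores per indexer}$$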
Our WAL implementation is such that we only accept records if we have space in memory and on disk.
At 761 MB/s spread across 6 indexers, each indexer ingests roughly 127 MB/s, so 6 GB / (761 MB/s ÷ 6) ≈ 47 seconds on average to fill up the WAL memory buffer of an indexer, but your commit timeout is 60 seconds. Either increase the WAL memory buffer size (max_queue_memory_usage) to 8 GB or decrease your commit timeout to 45s.
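For reference, a minimal sketch of where those two knobs live, assuming the standard Quickwit node config and index config layout; the disk limit value below is purely illustrative:

```yaml
# quickwit.yaml (node config): per-indexer WAL limits for ingest V2
ingest_api:
  # in-memory WAL buffer (the "write-ahead log memory buffer is full" limit)
  max_queue_memory_usage: 8GiB
  # on-disk WAL limit; records are only accepted if both memory and disk have room
  # (illustrative value, not a recommendation)
  max_queue_disk_usage: 16GiB
```

```yaml
# index config: commit more often so the WAL drains before it fills up
indexing_settings:
  commit_timeout_secs: 45
```

Note that a shorter commit timeout generally means more, smaller splits, so the merge pipeline will have a bit more work to do downstream.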