Add more info re: WAL failover probe file, logging #20129

Open · wants to merge 2 commits into base: main
14 changes: 10 additions & 4 deletions src/current/_includes/v25.3/wal-failover-intro.md
@@ -2,8 +2,14 @@ On a CockroachDB [node]({% link {{ page.version.version }}/architecture/overview

Failing over the WAL may allow some operations against a store to continue to complete despite temporary unavailability of the underlying storage. For example, if the node's primary store is stalled, and the node can't read from or write to it, the node can still write to the WAL on another store. This can allow the node to continue to service requests during momentary unavailability of the underlying storage device.

When WAL failover is enabled, CockroachDB will take the the following actions:
When WAL failover is enabled, CockroachDB:

- At node startup, each store is assigned another store to be its failover destination.
- CockroachDB will begin monitoring the latency of all WAL writes. If latency to the WAL exceeds the value of the [cluster setting `storage.wal_failover.unhealthy_op_threshold`]({% link {{page.version.version}}/cluster-settings.md %}#setting-storage-wal-failover-unhealthy-op-threshold), the node will attempt to write WAL entries to a secondary store's volume.
- CockroachDB will update the [store status endpoint]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#store-status-endpoint) at `/_status/stores` so you can monitor the store's status.
- Pairs each primary store with a secondary failover store at node startup.
- Monitors latency of all write operations against the primary WAL. If any operation exceeds [`storage.wal_failover.unhealthy_op_threshold`]({% link {{page.version.version}}/cluster-settings.md %}#setting-storage-wal-failover-unhealthy-op-threshold), the node redirects new WAL writes to the secondary store.
- While failed over, checks the health of the primary store by performing a set of filesystem operations against a small internal 'probe file' on its volume. This file contains no user data and exists only when WAL failover is enabled.
- Switches back to the primary store once filesystem operations against the probe file on the primary volume complete within a latency threshold (on the order of tens of milliseconds). If a probe `fsync` blocks for longer than [`COCKROACH_ENGINE_MAX_SYNC_DURATION_DEFAULT`]({% link {{ page.version.version }}/wal-failover.md %}#important-environment-variables), CockroachDB emits a log message like `disk stall detected: sync on file probe-file has been ongoing for 40.0s`; if the stall persists, the node exits (fatals) to [shed leases]({% link {{ page.version.version }}/architecture/replication-layer.md %}#how-leases-are-transferred-from-a-dead-node) and allow recovery elsewhere.
- Exposes status at [`/_status/stores`]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#store-status-endpoint) so you can monitor each store's health and failover state.
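
As a rough sketch of how you might exercise these knobs (assuming an insecure local node on the default ports; the threshold value, addresses, and log path below are illustrative, not defaults or recommendations):

```shell
# Tune the latency threshold that triggers WAL failover (value is illustrative).
cockroach sql --insecure --host=localhost:26257 \
  --execute="SET CLUSTER SETTING storage.wal_failover.unhealthy_op_threshold = '100ms';"

# Inspect per-store health and failover state via the store status endpoint.
curl -s http://localhost:8080/_status/stores

# Look for the probe-file stall message in the node's logs (path is illustrative).
grep "disk stall detected" cockroach-data/logs/cockroach.log
```

On a secure cluster, the HTTP endpoint requires authentication, and `cockroach sql` takes `--certs-dir` instead of `--insecure`.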

{{site.data.alerts.callout_info}}
- WAL failover only relocates the WAL. Data files remain on the primary volume. Reads that miss the Pebble block cache and the OS page cache can still stall if the primary disk is stalled; caches typically limit blast radius, but some reads may see elevated latency.
{{site.data.alerts.end}}
6 changes: 1 addition & 5 deletions src/current/v25.3/cockroach-start.md
@@ -71,7 +71,7 @@ Flag | Description
<a name="flags-max-tsdb-memory"></a>`--max-tsdb-memory` | Maximum memory capacity available to store temporary data for use by the time-series database to display metrics in the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}). Consider raising this value if your cluster is comprised of a large number of nodes where individual nodes have very limited memory available (e.g., under `8 GiB`). Insufficient memory capacity for the time-series database can constrain the ability of the DB Console to process the time-series queries used to render metrics for the entire cluster. This capacity constraint does not affect SQL query execution. This flag accepts numbers interpreted as bytes, size suffixes (e.g., `1GB` and `1GiB`) or a percentage of physical memory (e.g., `0.01`).<br><br>**Note:** The sum of `--cache`, `--max-sql-memory`, and `--max-tsdb-memory` should not exceed 75% of the memory available to the `cockroach` process.<br><br>**Default:** `0.01` (i.e., 1%) of physical memory or `64 MiB`, whichever is greater.
`--pid-file` | The file to which the node's process ID will be written as soon as the node is ready to accept connections. When `--background` is used, this happens before the process detaches from the terminal. When this flag is not set, the process ID is not written to file.
<a name="flags-store"></a> `--store`<br>`-s` | The file path to a storage device and, optionally, store attributes and maximum size. When using multiple storage devices for a node, this flag must be specified separately for each device, for example: <br><br>`--store=/mnt/ssd01 --store=/mnt/ssd02` <br><br>For more details, see [Store](#store) below.
`--wal-failover` <a name="flag-wal-failover"></a> | Used to configure [WAL failover](#write-ahead-log-wal-failover) on [nodes]({% link {{ page.version.version }}/architecture/overview.md %}#node) with [multiple stores](#store). To enable WAL failover, pass `--wal-failover=among-stores`. To disable, pass `--wal-failover=disabled` on [node restart]({% link {{ page.version.version }}/node-shutdown.md %}#stop-and-restart-a-node). This feature is in [preview]({% link {{page.version.version}}/cockroachdb-feature-availability.md %}#features-in-preview).
`--wal-failover` <a name="flag-wal-failover"></a> | Used to configure [WAL failover](#write-ahead-log-wal-failover) on [nodes]({% link {{ page.version.version }}/architecture/overview.md %}#node) with [multiple stores](#store). To enable WAL failover, pass `--wal-failover=among-stores`. To disable, pass `--wal-failover=disabled` on [node restart]({% link {{ page.version.version }}/node-shutdown.md %}#stop-and-restart-a-node).
<a name="flags-spatial-libs"></a>`--spatial-libs` | The location on disk where CockroachDB looks for [spatial]({% link {{ page.version.version }}/spatial-data-overview.md %}) libraries.<br/><br/>**Defaults:** <br/><ul><li>`/usr/local/lib/cockroach`</li><li>A `lib` subdirectory of the CockroachDB binary's current directory.</li></ul>
`--temp-dir` <a name="temp-dir"></a> | The path of the node's temporary store directory. On node start up, the location for the temporary files is printed to the standard output. <br><br>**Default:** Subdirectory of the first [store](#store)
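
For reference, a minimal sketch of a multi-store start command with WAL failover enabled; every path, address, and join target below is a placeholder rather than a recommendation:

```shell
# Start a node with two stores and allow the WAL to fail over among them.
# All paths and addresses are placeholders.
cockroach start \
  --certs-dir=certs \
  --store=/mnt/ssd01 \
  --store=/mnt/ssd02 \
  --wal-failover=among-stores \
  --listen-addr=localhost:26257 \
  --http-addr=localhost:8080 \
  --join=localhost:26257,localhost:26258,localhost:26259

# On a later restart, pass --wal-failover=disabled to turn the feature off.
```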

@@ -237,10 +237,6 @@ Field | Description

{% include {{ page.version.version }}/wal-failover-intro.md %}

{{site.data.alerts.callout_info}}
{% include feature-phases/preview.md %}
{{site.data.alerts.end}}

This page has basic instructions on how to enable WAL failover, disable WAL failover, and monitor WAL failover.

For more detailed instructions showing how to use, test, and monitor WAL failover, as well as descriptions of how WAL failover works in multi-store configurations, see [WAL Failover]({% link {{ page.version.version }}/wal-failover.md %}).