doc: add witness related docs #12403

Draft: wants to merge 3 commits into base: master
2 changes: 2 additions & 0 deletions TOC.md
@@ -165,6 +165,8 @@
- [Maintain TiDB Using TiUP](/maintain-tidb-using-tiup.md)
- [Modify Configuration Dynamically](/dynamic-config.md)
- [Online Unsafe Recovery](/online-unsafe-recovery.md)
- [Use Witness Replicas to Save Costs](/use-witness-to-save-costs.md)
- [Use Witness Replicas to Speed Up Failover](/use-witness-to-speed-up-failover.md)
- [Replicate Data Between Primary and Secondary Clusters](/replicate-between-primary-and-secondary-clusters.md)
- Monitor and Alert
- [Monitoring Framework Overview](/tidb-monitoring-framework.md)
32 changes: 31 additions & 1 deletion configure-placement-rules.md
@@ -12,7 +12,7 @@ aliases: ['/docs/dev/configure-placement-rules/','/docs/dev/how-to/configure/pla

Placement Rules, introduced in v5.0, is a replica rule system that guides PD to generate corresponding schedules for different types of data. By combining different scheduling rules, you can finely control the attributes of any continuous data range, such as the number of replicas, the storage location, the host type, whether to participate in Raft election, and whether to act as the Raft leader.

The Placement Rules feature is enabled by default in v5.0 and later versions of TiDB. To disable it, refer to [Disable Placement Rules](#disable-placement-rules).

## Rule system

@@ -37,6 +37,7 @@ The following table shows the meaning of each field in a rule:
| `StartKey` | `string`, in hexadecimal form | Applies to the starting key of a range. |
| `EndKey` | `string`, in hexadecimal form | Applies to the ending key of a range. |
| `Role` | `string` | Replica roles, including voter/leader/follower/learner. |
| `IsWitness` | `true`/`false` | Whether the replica is a [Witness](/glossary.md#witness) replica. |
| `Count` | `int`, positive integer | The number of replicas. |
| `LabelConstraint` | `[]Constraint` | Filters nodes based on the label. |
| `LocationLabels` | `[]string` | Used for physical isolation. |
@@ -486,3 +487,32 @@ The rule group:
"override": true,
}
```

### Scenario 6: Configure Witness replicas in a highly reliable storage environment

The following rules show how to configure `IsWitness`. They use Amazon EBS as an example of a highly reliable storage environment, where configuring one of the three voters as a [Witness](/glossary.md#witness) replica saves costs.

The rule is as follows:

```json
[
    {
        "group_id": "pd",
        "id": "default",
        "start_key": "",
        "end_key": "",
        "role": "voter",
        "is_witness": false,
        "count": 2
    },
    {
        "group_id": "pd",
        "id": "witness",
        "start_key": "",
        "end_key": "",
        "role": "voter",
        "is_witness": true,
        "count": 1
    }
]
```
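
The two rules above can also be generated programmatically. The following Python snippet is a minimal sketch, not part of PD or pd-ctl: the helper name `build_witness_rules` and its parameters are hypothetical, and it only reproduces the field names and values shown in the JSON above.

```python
import json


def build_witness_rules(total_replicas=3, witness_count=1):
    """Return one rule for normal voters and one rule for Witness voters.

    Hypothetical helper: it only mirrors the fields used in the rule above.
    """
    if not 0 < witness_count < total_replicas:
        raise ValueError("witness_count must be between 1 and total_replicas - 1")
    base = {"group_id": "pd", "start_key": "", "end_key": "", "role": "voter"}
    return [
        {**base, "id": "default", "is_witness": False,
         "count": total_replicas - witness_count},
        {**base, "id": "witness", "is_witness": True, "count": witness_count},
    ]


rules = build_witness_rules()
rule_json = json.dumps(rules, indent=4)  # text that could be written to rule.json
print(rule_json)
```

With the default arguments, the snippet produces two normal voters and one Witness voter, matching the three-replica layout of this scenario.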
9 changes: 9 additions & 0 deletions glossary.md
@@ -159,3 +159,12 @@ Because TiKV is a distributed storage system, it requires a global timing servic
### TTL

[Time to live (TTL)](/time-to-live.md) is a feature that allows you to manage TiDB data lifetime at the row level. For a table with the TTL attribute, TiDB automatically checks data lifetime and deletes expired data at the row level.

## W

### Witness

A Witness replica stores only the most recent Raft logs, which it uses for majority confirmation. It does not store data. Witness replicas are applicable to the following scenarios:

- Save costs in a highly reliable storage environment. For more details, see [Use Witness replicas to save costs](/use-witness-to-save-costs.md).
- Quickly recover from any failure to improve system availability. For more details, see [Use Witness replicas to speed up failover](/use-witness-to-speed-up-failover.md).
20 changes: 20 additions & 0 deletions pd-configuration-file.md
@@ -222,6 +222,11 @@ Configuration items related to scheduling
+ Controls the time interval between the `split` and `merge` operations on the same Region. That means a newly split Region will not be merged for a while.
+ Default value: `1h`

### `switch-witness-interval` <span class="version-mark">New in v7.0.0</span>

+ Controls the minimum time interval between switching a replica to non-Witness and switching it back to [Witness](/glossary.md#witness) on the same Region. That means a Region newly switched to non-Witness cannot be switched to Witness for a while.
+ Default value: `1h`
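
The cooldown described above can be pictured with a small Python sketch. This is an illustration only, not PD's implementation: the function name `can_switch_to_witness`, the timestamp bookkeeping, and the inclusive boundary are all assumptions.

```python
from datetime import datetime, timedelta

# Mirrors the documented default of `1h`.
SWITCH_WITNESS_INTERVAL = timedelta(hours=1)


def can_switch_to_witness(last_switched_to_non_witness, now,
                          interval=SWITCH_WITNESS_INTERVAL):
    """Return True once the cooldown since the last switch has elapsed.

    Hypothetical helper: how PD actually tracks the last switch is not
    described in this document.
    """
    if last_switched_to_non_witness is None:
        return True
    return now - last_switched_to_non_witness >= interval


t0 = datetime(2023, 1, 1, 12, 0)
too_early = can_switch_to_witness(t0, t0 + timedelta(minutes=30))
late_enough = can_switch_to_witness(t0, t0 + timedelta(minutes=90))
print(too_early, late_enough)
```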

### `max-snapshot-count`

+ Controls the maximum number of snapshots that a single store receives or sends at the same time. PD schedulers depend on this configuration to prevent the resources used for normal traffic from being preempted.
@@ -277,6 +282,21 @@ Configuration items related to scheduling
+ The number of the `Region Merge` scheduling tasks performed at the same time. Set this parameter to `0` to disable `Region Merge`.
+ Default value: `8`

### `witness-schedule-limit` <span class="version-mark">New in v7.0.0</span>

+ Controls the concurrency of Witness scheduling tasks.
+ Default value: `4`
+ Minimum value: `1`
+ Maximum value: `9`

### `enable-witness` <span class="version-mark">New in v7.0.0</span>

+ Controls whether to enable the Witness replica feature.
+ Witness replicas are applicable to the following scenarios:
    - Save costs in a highly reliable storage environment. For more details, see [Use Witness replicas to save costs](/use-witness-to-save-costs.md).
    - Quickly recover from any failure to improve system availability. For more details, see [Use Witness replicas to speed up failover](/use-witness-to-speed-up-failover.md).
+ Default value: `false`

### `high-space-ratio`

+ The threshold ratio below which the capacity of the store is sufficient. If the space occupancy ratio of the store is smaller than this threshold value, PD ignores the remaining space of the store when performing scheduling, and balances load mainly based on the Region size. This configuration takes effect only when `region-score-formula-version` is set to `v1`.
12 changes: 12 additions & 0 deletions pd-control.md
@@ -140,6 +140,7 @@ Usage:
},
"schedule": {
"enable-cross-table-merge": "true",
"enable-witness": "true",
"high-space-ratio": 0.7,
"hot-region-cache-hits-threshold": 3,
"hot-region-schedule-limit": 4,
@@ -1088,6 +1089,17 @@ unsafe remove-failed-stores show
]
```

To enable the [Witness replica](/glossary.md#witness) feature, run the following command:

```bash
config set enable-witness true
```

Witness replicas are applicable to the following scenarios:

- Save costs in a highly reliable storage environment. For more details, see [Use Witness replicas to save costs](/use-witness-to-save-costs.md).
- Quickly recover from any failure to improve system availability. For more details, see [Use Witness replicas to speed up failover](/use-witness-to-speed-up-failover.md).
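
When the command is issued from a script rather than from an interactive PD Control session, the argument list can be assembled as in the following Python sketch. The helper name `enable_witness_command` is hypothetical, the PD address is a placeholder, and the sketch assumes that `pd-ctl` is on the `PATH` and that the PD address is passed with `-u`. The snippet only builds and prints the command; it does not run it.

```python
import shlex


def enable_witness_command(pd_addr=None, enable=True):
    """Build the pd-ctl argument list for toggling `enable-witness`.

    Hypothetical helper: adjust the binary name and address to your deployment.
    """
    args = ["pd-ctl"]
    if pd_addr:
        args += ["-u", pd_addr]
    args += ["config", "set", "enable-witness", "true" if enable else "false"]
    return args


cmd = enable_witness_command("http://127.0.0.1:2379")  # placeholder address
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` if you want the script to execute the command.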

## Jq formatted JSON output usage

### Simplify the output of `store`
48 changes: 48 additions & 0 deletions use-witness-to-save-costs.md
@@ -0,0 +1,48 @@
---
title: Use Witness Replicas to Save Costs
summary: Learn how to use Witness replicas to save costs in a highly reliable storage environment.
---

# Use Witness Replicas to Save Costs

This document describes how to use Witness replicas to save costs in a highly reliable storage environment. If you need to use Witness replicas to improve the durability when a TiKV node is down, refer to [Use Witness replicas to speed up failover](/use-witness-to-speed-up-failover.md).

## Feature description

In cloud environments, it is recommended to use Amazon Elastic Block Store (EBS) with 99.8%~99.9% durability or Persistent Disk of Google Cloud Platform (GCP) with 99.99%~99.999% durability as the storage of each TiKV node. In this case, using three Raft replicas with TiKV is possible but not necessary. To reduce costs, TiKV introduces the Witness replica, which implements the "2 Replicas With 1 Log Only" mechanism: two replicas store data as usual, and the third (the Witness) stores only Raft logs and does not apply them, while data consistency is still ensured through the Raft protocol. Compared with the standard three-replica architecture, using a Witness replica saves storage resources and CPU usage.
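
As a rough illustration of the storage saving, the following Python sketch compares full copies of the data under both layouts. It is back-of-the-envelope arithmetic only: the data size is a hypothetical figure, the function names are made up for this example, and the space taken by the Raft logs on the Witness replica is ignored.

```python
def full_data_copies(replicas=3, witnesses=0):
    """Number of replicas that store a full copy of the data."""
    return replicas - witnesses


def storage_saving_ratio(replicas=3, witnesses=1):
    """Fraction of full-data copies avoided (Witness Raft logs are ignored)."""
    return witnesses / replicas


data_gib = 900  # hypothetical size of one full copy of the data
standard = full_data_copies(3, 0) * data_gib
with_witness = full_data_copies(3, 1) * data_gib
print(standard, with_witness, f"{storage_saving_ratio():.0%}")
```

Under these assumptions, one of three full copies is avoided. The actual saving depends on how many Raft logs the Witness replica retains.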

> **Warning:**
>
> The Witness replica is introduced in v6.6.0 and is not compatible with previous versions. Downgrading is not supported.

## User scenarios

In a highly reliable storage environment, such as Amazon EBS (99.8%~99.9% durability) or Persistent Disk of GCP (99.99%~99.999% durability), you can enable and configure Witness replicas to save costs.

## Usage

### Step 1: Enable Witness

To enable Witness, use PD Control to run the `config set enable-witness true` command:

```bash
pd-ctl config set enable-witness true
```

If the command returns `Success`, the Witness replica feature is enabled. If you have not configured Witness replicas using Placement Rules, no Witness replicas are created by default. In that case, a Witness replica is added only when a TiKV node goes down: it is added immediately and promoted to a normal Voter later.

### Step 2: Configure Witness replicas

Assume that three replicas are present. Modify `rule.json` to the configuration in [Scenario 6: Configure Witness replicas in a highly reliable storage environment](/configure-placement-rules.md#scenario-6-configure-witness-replicas-in-a-highly-reliable-storage-environment).

After editing the file, use the following command to save the configuration to the PD server:

```bash
pd-ctl config placement-rules save --in=rule.json
```
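
Before saving the file, it can be useful to check that the rules describe the intended layout. The following Python sketch is one way to do that, under the assumption that three voters are expected. The function `check_witness_rules` and the inline sample are hypothetical, and PD might enforce additional constraints that are not checked here.

```python
import json

# Inline sample mirroring the rules from Scenario 6; in practice, read rule.json.
SAMPLE_RULES = """
[
  {"group_id": "pd", "id": "default", "start_key": "", "end_key": "",
   "role": "voter", "is_witness": false, "count": 2},
  {"group_id": "pd", "id": "witness", "start_key": "", "end_key": "",
   "role": "voter", "is_witness": true, "count": 1}
]
"""


def check_witness_rules(text, expected_voters=3):
    """Return a list of problems found in a placement-rule JSON document."""
    rules = json.loads(text)
    voters = [r for r in rules if r.get("role") == "voter"]
    total = sum(r.get("count", 0) for r in voters)
    witnesses = sum(r.get("count", 0) for r in voters if r.get("is_witness"))
    problems = []
    if total != expected_voters:
        problems.append(f"expected {expected_voters} voters, found {total}")
    if witnesses == 0:
        problems.append("no Witness replica configured")
    return problems


problems = check_witness_rules(SAMPLE_RULES)
print(problems or "rules look consistent")
```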

## Notes

- It is recommended to configure Witness replicas only in a highly reliable storage environment, for example, when TiKV nodes use Amazon EBS with 99.8%~99.9% durability or Persistent Disk of GCP with 99.99%~99.999% durability as their storage.
- Because a Witness replica does not apply Raft logs, it cannot provide read and write services. When the Leader is down and the remaining Voters do not have the latest Raft logs, Raft elects the Witness replica as the Leader. After being elected, the Witness replica sends Raft logs to the Voters and transfers leadership to a Voter. If the Witness replica cannot transfer leadership in time, the application might receive an `IsWitness` error after the Backoff timeout.
- When there is a pending Voter in the system, to prevent the Witness replica from accumulating too many Raft logs and occupying the entire disk space, the system will promote the Witness replica to a normal Voter.
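
The election behavior in the notes above can be sketched as a toy model in Python. This is a deliberate simplification, not TiKV's Raft implementation: the `Replica` class, the `elect` and `fail_over` functions, and the tie-breaking rule that prefers a normal Voter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    last_log_index: int
    is_witness: bool = False


def elect(replicas):
    """Pick the replica with the most up-to-date log (toy rule).

    On ties, a normal Voter is preferred; this is an assumption of the sketch.
    """
    return max(replicas, key=lambda r: (r.last_log_index, not r.is_witness))


def fail_over(survivors):
    """Elect a leader; a Witness leader ships its logs and then hands over."""
    leader = elect(survivors)
    steps = [f"{leader.name} elected"]
    if leader.is_witness:
        voters = [r for r in survivors if not r.is_witness]
        for voter in voters:
            voter.last_log_index = leader.last_log_index
            steps.append(f"{leader.name} sends logs to {voter.name}")
        leader = voters[0]
        steps.append(f"leadership transferred to {leader.name}")
    return leader, steps


# The old Leader is down; the remaining Voter lags behind the Witness.
survivors = [Replica("voter-2", 8), Replica("witness", 10, is_witness=True)]
leader, steps = fail_over(survivors)
print(leader.name, steps)
```

In this model, the window between the Witness being elected and leadership being transferred is when requests cannot be served, which corresponds to the `IsWitness` error mentioned above.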
30 changes: 30 additions & 0 deletions use-witness-to-speed-up-failover.md
@@ -0,0 +1,30 @@
---
title: Use Witness Replicas to Speed Up Failover
summary: Learn how to use a Witness replica to speed up failover.
---

# Use Witness Replicas to Speed Up Failover

This document describes how to use Witness replicas to improve durability when a TiKV node is down. If you need to use Witness replicas to save costs in a high-reliability storage environment, refer to [Use Witness replicas to save costs](/use-witness-to-save-costs.md).

## Feature description

The Witness feature can be used to quickly recover from a failure (failover), which improves system availability and data durability. For example, in a Raft group of three replicas, if one replica fails, the group still meets the majority requirement but is fragile. Recovering a new member takes a long time, especially when the Region snapshot is large, because the process requires copying the snapshot first and then applying the latest logs. In addition, copying replicas might put more pressure on unhealthy group members. Adding a Witness replica makes it possible to quickly remove the unhealthy node. This reduces the risk of the Raft group becoming unavailable because of another node failure while a new member is being recovered (a Learner replica cannot participate in election or log commit), and ensures the safety of logs during recovery.
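
The fragility described above follows from Raft majority arithmetic, which the following Python sketch works through. The function names are made up for this example, and the sketch counts only Voters, because a Learner that is still catching up does not count toward the majority.

```python
def quorum(voter_count):
    """Smallest majority of a Raft group with `voter_count` voters."""
    return voter_count // 2 + 1


def extra_failures_tolerated(voter_count, healthy_voters):
    """How many more voters can fail before the group loses its majority."""
    return max(healthy_voters - quorum(voter_count), 0)


# One of three voters is down and its replacement is still a Learner:
# the failed voter is still a member, so only two healthy voters count.
during_learner_recovery = extra_failures_tolerated(3, 2)

# A Witness voter has replaced the failed node: three healthy voters again.
with_witness = extra_failures_tolerated(3, 3)

print(during_learner_recovery, with_witness)
```

While the replacement is a Learner, the group cannot tolerate another failure. Once a Witness has replaced the failed node, the group can again tolerate one failure.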

> **Warning:**
>
> The Witness replica is introduced in v6.6.0 and is not compatible with previous versions. Downgrading is not supported.

## User scenarios

In a scenario where you want to quickly recover from any failure to improve durability, you can enable Witness without configuring a Witness replica.

## Usage

To enable Witness, use PD Control to run the `config set enable-witness true` command:

```bash
pd-ctl config set enable-witness true
```

If the command returns `Success`, the Witness replica feature is enabled. If you have not configured Witness replicas according to [Use Witness replicas to save costs](/use-witness-to-save-costs.md), no Witness replicas are created by default. In that case, a Witness replica is added only when a TiKV node goes down: it is added immediately and promoted to a normal Voter later.