Commit e7e973e

Zakelly authored and masteryhx committed
[FLINK-34458][checkpointing] Rename options for Generalized incremental checkpoints (changelog) (apache#24324)
1 parent 9308e10 commit e7e973e

12 files changed: +88 -69 lines changed

docs/content.zh/docs/deployment/config.md

+2 -2
@@ -466,11 +466,11 @@ Advanced options to tune RocksDB and RocksDB checkpoints.
 ### State Changelog Options

 Please refer to [State Backends]({{< ref "docs/ops/state/state_backends#enabling-changelog" >}}) for information on
-using State Changelog. {{< generated/state_backend_changelog_section >}}
+using State Changelog. {{< generated/state_changelog_section >}}

 #### FileSystem-based Changelog options

-These settings take effect when the `state.backend.changelog.storage` is set to `filesystem` (see [above](#state-backend-changelog-storage)).
+These settings take effect when the `state.changelog.storage` is set to `filesystem` (see [above](#state-changelog-storage)).
 {{< generated/fs_state_changelog_configuration >}}

 **RocksDB Configurable Options**

docs/content.zh/docs/ops/metrics.md

+1 -1
@@ -1734,7 +1734,7 @@ Note that the metrics are only available via reporters.
 </tr>
 <tr>
 <td>changelogBusyTimeMsPerSecond</td>
-<td>The time (in milliseconds) taken by the Changelog state backend to do IO operations, only positive when Changelog state backend is enabled. Please check 'dstl.dfs.upload.max-in-flight' for more information.</td>
+<td>The time (in milliseconds) taken by the Changelog state backend to do IO operations, only positive when Changelog state backend is enabled. Please check 'state.changelog.dstl.dfs.upload.max-in-flight' for more information.</td>
 <td>Gauge</td>
 </tr>
 <tr>

docs/content.zh/docs/ops/state/state_backends.md

+4 -4
@@ -383,7 +383,7 @@ Changelog is a feature that aims to reduce checkpointing time, and can therefore also
 It is worth noting that although Changelog adds a small amount of day-to-day CPU and network bandwidth usage,
 it reduces peak CPU and network bandwidth usage.

-Recovery time is another thing to consider. Depending on the `state.backend.changelog.periodic-materialize.interval` setting, the changelog can become lengthy, so replaying it takes more time. Even so, recovery time plus checkpoint duration may still be lower than without changelog enabled, providing lower end-to-end latency even in the failover case. Of course, depending on the actual ratio of these times, the effective recovery time may also increase.
+Recovery time is another thing to consider. Depending on the `state.changelog.periodic-materialize.interval` setting, the changelog can become lengthy, so replaying it takes more time. Even so, recovery time plus checkpoint duration may still be lower than without changelog enabled, providing lower end-to-end latency even in the failover case. Of course, depending on the actual ratio of these times, the effective recovery time may also increase.

 For more details, see [FLIP-158](https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints).

@@ -401,9 +401,9 @@ Changelog is a feature that aims to reduce checkpointing time, and can therefore also

 Here is an example configuration in YAML:
 ```yaml
-state.backend.changelog.enabled: true
-state.backend.changelog.storage: filesystem # currently, only filesystem and memory (for tests only) are supported
-dstl.dfs.base-path: s3://<bucket-name> # similar to state.checkpoints.dir
+state.changelog.enabled: true
+state.changelog.storage: filesystem # currently, only filesystem and memory (for tests only) are supported
+state.changelog.dstl.dfs.base-path: s3://<bucket-name> # similar to state.checkpoints.dir
 ```

 Please keep the following settings at their default values (see [limitations](#limitations)):

docs/content/docs/deployment/config.md

+2 -2
@@ -468,11 +468,11 @@ Advanced options to tune RocksDB and RocksDB checkpoints.
 ### State Changelog Options

 Please refer to [State Backends]({{< ref "docs/ops/state/state_backends#enabling-changelog" >}}) for information on
-using State Changelog. {{< generated/state_backend_changelog_section >}}
+using State Changelog. {{< generated/state_changelog_section >}}

 #### FileSystem-based Changelog options

-These settings take effect when the `state.backend.changelog.storage` is set to `filesystem` (see [above](#state-backend-changelog-storage)).
+These settings take effect when the `state.changelog.storage` is set to `filesystem` (see [above](#state-changelog-storage)).
 {{< generated/fs_state_changelog_configuration >}}

 **RocksDB Configurable Options**

docs/content/docs/ops/metrics.md

+1 -1
@@ -1724,7 +1724,7 @@ Note that the metrics are only available via reporters.
 </tr>
 <tr>
 <td>changelogBusyTimeMsPerSecond</td>
-<td>The time (in milliseconds) taken by the Changelog state backend to do IO operations, only positive when Changelog state backend is enabled. Please check 'dstl.dfs.upload.max-in-flight' for more information.</td>
+<td>The time (in milliseconds) taken by the Changelog state backend to do IO operations, only positive when Changelog state backend is enabled. Please check 'state.changelog.dstl.dfs.upload.max-in-flight' for more information.</td>
 <td>Gauge</td>
 </tr>
 <tr>

docs/content/docs/ops/state/state_backends.md

+4 -4
@@ -384,7 +384,7 @@ However, resource usage is higher:
 It is worth noting that changelog adds a small amount of daily CPU and network bandwidth resources,
 but reduces peak CPU and network bandwidth usage.

-Recovery time is another thing to consider. Depending on the `state.backend.changelog.periodic-materialize.interval`
+Recovery time is another thing to consider. Depending on the `state.changelog.periodic-materialize.interval`
 setting, the changelog can become lengthy and replaying it may take more time. However, recovery time combined with
 checkpoint duration will likely still be lower than in non-changelog setups, providing lower end-to-end latency even in
 failover case. However, it's also possible that the effective recovery time will increase, depending on the actual ratio
@@ -402,9 +402,9 @@ Make sure to [add]({{< ref "docs/deployment/filesystems/overview" >}}) the neces

 Here is an example configuration in YAML:
 ```yaml
-state.backend.changelog.enabled: true
-state.backend.changelog.storage: filesystem # currently, only filesystem and memory (for tests) are supported
-dstl.dfs.base-path: s3://<bucket-name> # similar to state.checkpoints.dir
+state.changelog.enabled: true
+state.changelog.storage: filesystem # currently, only filesystem and memory (for tests) are supported
+state.changelog.dstl.dfs.base-path: s3://<bucket-name> # similar to state.checkpoints.dir
 ```

 Please keep the following defaults (see [limitations](#limitations)):
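
For readers updating an existing configuration, the renames in this commit amount to two prefix substitutions. The following is a minimal sketch of that mapping, not part of Flink: the helper name `migrate_changelog_keys` is hypothetical, and it assumes the configuration has already been loaded into a flat key-to-value dictionary. Whether the old keys continue to be accepted is not shown in this excerpt.

```python
# Hypothetical helper (not part of Flink): rewrites the option keys renamed in this diff.
# The prefix pairs are taken from the documentation changes shown in this commit.
RENAMED_PREFIXES = [
    ("state.backend.changelog.", "state.changelog."),
    ("dstl.dfs.", "state.changelog.dstl.dfs."),
]


def migrate_changelog_keys(config):
    """Return a copy of a flat key -> value mapping with old changelog keys renamed."""
    migrated = {}
    for key, value in config.items():
        new_key = key
        for old_prefix, new_prefix in RENAMED_PREFIXES:
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        # If the new-style key is already set explicitly, keep it and drop the old one.
        if new_key != key and new_key in config:
            continue
        migrated[new_key] = value
    return migrated


if __name__ == "__main__":
    old = {
        "state.backend.changelog.enabled": "true",
        "state.backend.changelog.storage": "filesystem",
        "dstl.dfs.base-path": "s3://<bucket-name>",
    }
    for k, v in migrate_changelog_keys(old).items():
        print(f"{k}: {v}")
```

Keys that do not start with either old prefix, such as `state.checkpoints.dir`, pass through unchanged.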

docs/layouts/shortcodes/generated/fs_state_changelog_configuration.html

+18 -18
@@ -9,88 +9,88 @@
 </thead>
 <tbody>
 <tr>
-<td><h5>dstl.dfs.base-path</h5></td>
+<td><h5>state.changelog.dstl.dfs.base-path</h5></td>
 <td style="word-wrap: break-word;">(none)</td>
 <td>String</td>
 <td>Base path to store changelog files.</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.batch.persist-delay</h5></td>
+<td><h5>state.changelog.dstl.dfs.batch.persist-delay</h5></td>
 <td style="word-wrap: break-word;">10 ms</td>
 <td>Duration</td>
 <td>Delay before persisting changelog after receiving persist request (on checkpoint). Minimizes the number of files and requests if multiple operators (backends) or sub-tasks are using the same store. Correspondingly increases checkpoint time (async phase).</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.batch.persist-size-threshold</h5></td>
+<td><h5>state.changelog.dstl.dfs.batch.persist-size-threshold</h5></td>
 <td style="word-wrap: break-word;">10 mb</td>
 <td>MemorySize</td>
-<td>Size threshold for state changes that were requested to be persisted but are waiting for dstl.dfs.batch.persist-delay (from all operators). . Once reached, accumulated changes are persisted immediately. This is different from dstl.dfs.preemptive-persist-threshold as it happens AFTER the checkpoint and potentially for state changes of multiple operators. Must not exceed in-flight data limit (see below)</td>
+<td>Size threshold for state changes that were requested to be persisted but are waiting for state.changelog.dstl.dfs.batch.persist-delay (from all operators). . Once reached, accumulated changes are persisted immediately. This is different from state.changelog.dstl.dfs.preemptive-persist-threshold as it happens AFTER the checkpoint and potentially for state changes of multiple operators. Must not exceed in-flight data limit (see below)</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.compression.enabled</h5></td>
+<td><h5>state.changelog.dstl.dfs.compression.enabled</h5></td>
 <td style="word-wrap: break-word;">false</td>
 <td>Boolean</td>
 <td>Whether to enable compression when serializing changelog.</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.discard.num-threads</h5></td>
+<td><h5>state.changelog.dstl.dfs.discard.num-threads</h5></td>
 <td style="word-wrap: break-word;">1</td>
 <td>Integer</td>
 <td>Number of threads to use to discard changelog (e.g. pre-emptively uploaded unused state).</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.download.local-cache.idle-timeout-ms</h5></td>
+<td><h5>state.changelog.dstl.dfs.download.local-cache.idle-timeout-ms</h5></td>
 <td style="word-wrap: break-word;">10 min</td>
 <td>Duration</td>
 <td>Maximum idle time for cache files of distributed changelog file, after which the cache files will be deleted.</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.preemptive-persist-threshold</h5></td>
+<td><h5>state.changelog.dstl.dfs.preemptive-persist-threshold</h5></td>
 <td style="word-wrap: break-word;">5 mb</td>
 <td>MemorySize</td>
 <td>Size threshold for state changes of a single operator beyond which they are persisted pre-emptively without waiting for a checkpoint. Improves checkpointing time by allowing quasi-continuous uploading of state changes (as opposed to uploading all accumulated changes on checkpoint).</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.upload.buffer-size</h5></td>
+<td><h5>state.changelog.dstl.dfs.upload.buffer-size</h5></td>
 <td style="word-wrap: break-word;">1 mb</td>
 <td>MemorySize</td>
 <td>Buffer size used when uploading change sets</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.upload.max-attempts</h5></td>
+<td><h5>state.changelog.dstl.dfs.upload.max-attempts</h5></td>
 <td style="word-wrap: break-word;">3</td>
 <td>Integer</td>
-<td>Maximum number of attempts (including the initial one) to perform a particular upload. Only takes effect if dstl.dfs.upload.retry-policy is fixed.</td>
+<td>Maximum number of attempts (including the initial one) to perform a particular upload. Only takes effect if state.changelog.dstl.dfs.upload.retry-policy is fixed.</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.upload.max-in-flight</h5></td>
+<td><h5>state.changelog.dstl.dfs.upload.max-in-flight</h5></td>
 <td style="word-wrap: break-word;">100 mb</td>
 <td>MemorySize</td>
-<td>Max amount of data allowed to be in-flight. Upon reaching this limit the task will be back-pressured. I.e., snapshotting will block; normal processing will block if dstl.dfs.preemptive-persist-threshold is set and reached. The limit is applied to the total size of in-flight changes if multiple operators/backends are using the same changelog storage. Must be greater than or equal to dstl.dfs.batch.persist-size-threshold</td>
+<td>Max amount of data allowed to be in-flight. Upon reaching this limit the task will be back-pressured. I.e., snapshotting will block; normal processing will block if state.changelog.dstl.dfs.preemptive-persist-threshold is set and reached. The limit is applied to the total size of in-flight changes if multiple operators/backends are using the same changelog storage. Must be greater than or equal to state.changelog.dstl.dfs.batch.persist-size-threshold</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.upload.next-attempt-delay</h5></td>
+<td><h5>state.changelog.dstl.dfs.upload.next-attempt-delay</h5></td>
 <td style="word-wrap: break-word;">500 ms</td>
 <td>Duration</td>
 <td>Delay before the next attempt (if the failure was not caused by a timeout).</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.upload.num-threads</h5></td>
+<td><h5>state.changelog.dstl.dfs.upload.num-threads</h5></td>
 <td style="word-wrap: break-word;">5</td>
 <td>Integer</td>
 <td>Number of threads to use for upload.</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.upload.retry-policy</h5></td>
+<td><h5>state.changelog.dstl.dfs.upload.retry-policy</h5></td>
 <td style="word-wrap: break-word;">"fixed"</td>
 <td>String</td>
 <td>Retry policy for the failed uploads (in particular, timed out). Valid values: none, fixed.</td>
 </tr>
 <tr>
-<td><h5>dstl.dfs.upload.timeout</h5></td>
+<td><h5>state.changelog.dstl.dfs.upload.timeout</h5></td>
 <td style="word-wrap: break-word;">1 s</td>
 <td>Duration</td>
-<td>Time threshold beyond which an upload is considered timed out. If a new attempt is made but this upload succeeds earlier then this upload result will be used. May improve upload times if tail latencies of upload requests are significantly high. Only takes effect if dstl.dfs.upload.retry-policy is fixed. Please note that timeout * max_attempts should be less than execution.checkpointing.timeout</td>
+<td>Time threshold beyond which an upload is considered timed out. If a new attempt is made but this upload succeeds earlier then this upload result will be used. May improve upload times if tail latencies of upload requests are significantly high. Only takes effect if state.changelog.dstl.dfs.upload.retry-policy is fixed. Please note that timeout * max_attempts should be less than execution.checkpointing.timeout</td>
 </tr>
 </tbody>
 </table>
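
The option descriptions in this table state two cross-option constraints: `state.changelog.dstl.dfs.upload.max-in-flight` must be greater than or equal to `state.changelog.dstl.dfs.batch.persist-size-threshold`, and upload timeout multiplied by max attempts should be less than `execution.checkpointing.timeout`. A minimal sketch of checking them before deploying follows. The function name and parameters are hypothetical, and the values are assumed to be already converted to bytes and seconds; this is not how Flink itself validates them.

```python
def check_changelog_upload_settings(
    persist_size_threshold_bytes,
    max_in_flight_bytes,
    upload_timeout_s,
    upload_max_attempts,
    checkpoint_timeout_s,
):
    """Return a list of violated constraints (empty if both hold).

    Hypothetical helper: inputs are plain numbers, already parsed from the
    configured MemorySize / Duration strings.
    """
    problems = []
    # max-in-flight must be >= batch.persist-size-threshold.
    if max_in_flight_bytes < persist_size_threshold_bytes:
        problems.append(
            "upload.max-in-flight is smaller than batch.persist-size-threshold"
        )
    # timeout * max_attempts should be < execution.checkpointing.timeout.
    # Per the table, timeout and max-attempts only apply when retry-policy is "fixed".
    if upload_timeout_s * upload_max_attempts >= checkpoint_timeout_s:
        problems.append(
            "upload.timeout * upload.max-attempts is not below "
            "execution.checkpointing.timeout"
        )
    return problems


if __name__ == "__main__":
    MB = 1024 * 1024
    # Table defaults: 10 mb threshold, 100 mb in flight, 1 s timeout, 3 attempts.
    # The 600 s checkpoint timeout is an illustrative value, not taken from this diff.
    print(check_changelog_upload_settings(10 * MB, 100 * MB, 1, 3, 600))
```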

docs/layouts/shortcodes/generated/state_changelog_configuration.html

+6 -6
@@ -9,31 +9,31 @@
 </thead>
 <tbody>
 <tr>
-<td><h5>state.backend.changelog.enabled</h5></td>
+<td><h5>state.changelog.enabled</h5></td>
 <td style="word-wrap: break-word;">false</td>
 <td>Boolean</td>
 <td>Whether to enable state backend to write state changes to StateChangelog. If this config is not set explicitly, it means no preference for enabling the change log, and the value in lower config level will take effect. The default value 'false' here means if no value set (job or cluster), the change log will not be enabled.</td>
 </tr>
 <tr>
-<td><h5>state.backend.changelog.max-failures-allowed</h5></td>
+<td><h5>state.changelog.max-failures-allowed</h5></td>
 <td style="word-wrap: break-word;">3</td>
 <td>Integer</td>
 <td>Max number of consecutive materialization failures allowed.</td>
 </tr>
 <tr>
-<td><h5>state.backend.changelog.periodic-materialize.enabled</h5></td>
+<td><h5>state.changelog.periodic-materialize.enabled</h5></td>
 <td style="word-wrap: break-word;">true</td>
 <td>Boolean</td>
 <td>Defines whether to enable periodic materialization, all changelogs will not be truncated which may increase the space of checkpoint if disabled</td>
 </tr>
 <tr>
-<td><h5>state.backend.changelog.periodic-materialize.interval</h5></td>
+<td><h5>state.changelog.periodic-materialize.interval</h5></td>
 <td style="word-wrap: break-word;">10 min</td>
 <td>Duration</td>
-<td>Defines the interval in milliseconds to perform periodic materialization for state backend. It only takes effect when state.backend.changelog.periodic-materialize.enabled is true</td>
+<td>Defines the interval in milliseconds to perform periodic materialization for state backend. It only takes effect when state.changelog.periodic-materialize.enabled is true</td>
 </tr>
 <tr>
-<td><h5>state.backend.changelog.storage</h5></td>
+<td><h5>state.changelog.storage</h5></td>
 <td style="word-wrap: break-word;">"memory"</td>
 <td>String</td>
 <td>The storage to be used to store state changelog.<br />The implementation can be specified via their shortcut name.<br />The list of recognized shortcut names currently includes 'memory' and 'filesystem'.</td>

docs/layouts/shortcodes/generated/state_backend_changelog_section.html → docs/layouts/shortcodes/generated/state_changelog_section.html

+6 -6
@@ -9,31 +9,31 @@
 </thead>
 <tbody>
 <tr>
-<td><h5>state.backend.changelog.enabled</h5></td>
+<td><h5>state.changelog.enabled</h5></td>
 <td style="word-wrap: break-word;">false</td>
 <td>Boolean</td>
 <td>Whether to enable state backend to write state changes to StateChangelog. If this config is not set explicitly, it means no preference for enabling the change log, and the value in lower config level will take effect. The default value 'false' here means if no value set (job or cluster), the change log will not be enabled.</td>
 </tr>
 <tr>
-<td><h5>state.backend.changelog.max-failures-allowed</h5></td>
+<td><h5>state.changelog.max-failures-allowed</h5></td>
 <td style="word-wrap: break-word;">3</td>
 <td>Integer</td>
 <td>Max number of consecutive materialization failures allowed.</td>
 </tr>
 <tr>
-<td><h5>state.backend.changelog.periodic-materialize.enabled</h5></td>
+<td><h5>state.changelog.periodic-materialize.enabled</h5></td>
 <td style="word-wrap: break-word;">true</td>
 <td>Boolean</td>
 <td>Defines whether to enable periodic materialization, all changelogs will not be truncated which may increase the space of checkpoint if disabled</td>
 </tr>
 <tr>
-<td><h5>state.backend.changelog.periodic-materialize.interval</h5></td>
+<td><h5>state.changelog.periodic-materialize.interval</h5></td>
 <td style="word-wrap: break-word;">10 min</td>
 <td>Duration</td>
-<td>Defines the interval in milliseconds to perform periodic materialization for state backend. It only takes effect when state.backend.changelog.periodic-materialize.enabled is true</td>
+<td>Defines the interval in milliseconds to perform periodic materialization for state backend. It only takes effect when state.changelog.periodic-materialize.enabled is true</td>
 </tr>
 <tr>
-<td><h5>state.backend.changelog.storage</h5></td>
+<td><h5>state.changelog.storage</h5></td>
 <td style="word-wrap: break-word;">"memory"</td>
 <td>String</td>
 <td>The storage to be used to store state changelog.<br />The implementation can be specified via their shortcut name.<br />The list of recognized shortcut names currently includes 'memory' and 'filesystem'.</td>

flink-annotations/src/main/java/org/apache/flink/annotation/docs/Documentation.java

+1 -1
@@ -79,7 +79,7 @@ public static final class Sections {

 public static final String STATE_LATENCY_TRACKING = "state_latency_tracking";

-public static final String STATE_BACKEND_CHANGELOG = "state_backend_changelog";
+public static final String STATE_CHANGELOG = "state_changelog";

 public static final String EXPERT_CLASS_LOADING = "expert_class_loading";
 public static final String EXPERT_DEBUGGING_AND_TUNING = "expert_debugging_and_tuning";
