
Commit 33c4d20

Merge pull request #1799 from EnterpriseDB/bdr/josh/public-content-import
2 parents d2249e6 + 669ffb8 commit 33c4d20

39 files changed: +15277 -63 lines

.gitignore

Lines changed: 1 addition & 0 deletions

@@ -81,3 +81,4 @@ product_docs/content/
 product_docs/content_build/
 static/nginx_redirects.generated
 temp_kubernetes/
+temp_bdr/

product_docs/docs/bdr/3.7/appusage.mdx

Lines changed: 693 additions & 0 deletions
Large diffs are not rendered by default.

product_docs/docs/bdr/3.7/backup.mdx

Lines changed: 7 additions & 0 deletions

---
title: Backup and Recovery
originalFilePath: backup.md

---

<AuthenticatedContentPlaceholder target="https://documentation.2ndquadrant.com/bdr3-enterprise/release/latest/backup/" topic="Backup and Recovery" />

product_docs/docs/bdr/3.7/camo.mdx

Lines changed: 8 additions & 0 deletions

---
navTitle: Commit at Most Once
title: Commit At Most Once
originalFilePath: camo.md

---

<AuthenticatedContentPlaceholder target="https://documentation.2ndquadrant.com/bdr3-enterprise/release/latest/camo/" topic="Commit At Most Once" />
product_docs/docs/bdr/3.7/camo_clients.mdx

Lines changed: 8 additions & 0 deletions

---
navTitle: 'Appendix E: CAMO Reference Clients'
title: 'Appendix E: CAMO Reference Client Implementations'
originalFilePath: camo_clients.md

---

<AuthenticatedContentPlaceholder target="https://documentation.2ndquadrant.com/bdr3-enterprise/release/latest/camo_clients/" topic="Appendix E: CAMO Reference Client Implementations" />
product_docs/docs/bdr/3.7/catalogs.mdx

Lines changed: 7 additions & 0 deletions

---
title: Catalogs and Views
originalFilePath: catalogs.md

---

<AuthenticatedContentPlaceholder target="https://documentation.2ndquadrant.com/bdr3-enterprise/release/latest/catalogs/" topic="Catalogs and Views" />
product_docs/docs/bdr/3.7/column-level-conflicts.mdx

Lines changed: 350 additions & 0 deletions

---
navTitle: Column-Level Conflict Resolution
title: Column-Level Conflict Detection
originalFilePath: column-level-conflicts.md

---
By default, conflicts are resolved at the row level: when changes from two
nodes conflict, we pick either the local or the remote tuple and discard the
other one. For example, we may compare the commit timestamps of the two
conflicting changes and keep the newer one. This ensures that all nodes
converge to the same result and establishes commit-order-like semantics on
the whole cluster.

However, in some cases it may be appropriate to resolve conflicts at the
column level rather than the row level.
Consider a simple example, where we have a table "t" with two integer
columns "a" and "b", and a single row `(1,1)`. Assume that on one node
we execute:

```sql
UPDATE t SET a = 100
```

...while on another node we concurrently (before receiving the preceding
`UPDATE`) execute:

```sql
UPDATE t SET b = 100
```

This results in an `UPDATE-UPDATE` conflict. With the `update_if_newer`
conflict resolution, we compare the commit timestamps and keep the newer
row version. Assuming the second node committed last, we end up with
`(1,100)`, effectively discarding the change to column "a".
For many use cases this is the desired and expected behaviour, but for
some it may be an issue. Consider, for example, a multi-node cluster where
each part of the application is connected to a different node, updating a
dedicated subset of columns in a shared table. In that case, the different
components may step on each other's toes, overwriting each other's changes.

For such use cases, it may be more appropriate to resolve conflicts on
a given table at the column level. To achieve that, BDR tracks the
timestamp of the last change for each column separately, and uses that
to pick the most recent value (essentially `update_if_newer`).

Applied to the previous example, we end up with `(100,100)` on both
nodes, despite neither of the nodes ever seeing such a row.
When thinking about column-level conflict resolution, it may be useful
to see tables as vertically partitioned, so that each update affects
data in only one slice. This eliminates conflicts between changes to
different subsets of columns. In fact, vertical partitioning may even
be a practical alternative to column-level conflict resolution.
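For illustration only, here is a minimal sketch of that vertical-partitioning
alternative, splitting the example table into two tables that share a key
(the table names `t_a` and `t_b`, and the added `id` key column, are
hypothetical):

```sql
-- Hypothetical vertical split of t(a, b): each application component updates
-- only its own table, so changes to "a" and "b" can never conflict with each
-- other at the row level.
CREATE TABLE t_a (id INT PRIMARY KEY, a INT);
CREATE TABLE t_b (id INT PRIMARY KEY, b INT);
```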
Column-level conflict resolution requires the table to have `REPLICA
IDENTITY FULL`. The `bdr.alter_table_conflict_detection` function checks
this and fails with an error otherwise.
## Enabling and Disabling Column-Level Conflict Resolution

Column-level conflict resolution is managed by the
[bdr.alter_table_conflict_detection()](conflicts#bdralter_table_conflict_detection)
function.

### Example

To illustrate how `bdr.alter_table_conflict_detection()` is used, consider
this example, which creates a trivial table `test_table` and then enables
column-level conflict resolution on it:
```sql
db=# CREATE TABLE my_app.test_table (id SERIAL PRIMARY KEY, val INT);
CREATE TABLE

db=# ALTER TABLE my_app.test_table REPLICA IDENTITY FULL;
ALTER TABLE

db=# SELECT bdr.alter_table_conflict_detection(
db(# 'my_app.test_table'::regclass, 'column_modify_timestamp', 'cts');
 alter_table_conflict_detection
--------------------------------
 t

db=# \d my_app.test_table
```
You will see that the function adds a new `cts` column (as specified in
the function call), but it also creates two triggers (`BEFORE INSERT`
and `BEFORE UPDATE`) that are responsible for maintaining timestamps
in the new column before each change.

Also worth mentioning is that the new column specifies `NOT NULL`
with a default value, which means that `ALTER TABLE ... ADD COLUMN`
does not perform a table rewrite.

*Note*: We discourage using columns with the `bdr.column_timestamps` data type
for other purposes, as doing so may have various negative effects
(it switches the table to column-level conflict resolution, which will
not work correctly without the triggers, etc.).
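As a quick way to confirm what the function created, the table's triggers can
be listed from the standard `pg_trigger` catalog. This is only a sketch; the
`pgl_clcd_` prefix mentioned under "Limitations" below is what you would
expect to see, but the exact trigger names may vary by version:

```sql
-- List the non-internal triggers on the table after enabling column-level
-- conflict detection; expect the BEFORE INSERT / BEFORE UPDATE pair.
SELECT tgname
FROM pg_trigger
WHERE tgrelid = 'my_app.test_table'::regclass
  AND NOT tgisinternal
ORDER BY tgname;
```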
### Listing Tables with Column-Level Conflict Resolution

Tables having column-level conflict resolution enabled can be listed
with the following query, which detects the presence of a column of
type `bdr.column_timestamps`:
```sql
SELECT nc.nspname, c.relname
FROM pg_attribute a
JOIN (pg_class c JOIN pg_namespace nc ON c.relnamespace = nc.oid)
  ON a.attrelid = c.oid
JOIN (pg_type t JOIN pg_namespace nt ON t.typnamespace = nt.oid)
  ON a.atttypid = t.oid
WHERE NOT pg_is_other_temp_schema(nc.oid)
  AND nt.nspname = 'bdr'
  AND t.typname = 'column_timestamps'
  AND NOT a.attisdropped
  AND c.relkind IN ('r', 'v', 'f', 'p');
```
### bdr.column_timestamps_create

This function creates column-level conflict resolution. It is called within
`column_timestamp_enable`.
#### Synopsis

```sql
bdr.column_timestamps_create(p_source cstring, p_timestamp timestamptz)
```
#### Parameters

- `p_source` - The two options are 'current' or 'commit'.
- `p_timestamp` - The timestamp depends on the source chosen: if 'commit',
  then TIMESTAMP_SOURCE_COMMIT; if 'current', then TIMESTAMP_SOURCE_CURRENT.
## DDL Locking

When enabling or disabling column timestamps on a table, the code uses
DDL locking to ensure that there are no pending changes from before the
switch, so that conflicts are only ever seen with timestamps in either both
tuples or neither of them. Otherwise, the code might unexpectedly see
timestamps in the local tuple and NULL in the remote one. It also
ensures that the changes are resolved the same way (column-level or
row-level) on all nodes.
## Current vs Commit Timestamp

An important question is what timestamp to assign to modified columns.

By default, the timestamp assigned to modified columns is the current
timestamp, as if obtained from `clock_timestamp`. This is simple, and
for many cases it is perfectly correct (e.g. when the conflicting rows
modify non-overlapping subsets of columns).

It may, however, have various unexpected effects:
- The timestamp changes during statement execution, so if an `UPDATE`
  affects multiple rows, each will get a slightly different timestamp.
  This means that the effects of concurrent changes may get "mixed" in various
  ways (depending on how exactly the changes performed on different
  nodes interleave); see the sketch after this list.

- The timestamp is unrelated to the commit timestamp, and using it to
  resolve conflicts means that the result is not equivalent to the commit order,
  which means it likely is not serializable.
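The first point can be observed with plain PostgreSQL functions; this small
sketch merely illustrates that `clock_timestamp()` keeps advancing within a
single statement, unlike `statement_timestamp()`:

```sql
-- clock_timestamp() is evaluated per row and keeps moving, so each row sees
-- a slightly different value, while statement_timestamp() stays fixed for
-- the whole statement.
SELECT clock_timestamp() - statement_timestamp() AS drift_from_statement_start
FROM generate_series(1, 3);
```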
Note: We may add statement and transaction timestamps in the future,
which would address issues with mixing effects of concurrent statements or
transactions. Still, neither of these options can ever produce results
equivalent to commit order.
It is also possible to use the actual commit timestamp, although this
feature is currently considered experimental. To use the commit timestamp,
set the last parameter to `true` when enabling column-level conflict
resolution:

```sql
SELECT bdr.column_timestamps_enable('test_table'::regclass, 'cts', true);
```

This can also be disabled using `bdr.column_timestamps_disable`.
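The exact signature of `bdr.column_timestamps_disable` is not shown here;
assuming it mirrors the enable call and takes the table's regclass, a call
would look roughly like this:

```sql
-- Assumed invocation; the argument list is a guess based on the enable
-- example above, not a documented signature.
SELECT bdr.column_timestamps_disable('test_table'::regclass);
```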
Commit timestamps currently have a couple of restrictions that are
explained in the "Limitations" section.
## Inspecting Column Timestamps

The column storing timestamps for modified columns is maintained
automatically by triggers, and must not be modified directly. It may
be useful to inspect the current timestamps value, for example while
investigating how a particular conflict was resolved.

There are three functions for this purpose:
- `bdr.column_timestamps_to_text(bdr.column_timestamps)`

  This function returns a human-readable representation of the timestamp mapping,
  and is used when casting the value to `text`:

  ```sql
  db=# select cts::text from test_table;
                                                  cts
  -----------------------------------------------------------------------------------------------------
   {source: current, default: 2018-09-23 19:24:52.118583+02, map: [2 : 2018-09-23 19:25:02.590677+02]}
  (1 row)
  ```
- `bdr.column_timestamps_to_jsonb(bdr.column_timestamps)`

  This function returns a JSONB representation of the timestamps mapping,
  and is used when casting the value to `jsonb`:

  ```sql
  db=# select jsonb_pretty(cts::jsonb) from test_table;
                    jsonb_pretty
  ---------------------------------------------------
   {                                                +
       "map": {                                     +
           "2": "2018-09-23T19:24:52.118583+02:00"  +
       },                                           +
       "source": "current",                         +
       "default": "2018-09-23T19:24:52.118583+02:00"+
   }
  (1 row)
  ```
- `bdr.column_timestamps_resolve(bdr.column_timestamps, xid)`

  This function updates the mapping with the commit timestamp for the attributes
  modified by the most recent transaction (if it has already committed). This only
  matters when using the commit timestamp. For example, in this case the last
  transaction updated the second attribute (with `attnum = 2`):

  ```sql
  db=# select cts::jsonb from test_table;
                                                                    cts
  ----------------------------------------------------------------------------------------------------------------------------------------
   {"map": {"2": "2018-09-23T19:29:55.581823+02:00"}, "source": "commit", "default": "2018-09-23T19:29:55.581823+02:00", "modified": [2]}
  (1 row)

  db=# select bdr.column_timestamps_resolve(cts, xmin)::jsonb from test_table;
                                                column_timestamps_resolve
  -----------------------------------------------------------------------------------------------------------------------
   {"map": {"2": "2018-09-23T19:29:55.581823+02:00"}, "source": "commit", "default": "2018-09-23T19:29:55.581823+02:00"}
  (1 row)
  ```
## Handling column conflicts using CRDT Data Types
256+
257+
By default, column-level conflict resolution simply picks the value with
258+
a higher timestamp and discards the other one. It is however possible to
259+
reconcile the conflict in different (more elaborate) ways, for example
260+
using CRDT types that allow "merging" the conflicting values without
261+
discarding any information.
262+
263+
While pglogical does not include any such data types, it allows adding
264+
them separately and registering them in a catalog `crdt_handlers`. Aside
265+
from the usual data type functions (input/output, ...) each CRDT type
266+
has to implement a merge function, which takes exactly three arguments
267+
(local value, old remote value, new remote value) and produces a value
268+
merging information from those three values.
269+
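The merge-function contract described above can be sketched as a plain SQL
function; everything here (the name `my_counter_merge`, the `bigint`
representation, the delta arithmetic) is purely illustrative and not part of
the BDR or pglogical API:

```sql
-- Illustrative merge function with the three-argument shape described above:
-- (local value, old remote value, new remote value) -> merged value.
-- For a counter-like CRDT, the remote delta is applied on top of the local
-- value instead of overwriting it.
CREATE FUNCTION my_counter_merge(local_val  bigint,
                                 old_remote bigint,
                                 new_remote bigint)
RETURNS bigint
LANGUAGE sql
IMMUTABLE
AS $$
  SELECT local_val + (new_remote - old_remote);
$$;
```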
## Limitations

- The attributes modified by an `UPDATE` are determined by comparing the
  old and new row in a trigger. This means that if the attribute does
  not change its value, it will not be detected as modified even if it is
  explicitly set. For example, `UPDATE t SET a = a` will not mark `a` as
  modified for any row. Similarly, `UPDATE t SET a = 1` will not mark
  `a` as modified for rows that are already set to `1`.

- For `INSERT` statements, we do not have any old row to compare the new
  one to, so we consider all attributes to be modified and assign them
  a new timestamp. This applies even to columns that were not included
  in the `INSERT` statement and received default values. We could detect
  which attributes have a default value, but it is not possible to decide whether
  it was included automatically or specified explicitly by the user.

  This effectively means column-level conflict resolution does not work
  for `INSERT-INSERT` conflicts (even if the `INSERT` statements specify
  different subsets of columns, because the newer row will have all
  timestamps newer than the older one).
- By treating the columns independently, it is easy to violate constraints
  in a way that would not be possible when all changes happen on the same
  node. Consider, for example, a table like this:

  ```sql
  CREATE TABLE t (id INT PRIMARY KEY, a INT, b INT, CHECK (a > b));
  INSERT INTO t VALUES (1, 1000, 1);
  ```

  ...and assume one node does:

  ```sql
  UPDATE t SET a = 100;
  ```

  ...while another node concurrently does:

  ```sql
  UPDATE t SET b = 500;
  ```

  Each of those updates is valid when executed on the initial row, and
  so will pass on each node. But when replicated to the other node,
  the resulting row violates the `CHECK (a > b)` constraint, and
  replication will stop until the issue is resolved manually.
- The column storing the timestamp mapping is managed automatically. Do not
  specify or override the value in your queries, as doing so may result in
  unpredictable effects (we do ignore the value where possible anyway).

- The timestamp mapping is maintained by triggers, but the order in which
  triggers execute does matter. So if you have custom triggers that modify
  tuples and are executed after the `pgl_clcd_` triggers, the modified
  columns will not be detected correctly.

- When using regular timestamps to order changes/commits, it is possible
  that the conflicting changes have exactly the same timestamp (because
  two or more nodes happened to generate the same timestamp). This risk
  is not unique to column-level conflict resolution, as it may happen
  even for regular row-level conflict resolution; we use node id as a
  tie-breaker in this situation (the higher node id wins), which ensures that
  the same changes are applied on all nodes.

- It is possible that there is a clock skew between different nodes. While it
  may induce somewhat unexpected behaviour (discarding seemingly newer
  changes because the timestamps are inverted), clock skew between nodes can
  be managed using the parameters `bdr.maximum_clock_skew` and
  `bdr.maximum_clock_skew_action`.

- The underlying pglogical subscription must not discard any changes,
  which could easily cause divergent errors (particularly for
  CRDT types). The subscriptions must have `ignore_redundant_updates`
  set to `false` (which is the default).

  Existing groups created with a non-default value for `ignore_redundant_updates`
  can be altered like this:

  ```sql
  SELECT bdr.alter_node_group_config('group', ignore_redundant_updates := false);
  ```
