Commit 8f7cb02

MSC4371: On the elimination of federation transactions.
Signed-off-by: Jason Volk <[email protected]>
1 parent ea0aef0 commit 8f7cb02

File tree

1 file changed

+69
-0
lines changed

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
1+
# MSC4371: On the elimination of federation transactions
2+
3+
Server Specification [v1.16 § 4](https://spec.matrix.org/v1.16/server-server-api/#transactions)
4+
(including all prior versions) defines an envelope structure accompanying a protocol for common message
5+
transport between servers referred to as "transactions." These structures collect messages queued by
6+
an origin for a destination, then transmitted, acknowledged by the destination, and then this process
7+
is repeated with new messages queued by the origin in the interim.
8+
9+
Transactions have existed since the early protocol (circa 2014) when HTTP/1.1 was the common standard
10+
of transport. In HTTP/1, requests are processed sequentially within each connection. Multiple
11+
connections may be used for concurrent processing but a federation server will already be
12+
communicating with many destinations; minimizing connections between hosts is essential. Pipelining may
13+
also be used to hide latency but without explicit support by HTTP/1 there are many complications;
14+
protocol designers instead lean toward other solutions. From this environment federation transactions
15+
arose.
16+
17+
Ironically transactions succumb to the same shortcomings as HTTP/1 itself. The Matrix protocol
18+
specifies that only one transaction can be in flight at a time. The round-trip time for successful
19+
acknowledgement must be paid before new information even begins to transmit. This introduces a
20+
head-of-line-blocking effect, often paralyzing communication for any number of reasons such as
21+
implementation errors, denial-of-service exploitation, or routine processing in which slow network
22+
requests are often required to resolve a message to acceptance. During these events messages will
23+
continue to queue on an origin. Eventually this queue exceeds the limits for a single transaction thus
24+
requiring multiple rounds of transactions. These queuing events have been known to take days to
25+
resolve.
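
The blocking effect described above can be illustrated with a toy timing model. This is only a sketch under simplifying assumptions: the function names, the fixed round-trip time, and the idea that independent sends never contend with each other are all hypothetical and not taken from the specification.

```python
def serial_completion(rtt_ms, processing_ms):
    """Acknowledgement times when only one transaction may be in flight.

    Each batch must wait for every earlier batch to be acknowledged,
    so one slow batch delays everything queued behind it.
    """
    done, clock = [], 0
    for p in processing_ms:
        clock += rtt_ms + p  # pay a full round trip plus processing per batch
        done.append(clock)
    return done


def independent_completion(rtt_ms, processing_ms):
    """Acknowledgement times when every message is sent independently.

    Assumes (hypothetically) unlimited concurrency and no shared bottleneck.
    """
    return [rtt_ms + p for p in processing_ms]


# One message needs 30 s of processing; two unrelated messages need 10 ms each.
stalled = serial_completion(100, [30_000, 10, 10])
free = independent_completion(100, [30_000, 10, 10])
```

In this model the two fast messages are acknowledged after 110 ms when sent independently, but only after roughly 30 seconds when queued behind the slow one.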
26+
27+
Many messages bundled in these tranches often have no dependency on each other. For example, the
28+
primary context division in Matrix is the Room; rooms have no specified interdependency, so "transacting"
29+
messages from different rooms at the same time serves no purpose. It is purely a hazard. Worse, the
30+
primary unit of messaging for a room, the PDU, contains its own sequencing and reliability mechanism
31+
allowing it to exist fully independent of any transaction—as it virtually always does in every other
32+
context where PDUs are found. Sequencing PDUs in separate transactions is simply not necessary;
33+
purely a hazard.
34+
35+
The specification states: "A Transaction is meaningful only to the pair of homeservers that exchanged
36+
it; they are not globally-meaningful." This limited use and isolation eases our task of reducing or
37+
eliminating transactions entirely.
38+
39+
### Proposal
40+
41+
We specify `PUT /_matrix/federation/v2/send/{ EventId | EduId }` where events are sent
42+
indiscriminately. An `EduId` is an arbitrary string which MUST NOT be prefixed by `$`.
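
As a minimal sketch of how a sender or receiver might act on this rule, the following distinguishes the two identifier kinds by the `$` prefix and builds the request path. The helper names `classify` and `send_path` are hypothetical, and percent-encoding the identifier in the path is an assumption; the proposal does not state an encoding.

```python
from urllib.parse import quote

SEND_PREFIX = "/_matrix/federation/v2/send/"


def classify(ident: str) -> str:
    """Return "pdu" for an EventId (prefixed by `$`), otherwise "edu"."""
    if not ident:
        raise ValueError("identifier must not be empty")
    return "pdu" if ident.startswith("$") else "edu"


def send_path(ident: str) -> str:
    """Build the PUT path for one event.

    Assumption: the identifier is percent-encoded as a single path segment.
    """
    return SEND_PREFIX + quote(ident, safe="")
```

Because an `EduId` MUST NOT begin with `$`, the first character alone is enough to route the request to PDU or EDU handling.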
43+
44+
##### Unstable Prefix
45+
46+
`PUT /_matrix/federation/unstable/net.zemos.send/{ EventId | EduId }`
47+
48+
### Discussion
49+
50+
When used over modern HTTP/2 only a single connection is required to conduct an arbitrary number of
51+
concurrent transmissions. HTTP/1 systems can very safely utilize pipelining considering the
52+
idempotency of named PUT requests.
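
The idempotency argument can be sketched as a receiver that keys each event by the identifier in the request path, so a retried or pipelined duplicate has no further effect. The `Receiver` class and its `put` method are hypothetical illustrations, not part of the proposal or of any homeserver implementation.

```python
class Receiver:
    """Toy receiver that accepts each named event at most once."""

    def __init__(self):
        self.seen = {}  # identifier -> body of the first accepted request

    def put(self, ident, body):
        """Handle a PUT for `ident`; return True only on first acceptance."""
        if ident in self.seen:
            return False  # replay: state is unchanged, safe to acknowledge again
        self.seen[ident] = body
        return True


receiver = Receiver()
first = receiver.put("$abc123", {"type": "m.room.message"})
retry = receiver.put("$abc123", {"type": "m.room.message"})
```

Since the retry leaves the receiver's state unchanged, a sender that is unsure whether a request arrived can simply send it again.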
53+
54+
55+
### Alternatives
56+
57+
A possible alternative would be to keep the transaction structure while amending the protocol
58+
semantics for requisite concurrency in the modern age. Nevertheless the transaction structure has some
59+
defects for optimal network software. For example, network software benefits from transmitting the
60+
same message to multiple destinations without recrafting specific versions for each destination.
61+
62+
### Potential Issues
63+
64+
Some EDUs can exist naturally outside of transactions, such as read receipts, which target a specific
65+
`event_id`, can be replayed, and can be received in any order. Nevertheless a wider analysis of
66+
transmitting EDUs indiscriminately will have to be considered and some additional sequencing will
67+
likely be necessary in their payloads.
68+
69+
### Security Considerations
