|
| 1 | +# MSC4371: On the elimination of federation transactions |
| 2 | + |
| 3 | +Server Specification [v1.16 § 4](https://spec.matrix.org/v1.16/server-server-api/#transactions) |
| 4 | +(including all prior versions) defines an envelope structure, and an accompanying protocol, for common |
| 5 | +message transport between servers, referred to as "transactions." A transaction collects the messages |
| 6 | +queued by an origin for a destination. The origin transmits it, the destination acknowledges it, and |
| 7 | +the process repeats with the messages the origin queued in the interim. |
| 8 | + |
| 9 | +Transactions have existed since the early protocol (circa 2014) when HTTP/1.1 was the common standard |
| 10 | +of transport. In HTTP/1, requests are processed sequentially within each connection. Multiple |
| 11 | +connections may be used for concurrent processing, but a federation server will already be |
| 12 | +communicating with many destinations; minimizing connections between hosts is essential. Pipelining may |
| 13 | +also be used to hide latency but without explicit support by HTTP/1 there are many complications; |
| 14 | +protocol designers instead lean toward other solutions. From this environment federation transactions |
| 15 | +arose. |
| 16 | + |
| 17 | +Ironically, transactions succumb to the same shortcomings as HTTP/1 itself. The Matrix protocol |
| 18 | +specifies that only one transaction can be in flight at a time. The round-trip time for successful |
| 19 | +acknowledgement must be paid before new information even begins to transmit. This introduces a |
| 20 | +head-of-line blocking effect that can paralyze communication for any number of reasons: |
| 21 | +implementation errors, denial-of-service exploitation, or ordinary processing in which slow network |
| 22 | +requests are needed before a message can be accepted. During these events messages |
| 23 | +continue to queue on the origin. Eventually the queue exceeds the limits for a single transaction, |
| 24 | +requiring multiple rounds of transactions. These queuing events have been known to take days to |
| 25 | +resolve. |
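
The cost of draining a backlog one transaction at a time can be estimated with a rough sketch. It assumes the 50-PDU limit per transaction in the current specification, a fixed round trip per transaction (including the destination's processing time), and no new messages arriving while the queue drains. The function name and the example figures are illustrative only.

```python
import math

# Per-transaction PDU limit in the current server-server specification.
MAX_PDUS_PER_TRANSACTION = 50


def sequential_drain_seconds(queued_pdus: int, round_trip_seconds: float) -> float:
    """Estimate the time to drain a backlog when only one transaction may be in flight."""
    # Each round carries at most MAX_PDUS_PER_TRANSACTION PDUs and must be
    # acknowledged before the next round can start.
    rounds = math.ceil(queued_pdus / MAX_PDUS_PER_TRANSACTION)
    return rounds * round_trip_seconds
```

Under these assumptions a backlog of 10,000 PDUs at 2 seconds per round needs 200 sequential rounds. Any messages queued in the meantime extend the estimate further.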
| 26 | + |
| 27 | +Many messages bundled in these tranches have no dependency on each other. For example, the |
| 28 | +primary context division in Matrix is the room, and rooms have no specified interdependency, so |
| 29 | +"transacting" messages from different rooms at the same time serves no purpose. Worse, the |
| 30 | +primary unit of messaging for a room, the PDU, contains its own sequencing and reliability mechanism, |
| 31 | +allowing it to exist fully independent of any transaction, as it virtually always does in every other |
| 32 | +context where PDUs are found. Sequencing PDUs in separate transactions is simply not necessary; it is |
| 33 | +purely a hazard. |
| 34 | + |
| 35 | +The specification states: "A Transaction is meaningful only to the pair of homeservers that exchanged |
| 36 | +it; they are not globally-meaningful." This limited scope and isolation eases the task of reducing or |
| 37 | +eliminating transactions entirely. |
| 38 | + |
| 39 | +### Proposal |
| 40 | + |
| 41 | +We specify `PUT /_matrix/federation/v2/send/{ EventId | EduId }`, where PDUs and EDUs alike are |
| 42 | +sent individually, one per request. An `EduId` is an arbitrary string which MUST NOT be prefixed by `$`. |
| 43 | + |
| 44 | +##### Unstable Prefix |
| 45 | + |
| 46 | + `PUT /_matrix/federation/unstable/net.zemos.send/{ EventId | EduId }` |
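
As a minimal sketch of how a sender might build the request path, the helper below tells an `EventId` from an `EduId` by the leading `$` and appends the identifier to either the proposed stable path or the unstable prefix. The function names, the rejection of empty identifiers, and the percent-encoding of the path segment are assumptions made for illustration and are not specified by this proposal.

```python
from urllib.parse import quote

STABLE_PREFIX = "/_matrix/federation/v2/send"
UNSTABLE_PREFIX = "/_matrix/federation/unstable/net.zemos.send"


def classify(ident: str) -> str:
    """Return "pdu" for an EventId (leading `$`), otherwise "edu"."""
    if not ident:
        # Assumption: an empty identifier is never valid.
        raise ValueError("identifier must not be empty")
    return "pdu" if ident.startswith("$") else "edu"


def send_path(ident: str, stable: bool = False) -> str:
    """Build the request path for a single PDU or EDU."""
    classify(ident)  # validates the identifier
    prefix = STABLE_PREFIX if stable else UNSTABLE_PREFIX
    # Assumption: the identifier is percent-encoded as one path segment.
    return f"{prefix}/{quote(ident, safe='')}"
```

Because the `$` prefix is reserved for event IDs, a receiver can tell the two kinds of message apart from the path alone, before parsing the body.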
| 47 | + |
| 48 | +### Discussion |
| 49 | + |
| 50 | +When used over modern HTTP/2, only a single connection is required to conduct an arbitrary number of |
| 51 | +concurrent transmissions. HTTP/1 systems can safely utilize pipelining, given the |
| 52 | +idempotency of named PUT requests. |
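
The idempotency argument can be illustrated with a small receiver sketch: because every request names its message in the path, a retried or replayed PUT can be recognized and acknowledged without being processed again. The class, its in-memory storage, and the choice to key on the pair of origin and identifier are assumptions for illustration. The sketch also assumes that an origin reuses the same `EduId` when it retries an EDU.

```python
class Receiver:
    """Toy receiver that processes each named message at most once."""

    def __init__(self) -> None:
        self.seen: set[tuple[str, str]] = set()
        self.processed: list[tuple[str, str]] = []

    def put(self, origin: str, ident: str, body: dict) -> int:
        """Handle one PUT and return an HTTP-style status code."""
        key = (origin, ident)
        if key in self.seen:
            # Replay of a message already accepted: acknowledge, do nothing.
            return 200
        self.seen.add(key)
        self.processed.append(key)
        return 200
```

With this behavior a sender that loses a response on a pipelined HTTP/1 connection can resend the same request without risk of the message being applied twice.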
| 53 | + |
| 54 | + |
| 55 | +### Alternatives |
| 56 | + |
| 57 | +A possible alternative would be to keep the transaction structure while amending the protocol |
| 58 | +semantics to allow the concurrency required today. Nevertheless, the transaction structure itself is |
| 59 | +poorly suited to efficient network software. For example, network software benefits from transmitting the |
| 60 | +same message to multiple destinations without recrafting specific versions for each destination. |
| 61 | + |
| 62 | +### Potential Issues |
| 63 | + |
| 64 | +Some EDUs can exist naturally outside of transactions, such as read receipts, which target a specific |
| 65 | +`event_id`, can be replayed, and can be received in any order. Nevertheless, a wider analysis of |
| 66 | +transmitting EDUs indiscriminately will be needed, and some additional sequencing will |
| 67 | +likely be necessary in their payloads. |
| 68 | + |
| 69 | +### Security Considerations |