Tx Request/Response protocol #66
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
| | | | ||
| -------------------- | --------------------------------------------------------------------------------------------- | | ||
| Owners | @PhilWindle | | ||
| Approvers | @just-mitch @alexghr @spalladino @Maddiaa0 | | ||
| Target Approval Date | 2025-06-10 | | ||
|
||
|
||
## Executive Summary | ||
|
||
This design attempts to define a more effective protocol for the retrieval of missing transactions from the network. | ||
|
||
|
||
## Introduction | ||
|
||
The block building protocol has an outstanding problem: there is no data availability solution. Nodes' mempools will naturally diverge, and it is likely, particularly at higher transaction throughputs, that validators and provers won't have access to the transactions included in blocks proposed by the block proposer. | ||
|
||
The codebase currently contains a request/response mechanism for retrieving these transactions, but it has not proven very effective. This design aims to improve on it. | ||
|
||
## Transaction Lifecycle | ||
|
||
For the purpose of network participants, transactions have a rough lifecycle. | ||
|
||
1. The transaction is stored as pending within the local mempool. During this time it may be subject to eviction, based on rules local to the node. We will refer to this state as PENDING. | ||
2. The transaction is included in a block proposal. At this point the transaction should not be evicted as it may be required for block validation. We will refer to this state as PROPOSED. | ||
3. The transaction is included in a mined block. All mined transactions are stored for a period of time. We will refer to this state as MINED. | ||
4. The transaction's block is proven or pruned. We will refer to this state as EXPIRED. | ||
|
||
Block proposals follow a similar lifecycle, except that proposals have no PENDING state. | ||
|
||
If a block proposal does not result in a mined block, the transactions within it will revert to PENDING. | ||
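The lifecycle above can be read as a small state machine. The following is a minimal sketch, assuming only the transitions named in the text; the identifiers `TxState`, `allowedTransitions`, `canTransition` and `isSubjectToEviction` are hypothetical and do not come from the codebase.

```typescript
// Hypothetical names; the design only defines the four state labels.
type TxState = "PENDING" | "PROPOSED" | "MINED" | "EXPIRED";

// Transitions named in the lifecycle description. Other transitions may exist
// in practice; this table is an assumption for illustration.
const allowedTransitions: Record<TxState, TxState[]> = {
  PENDING: ["PROPOSED"], // included in a block proposal
  PROPOSED: ["MINED", "PENDING"], // PENDING: the proposal never became a mined block
  MINED: ["EXPIRED"], // the block was proven or pruned
  EXPIRED: [],
};

function canTransition(from: TxState, to: TxState): boolean {
  return allowedTransitions[from].includes(to);
}

// Assumption: only PENDING transactions are subject to local eviction rules.
function isSubjectToEviction(state: TxState): boolean {
  return state === "PENDING";
}
```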
|
||
## Requirements | ||
|
||
Validators require that transactions can be retrieved quickly: the transactions must be retrieved and the block re-executed in time for an attestation to be produced. Provers also require transactions for re-execution, but their timeliness requirements are less strict, as they essentially have 1 - 2 epochs to produce the required proofs. | ||
|
||
|
||
## Current Approach | ||
|
||
Every node on the network subscribes to block proposals. Upon receiving a block proposal, the node will instruct its transaction pool to mark the transaction hashes as PROPOSED and therefore non-evictable. The node will then attempt to request any missing transactions from its peers on the network. All PROPOSED hashes are removed from the pool when any block is mined. | ||
|
||
The node makes a number of 'rounds' of requests. In each round it selects a random subset of peers and asks each peer for a subset of the missing transactions. Timeouts are specified for each round and globally. The peer that propagated the proposal to the node (note: not the proposer of the block) is always included in the peer subset. | ||
|
||
The timeout values are arbitrarily set to 2 seconds and 8 seconds currently. The max number of peers selected for each round is a function of the number of transactions required. | ||
|
||
A request for a transaction is singular, 1 tx at a time. Requests to a given peer are performed serially so at any given time a single peer is only asked for a single transaction. Each request is a unique dial, request/response and hang-up operation. | ||
|
||
## Proposed approach | ||
|
||
As before, every node on the network subscribes to block proposals and marks transaction hashes so as not to evict those transactions from the pool. Transaction hashes remain marked until the end of the slot after the slot in which they were PROPOSED. This avoids a race condition where a proposal for slot n + 1 arrives before a node has synced the block for slot n. Currently, syncing that block would remove the protection for the transactions in the new block proposal. | ||
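A minimal sketch of this marking rule, assuming transaction hashes are tracked as hex strings and that the node knows the current slot number. The class `ProposedTxProtection` and its method names are hypothetical, not part of the existing pool.

```typescript
// Hypothetical helper; the class and method names are not from the design.
class ProposedTxProtection {
  // txHash -> last slot (inclusive) during which the tx must not be evicted
  private protectedUntil = new Map<string, number>();

  // Protect the hashes until the end of the slot after the proposal's slot.
  markProposed(txHashes: string[], proposedSlot: number): void {
    for (const hash of txHashes) {
      const existing = this.protectedUntil.get(hash) ?? -1;
      this.protectedUntil.set(hash, Math.max(existing, proposedSlot + 1));
    }
  }

  isProtected(txHash: string, currentSlot: number): boolean {
    const until = this.protectedUntil.get(txHash);
    return until !== undefined && currentSlot <= until;
  }

  // Driven by slot boundaries only; syncing a block does not clear anything.
  prune(currentSlot: number): void {
    this.protectedUntil.forEach((until, hash) => {
      if (currentSlot > until) this.protectedUntil.delete(hash);
    });
  }
}
```

Because protection is keyed on slot numbers rather than on block sync events, a proposal for slot n + 1 that arrives before the block for slot n is synced keeps its transactions protected.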
|
||
The `TxCollector` module will be modified to become a longer-running task that can be thought of as permanently attempting to retrieve transactions from the network. It will dial and hold connections/streams to all connected peers. As stream/connection events occur, it will re-attempt to establish connectivity and maintain available streams. | ||
|
||
Upon receipt of a block proposal, the `TxCollector` will be notified of the proposal and the transactions that need to be retrieved. It will continue to perform a series of message exchanges with all peers until the transactions are no longer required. | ||
|
||
Reasons for the transactions no longer being required are: | ||
|
||
1. The proposal never made it into a mined block and the following slot has passed. The transactions have reverted to PENDING. | ||
2. The block and its transactions become EXPIRED. | ||
3. The transactions have been retrieved. | ||
|
||
### Block Tx Request/Response | ||
|
||
Instead of randomly selecting peers to query with random tx requests, the node will make frequent message exchanges with all of its peers. These messages will be small and sent over previously established streams, reducing latency.
Review comment: Is it safe to blast all peers so frequently?
Reply: Perhaps not. Maybe we should stagger it. We could send to a 1/4 of peers every 300ms or something, rotating of course. Would need to think about it. |
||
|
||
We introduce two intervals, the `proposedRequestInterval` and the `minedRequestInterval`, typically say 500ms and 2000ms respectively. | ||
|
||
Every `proposedRequestInterval`, the node evaluates which block proposals it still requires transactions for and when it last enquired about each proposal. Proposals that are PROPOSED are queried at this interval (provided txs are still required); proposals that are MINED are queried at the less frequent `minedRequestInterval`. | ||
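The evaluation described above could look roughly like the sketch below. It assumes the node tracks, per proposal, its status, the number of transactions still missing and the time it was last queried; `TrackedProposal`, `RequestIntervals` and `proposalsDueForQuery` are hypothetical names, and times are in milliseconds.

```typescript
// Hypothetical bookkeeping types; only the two interval names come from the design.
type ProposalStatus = "PROPOSED" | "MINED";

interface TrackedProposal {
  blockHash: string; // hex encoded
  slotNumber: number;
  status: ProposalStatus;
  missingTxCount: number;
  lastQueriedAt: number | undefined; // ms timestamp, undefined if never queried
}

interface RequestIntervals {
  proposedRequestInterval: number; // e.g. 500
  minedRequestInterval: number; // e.g. 2000
}

// Called once per proposedRequestInterval tick. Returns the proposals that
// should be included in the next BlockTxRequests message.
function proposalsDueForQuery(
  tracked: TrackedProposal[],
  now: number,
  intervals: RequestIntervals,
): TrackedProposal[] {
  return tracked.filter((p) => {
    if (p.missingTxCount === 0) return false; // nothing left to retrieve
    if (p.lastQueriedAt === undefined) return true; // never asked yet
    const interval =
      p.status === "PROPOSED" ? intervals.proposedRequestInterval : intervals.minedRequestInterval;
    return now - p.lastQueriedAt >= interval;
  });
}
```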
|
||
Peers are queried using `BlockTxRequests` messages. | ||
|
||
``` | ||
type BlockTxRequest = { | ||
slotNumber: number, | ||
blockHash: Buffer, // 32 byte hash of the proposed block header | ||
} | ||
|
||
type BlockTxRequests = { | ||
requests: BlockTxRequest[] | ||
} | ||
``` | ||
|
||
Upon receipt of a `BlockTxRequests` the peer will respond with a `BlockTxResponses`. | ||
|
||
``` | ||
type BlockTxResponse = { | ||
slotNumber: number, | ||
blockHash: Buffer, // 32 byte hash of the proposed block | ||
blockAvailable: boolean, // Whether the peer has the block available
// Review comment on lines +87 to +88: Is there a reliable way to know which message the peer is replying to? If so, I'd remove these two fields, just to make messages smaller. | ||
// Reply: Hmm, not sure. To be honest though, even just a 32-bit message ID on a per-connection basis would suffice. |
||
txIndices: Buffer, // BitVector indicating which txs from the proposal are available at the peer | ||
} | ||
|
||
type BlockTxResponses = { | ||
responses: BlockTxResponse[] | ||
} | ||
``` | ||
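The `txIndices` bit vector can be packed and unpacked as sketched below. The design does not specify a bit ordering, so this sketch assumes that transaction index `i` maps to bit `i % 8` (least significant bit first) of byte `floor(i / 8)`. The function names are hypothetical, and `Uint8Array` is used in place of `Buffer` (Node's `Buffer` is a `Uint8Array` subclass) so that the example has no runtime dependencies.

```typescript
// Assumption: tx index i is stored in bit (i % 8) of byte floor(i / 8), LSB first.
// The design does not specify the ordering; both peers would need to agree on one.
function encodeTxIndices(indices: number[], txCount: number): Uint8Array {
  const bits = new Uint8Array(Math.ceil(txCount / 8));
  for (const i of indices) {
    if (!Number.isInteger(i) || i < 0 || i >= txCount) {
      throw new RangeError(`tx index ${i} is outside the proposal (0..${txCount - 1})`);
    }
    const byte = i >> 3;
    bits[byte] = (bits[byte] ?? 0) | (1 << (i & 7));
  }
  return bits;
}

// Returns the tx indices whose bit is set. Bits beyond txCount are ignored, and
// a vector that is shorter than expected is treated as having those bits unset.
function decodeTxIndices(bits: Uint8Array, txCount: number): number[] {
  const indices: number[] = [];
  for (let i = 0; i < txCount; i++) {
    const byte = bits[i >> 3] ?? 0;
    if ((byte & (1 << (i & 7))) !== 0) indices.push(i);
  }
  return indices;
}
```

With this layout a proposal of `n` transactions needs `ceil(n / 8)` bytes per response, which keeps the frequently exchanged messages small.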
|
||
The frequent exchange of these messages enables nodes to build up mappings of where in their sets of peers transactions are available. These mappings will change rapidly as peers also implement the same transaction retrieval protocol. | ||
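One possible shape for such a mapping is sketched here. It assumes peers are identified by a string ID, block hashes are hex strings, and the `txIndices` bit vector has already been decoded into a list of indices. `TxAvailabilityMap` and its methods are hypothetical names. Each new response from a peer overwrites that peer's previous entry for the block, since the mappings are expected to change rapidly.

```typescript
// Hypothetical structure; not part of the existing codebase.
class TxAvailabilityMap {
  // blockHash (hex) -> peerId -> tx indices the peer reported as available
  private byBlock = new Map<string, Map<string, Set<number>>>();

  // Record the latest response from a peer, replacing any earlier one.
  recordResponse(peerId: string, blockHash: string, availableIndices: number[]): void {
    let peers = this.byBlock.get(blockHash);
    if (peers === undefined) {
      peers = new Map<string, Set<number>>();
      this.byBlock.set(blockHash, peers);
    }
    peers.set(peerId, new Set(availableIndices));
  }

  // Which peers claim to hold the given tx of the given block?
  peersWithTx(blockHash: string, txIndex: number): string[] {
    const result: string[] = [];
    this.byBlock.get(blockHash)?.forEach((indices, peerId) => {
      if (indices.has(txIndex)) result.push(peerId);
    });
    return result;
  }

  // Drop everything known about a peer, e.g. when it disconnects.
  removePeer(peerId: string): void {
    this.byBlock.forEach((peers) => peers.delete(peerId));
  }

  // Drop a block once its transactions are no longer required.
  removeBlock(blockHash: string): void {
    this.byBlock.delete(blockHash);
  }
}
```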
|
||
### Tx Request/Response | ||
|
||
Transactions are requested in batches using the `TxRequest` message. Additionally, `TxRequests` messages are used to request transactions from multiple blocks in a single request. | ||
|
||
``` | ||
type TxRequest = { | ||
slotNumber: number, | ||
blockHash: Buffer, // 32 byte hash of the proposed block | ||
txIndices: Buffer, // BitVector indicating which txs from the proposal are requested | ||
} | ||
// Review comment on lines +104 to +108: Should we allow for batching multiple TxRequests, in case a node needs txs from more than a single block? | ||
// Reply: Yes we should, will update. |
||
``` | ||
|
||
``` | ||
type TxRequests = { | ||
requests: TxRequest[] | ||
} | ||
``` | ||
|
||
Using the mapping built from the Block Tx Request/Response message exchange, the node requests transactions from selected peers according to the following rules: | ||
|
||
1. Only make a single request to a peer at a time. | ||
2. Limit the number of transactions requested in a single request to a configurable `batchSize`. | ||
3. Allocate txs to peers so that all txs are retrieved in the minimum number of requests while asking for the minimum number of txs from any given peer. | ||
|
||
Only making a single request to a peer with a limited number of transactions prevents a node from simply requesting all available transactions from the first peer to respond. Instead we should aim to spread the load as much as possible. | ||
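The allocation rules above can be approximated with a greedy heuristic. The sketch below is not an optimal solver and is not the design's prescribed algorithm: it handles the scarcest transactions first and gives each one to the least loaded peer that reported holding it, subject to `batchSize`. The function name, the `busyPeers` parameter (peers with a request already in flight) and the return shape are assumptions for illustration, and the `missing` list is assumed to contain no duplicates.

```typescript
// Greedy sketch: one request per peer, at most batchSize txs per request.
function allocateTxsToPeers(
  missing: number[], // tx indices still required for one block
  availability: Map<string, Set<number>>, // peerId -> tx indices the peer reported
  batchSize: number,
  busyPeers: Set<string> = new Set<string>(), // peers with a request in flight
): { assignments: Map<string, number[]>; unassigned: number[] } {
  // Invert the availability map: tx index -> peers that can serve it.
  const holders = new Map<number, string[]>();
  for (const tx of missing) holders.set(tx, []);
  availability.forEach((indices, peer) => {
    if (busyPeers.has(peer)) return;
    indices.forEach((tx) => holders.get(tx)?.push(peer));
  });

  // Scarcest txs first, so they are not crowded out by widely available ones.
  const order = missing
    .slice()
    .sort((a, b) => (holders.get(a) ?? []).length - (holders.get(b) ?? []).length);

  const assignments = new Map<string, number[]>();
  const unassigned: number[] = [];
  for (const tx of order) {
    let best: string | undefined;
    let bestLoad = Infinity;
    for (const peer of holders.get(tx) ?? []) {
      const load = (assignments.get(peer) ?? []).length;
      if (load < batchSize && load < bestLoad) {
        best = peer;
        bestLoad = load;
      }
    }
    if (best === undefined) {
      unassigned.push(tx); // retry once more availability data arrives
      continue;
    }
    const batch = assignments.get(best) ?? [];
    batch.push(tx);
    assignments.set(best, batch);
  }
  return { assignments, unassigned };
}
```

Each entry in `assignments` would become one `TxRequest` for that peer; anything in `unassigned` waits for a later round.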
|
||
## Protections | ||
|
||
Nodes will wish to protect themselves from malicious or faulty peers through peer-scoring. Punishments should be applied for: | ||
|
||
1. Providing invalid transactions. | ||
2. Providing transactions that do not match the requested block hash. | ||
3. Making `BlockTxRequest` requests for block proposals that do not exist, accounting for the fact that you may be slightly behind/ahead of the peer. | ||
4. Making too many requests in a given period of time. | ||
5. Making duplicate requests for transactions. | ||
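As an illustration of how such punishments might be tracked, the sketch below combines a per-peer score with a sliding-window request limit. The offence names, penalty values, window size and the `PeerScoreboard` class are all hypothetical; the design does not specify them, and a real implementation would presumably integrate with the node's existing peer-scoring mechanism.

```typescript
// Hypothetical offence labels mirroring the list above.
type Offence =
  | "INVALID_TX"
  | "WRONG_BLOCK_HASH"
  | "UNKNOWN_PROPOSAL"
  | "TOO_MANY_REQUESTS"
  | "DUPLICATE_REQUEST";

// Illustrative penalty values only; the design does not define any.
const PENALTIES: Record<Offence, number> = {
  INVALID_TX: 10,
  WRONG_BLOCK_HASH: 10,
  UNKNOWN_PROPOSAL: 2,
  TOO_MANY_REQUESTS: 5,
  DUPLICATE_REQUEST: 1,
};

class PeerScoreboard {
  private scores = new Map<string, number>();
  private recentRequests = new Map<string, number[]>();
  private maxRequests: number;
  private windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  score(peerId: string): number {
    return this.scores.get(peerId) ?? 0;
  }

  penalise(peerId: string, offence: Offence): void {
    this.scores.set(peerId, this.score(peerId) - PENALTIES[offence]);
  }

  // Returns false (and penalises the peer) if the request exceeds the limit.
  recordRequest(peerId: string, now: number): boolean {
    const recent = (this.recentRequests.get(peerId) ?? []).filter(
      (t) => now - t < this.windowMs,
    );
    if (recent.length >= this.maxRequests) {
      this.recentRequests.set(peerId, recent);
      this.penalise(peerId, "TOO_MANY_REQUESTS");
      return false;
    }
    recent.push(now);
    this.recentRequests.set(peerId, recent);
    return true;
  }
}
```

The tolerance for peers that are slightly ahead or behind (item 3) is not modelled here; one option would be to skip the `UNKNOWN_PROPOSAL` penalty when the requested slot is within a small distance of the local slot.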
|
Review comment: Aren't they removed when the block is proven, not mined, so the tx remains available for provers?
Reply: Not currently, no. That set of hashes I have described is just used to prevent pending tx eviction. Mined txs remain until their block is finalised.