Tx Request/Response protocol #66
Conversation
Proposal looks good, but I can't shake the feeling that we're reimplementing a variant of the IHAVE/IWANT messages from gossipsub as discussed here. Though I'm not sure whether it'd be easier to reimplement this ourselves or to piggyback on an existing protocol that's not exactly designed for this.
## Current Approach

Every node on the network subscribes to block proposals. Upon receiving a block proposal, the node will instruct its transaction pool to mark the transaction hashes as PROPOSED, non-evictable. The node will then attempt to request any missing transactions from its peers on the network. All PROPOSED hashes are removed from the pool when any block is mined.
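As an illustrative sketch of that bookkeeping (the `TxPool` shape, the `TxStatus` enum, and the method names below are hypothetical, not the actual implementation):

```ts
// Hypothetical sketch of the pool-side bookkeeping described above.
type TxHash = string;

enum TxStatus {
  Pending,  // evictable under normal pool pressure
  Proposed, // referenced by a block proposal; must not be evicted
}

class TxPool {
  private statuses = new Map<TxHash, TxStatus>();

  /** Called when a block proposal arrives: pin every referenced tx. */
  markAsProposed(hashes: TxHash[]): TxHash[] {
    const missing: TxHash[] = [];
    for (const hash of hashes) {
      if (this.statuses.has(hash)) {
        this.statuses.set(hash, TxStatus.Proposed); // now non-evictable
      } else {
        missing.push(hash); // to be requested from peers
      }
    }
    return missing;
  }

  /** Called when any block is mined: release all PROPOSED pins. */
  clearProposed(): void {
    for (const [hash, status] of this.statuses) {
      if (status === TxStatus.Proposed) {
        this.statuses.set(hash, TxStatus.Pending);
      }
    }
  }
}
```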
> All PROPOSED hashes are removed from the pool when any block is mined
Aren't they removed when the block is proven, not mined, so the tx remains available for provers?
Not currently, no. That set of hashes I have described is just used to prevent pending tx eviction. Mined txs remain until their block is finalised.
```ts
type TxRequest = {
  slotNumber: number,
  blockHash: Buffer, // 32 byte hash of the proposed block
  txIndices: Buffer, // BitVector indicating which txs from the proposal are requested
}
```
Should we allow for batching multiple TxRequests, in case a node needs txs from more than a single block?
Yes we should, will update.
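For illustration, the batched form might look something like this (a sketch only; the wrapper type and field names are assumptions pending the actual update):

```ts
// Hypothetical batched form: one message can request txs from several proposals.
type BlockTxRequest = {
  slotNumber: number,
  blockHash: Buffer, // 32 byte hash of the proposed block
  txIndices: Buffer, // BitVector indicating which txs from the proposal are requested
};

type BatchedTxRequest = {
  requests: BlockTxRequest[], // one entry per block the node is missing txs for
};
```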
```ts
blockHash: Buffer, // 32 byte hash of the proposed block
blockAvailable: boolean, // Whether the peer has the block available
```
Is there a reliable way to know which msg the peer is replying to? If so, I'd remove these two fields, just to make messages smaller.
Hmm, not sure. To be honest though, even just a 32-bit message ID on a per-connection basis would suffice.
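As a sketch of that idea (the envelope shape and counter below are assumptions, not an agreed design):

```ts
// Hypothetical envelope: correlate requests and responses with a
// per-connection 32-bit message ID instead of echoing blockHash back.
type Envelope<T> = {
  messageId: number, // unique per connection, wraps at 2^32
  payload: T,
};

let nextMessageId = 0;

function wrap<T>(payload: T): Envelope<T> {
  nextMessageId = (nextMessageId + 1) >>> 0; // keep within 32 bits
  return { messageId: nextMessageId, payload };
}
```

A responder would echo `messageId` back, letting the requester match the response to its pending request without the extra fields.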
### Block Tx Request/Response

Instead of randomly selecting peers to query with random tx requests, the node will make frequent message exchanges with all of its peers. These messages will be small and sent over previously established streams, reducing latency.
> all of its peers
Is it safe to blast all peers so frequently?
Perhaps not. Maybe we should stagger it: we could send to a quarter of our peers every 300ms or something, rotating of course. Would need to think about it.
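Something along these lines, perhaps (a sketch only; the 300ms cadence and quarter-sized slices are just the numbers floated above):

```ts
// Hypothetical stagger: each tick, exchange with a rotating quarter of peers.
function startStaggeredExchange<P>(peers: P[], sendExchange: (peer: P) => void) {
  let offset = 0;
  setInterval(() => {
    if (peers.length === 0) return;
    const sliceSize = Math.max(1, Math.ceil(peers.length / 4));
    for (let i = 0; i < sliceSize; i++) {
      sendExchange(peers[(offset + i) % peers.length]);
    }
    offset = (offset + sliceSize) % peers.length;
  }, 300); // every 300ms, so each peer is contacted roughly every 1.2s
}
```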
Essentially yes, we are to an extent. I'd be hesitant to try and somehow hook into gossipsub though. It works quite a bit differently, as it's push-based, not pull-based. Also, it could be a headache upgrading libp2p etc. if we modified or extended it. But I'm open to suggestions; I haven't done any investigation into the feasibility of adapting it.
I agree with @spalladino's statement:
I think we should find a way to build on top of
But we'd be forking gossipsub, which isn't updated often, and updates don't look too big (history). My issues with this approach are:
On the other hand, leveraging
Downsides of my proposal (as I see them atm.):
One thing I wanted to bring up against using the IHAVE msgs I had proposed before: Ethereum's consensus layer uses reqresp for nodes to request missing blocks, not IHAVE/IWANT.
I'm sorry if I haven't expressed myself clearly. I never intended not to use REQ/RESP, but to leverage IHAVE to know which peers to send REQ to (instead of asking them all to learn this info). My idea is along these lines:
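As a rough sketch of that flow (all names below are hypothetical, and this is my reading of the idea rather than a spec):

```ts
// Hypothetical: remember which peers recently advertised (IHAVE) each tx hash,
// then direct REQ messages only at peers believed to hold the missing txs.
const advertisers = new Map<string, Set<string>>(); // txHash -> peerIds

function onIHave(peerId: string, txHashes: string[]) {
  for (const hash of txHashes) {
    let peers = advertisers.get(hash);
    if (!peers) advertisers.set(hash, (peers = new Set()));
    peers.add(peerId);
  }
}

function peersToAsk(missingTxHash: string): string[] {
  // Empty result means nobody advertised it; fall back to asking all peers.
  return [...(advertisers.get(missingTxHash) ?? [])];
}
```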
Makes sense, thanks for clarifying @mralj!
I think my main disagreement is around the correlation between the MCache and the IHAVE messages.

So, Node A receives transaction X, puts it into its MCache, then for a period of 3 heartbeats may send IHAVE messages advertising it. 30s go by and the network is busy. Transaction X has been evicted by both parties. Validator B receives a proposal containing transaction X and doesn't have it.

My main concern here is that we would more often than not fall into your 3.1 scenario. After all, if a validator doesn't have the transaction, there is a fairly good chance that any peers that once said they have it no longer do. It feels like once we fall into the 'tx missing' scenario, we should optimise for assuming that very few other people have it, even if they once did. Given the time constraints that the validators are working to, any req/resp protocol needs to be aggressive.

Maybe I am misunderstanding your approach.
No, you are not :)
Fair enough. I don't have a good argument that the scenario you are describing won't be the one that actually happens in practice. One more detail regarding the following:
I don't think the MCache here is important at all, but the TxPool is: as long as Node A has the tx in its pool, it can deliver it as a response.
To be clear, I don't know for sure by any means; I'm hypothesising.
As you mention, the most important thing is that validators have all transactions as fast as possible. So I think our KPI is "latency from proposer initiating a propagation to e.g. 80% of validators having all transactions", and we know we need that to be under 2 seconds. Can we get some median and P95 measurements/estimates for both the status quo and this approach? Also, can we get an estimate on how long it would take to build this?

### An alternative design

I think the best solution here, if we are optimizing for that KPI, is to ensure that the validators already have the transactions in the proposal. We can do this by running a consensus protocol over the mempool itself. In this case, a proposer cannot propose a transaction unless it has a certificate that at least X > 66% of the committee indeed has the transaction. So we frontload the work of adding the transaction to the mempool onto the client: their request to submit the transaction doesn't return until it receives this certificate. Only after X% of committee members' signatures are observed on p2p do nodes "actually commit" the transaction to their mempool, and the request from the user returns.

Note also that we don't need to explicitly identify validators by their IP in this case (though I am personally completely fine with explicitly tying the two together; I don't think the privacy gained by divorcing the two is substantial). They could just sign a message with their attester key saying "I have transaction X", gossip that message back out, and a recipient could verify that the signer is in the committee for the current epoch.

There are some edge cases around submitting a transaction at the epoch boundary, and what happens to confirmed transactions that take longer than an epoch to get included, but they seem surmountable.

Generally, I think that putting better guarantees on the mempool is a better solution than retroactively trying to recover transactions. I think having a "repair/recover" procedure is needed, but at scale and in practice we should strive to never need it.
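A minimal sketch of the certificate check under that design (committee membership, the attestation shape, and the threshold arithmetic are my assumptions):

```ts
// Hypothetical availability certificate: a tx may only be proposed once
// more than 2/3 of the committee has signed "I have transaction X".
type Attestation = {
  txHash: Buffer,
  attester: string, // attester key, resolvable to a committee member
  signature: Buffer,
};

function hasAvailabilityCert(
  attestations: Attestation[],
  committee: Set<string>,
  verifySig: (a: Attestation) => boolean,
): boolean {
  const signers = new Set(
    attestations
      .filter(a => committee.has(a.attester) && verifySig(a))
      .map(a => a.attester), // dedupe: one vote per committee member
  );
  return signers.size * 3 > committee.size * 2; // strictly more than 2/3
}
```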
I think this is quite similar to an idea put forward by Joe, essentially asking validators to commit to their mempools, which I like a lot in principle. What's unclear to me, however, is how to enforce that validators sign all of the transactions they see and don't, for example, only sign low-value transactions in an attempt to prevent others from including high-value ones. The network would also end up being very gossipy. Assuming 100% honest committees, every transaction would presumably generate an additional 48 messages to be broadcast.
Thinking about it, unless my maths is completely wrong, at 10 TPS and with a peering degree of 8, every node is required to transmit ~4,000 messages per second (10 tx/s × 48 attestations × 8 peers ≈ 3,840)!
@PhilWindle I think the easiest thing would be to slash proposers who include transactions without an availability cert. I'm not sure about the bandwidth usage. I would think it would not be so bad beyond what we have today, because each message should mainly be a signature and a transaction hash, so it should certainly be less than 256 bytes per message. So even if nodes are transmitting 4,000 messages per second, the total bandwidth used should be under 10 Mbps.
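For reference, a back-of-envelope check of those figures (all inputs are the assumed numbers from the comments above):

```ts
// Back-of-envelope check of the message rate and bandwidth estimates.
const tps = 10;             // transactions per second
const committeeSize = 48;   // attestation messages generated per tx
const peeringDegree = 8;    // gossip fan-out per message
const bytesPerMessage = 256;

const messagesPerSecond = tps * committeeSize * peeringDegree;      // 3,840
const mbps = (messagesPerSecond * bytesPerMessage * 8) / 1_000_000; // ~7.9 Mbps

console.log({ messagesPerSecond, mbps });
```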