GossipSub v1.4: Message preamble + IMReceiving notification to considerably reduce bandwidth & latency for large messages #654

Open · wants to merge 2 commits into base: master

@ufarooqstatus ufarooqstatus commented Dec 16, 2024

This extension considerably reduces bandwidth utilization and network-wide message dissemination time for large messages.

Problem with existing approach (GossipSub v1.2):

  1. Peers do not learn the msgID until the download completes, so they may generate many IWANT requests for a message they are already receiving.
  2. Mesh members do not know that a peer is already receiving a message and may start sending it the same message (IDONTWANT can only be transmitted after the message has been downloaded).

Solution (Proposed extension):

  1. Prepend a message preamble (carrying msgID + length) to large messages, to be processed immediately by the receiver.
  2. The receiver defers IWANT requests for messages it is already receiving.
  3. Limit outstanding IWANT requests for a large message to one (responding to IWANTs is mandatory).
  4. Receivers use a new control message, called IMReceiving, to notify their mesh that they are in the process of receiving a message, so that mesh peers defer sending that message.
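
The four steps above can be sketched from the receiver's point of view. This is a minimal illustration only, not the PoC implementation: the `Receiver` class, its field names, and the `outbox` list of `(peer, control, msg_id)` tuples are all hypothetical, and real routers would also track topics, timers, and scores.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiver-side bookkeeping; not taken from the PoC."""
    mesh: set                                       # mesh peer IDs for the topic
    seen: set = field(default_factory=set)          # fully received msg IDs
    receiving: dict = field(default_factory=dict)   # msg_id -> announcing peer
    requested: set = field(default_factory=set)     # msg IDs with an IWANT in flight
    outbox: list = field(default_factory=list)      # (peer, control, msg_id)

    def on_preamble(self, sender, msg_id, length):
        # length would feed a download-time estimate; unused in this sketch.
        if msg_id in self.seen or msg_id in self.receiving:
            return
        self.receiving[msg_id] = sender
        # Step 4: ask the rest of the mesh to hold off on this message.
        for peer in sorted(self.mesh - {sender}):
            self.outbox.append((peer, "IMRECEIVING", msg_id))

    def on_ihave(self, sender, msg_id):
        # Step 2: no IWANT for a message that is already arriving.
        # Step 3: at most one outstanding IWANT per message.
        if (msg_id in self.seen or msg_id in self.receiving
                or msg_id in self.requested):
            return
        self.requested.add(msg_id)
        self.outbox.append((sender, "IWANT", msg_id))

    def on_message_complete(self, msg_id):
        sender = self.receiving.pop(msg_id, None)
        self.requested.discard(msg_id)
        self.seen.add(msg_id)
        # v1.2 behaviour: confirm reception with IDONTWANT.
        for peer in sorted(self.mesh - {sender}):
            self.outbox.append((peer, "IDONTWANT", msg_id))
```

In this sketch, an IHAVE that arrives while a preamble-announced download is in progress produces no IWANT at all, which is the duplicate-request case the proposal targets.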

More context available here

@ufarooqstatus
Author

Results from experiments in a 1500-peer network: bandwidth for each peer ranges between 50 and 150 Mbps, and latency for each link ranges between 40 and 130 ms. Bandwidth and latency are uniformly distributed in 5 stages. A total of 12 messages were transmitted. An IDONTWANT message is used as the preamble.

Average duplicates reduced to less than 2
Significant reduction in latency as well
[Figure: LatBW Graph]

@vyzo
Contributor

vyzo commented Dec 16, 2024

Shouldn't this be v1.3?

@ufarooqstatus
Author

> Shouldn't this be v1.3?

Actually, there was already an open PR using v1.3; the idea is to set an appropriate version number once this is considered ready for merge.

@vyzo
Contributor

vyzo commented Dec 16, 2024

ok, fair enough.


The purpose of the preamble is to allow receivers to instantly learn about the incoming message.
The preamble must include the message ID and length,
providing receivers with immediate access to critical information about the incoming message.

@nisdas nisdas Dec 17, 2024

One issue is that, as the protobuf schema is currently designed, you would have to download the whole message in order to access the preamble. Look at how control messages are represented in the RPC message:
https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md#protobuf
https://github.com/libp2p/specs/tree/master/pubsub#the-rpc

The control field is numbered after the full published messages, so you would have to download the whole message before you can access the preamble.

Never mind, I misunderstood this. The preamble is an RPC message sent separately beforehand.

### IMReceiving Message

The IMReceiving message serves a distinct purpose compared to the IDONTWANT message.
An IDONTWANT can only be transmitted after receiving the entire message.

While I understand the distinction between IMReceiving and IDONTWANT, and the benefit of broadcasting it earlier, how effective would this be in practice? One issue we have seen is that a control message can take a while to be processed by the gossip router even after it has been received, due to head-of-line (HOL) blocking. So by the time you process the control message, the actual message might already have been sent by your mesh peers.

@kaiserd kaiserd Dec 18, 2024

For the scenarios we tested, sending IMReceiving significantly increases the probability that mesh peers can stop unnecessary message sends, since enough IMReceiving messages get through in time.
Still, this is definitely something we will look out for in our experiments, checking for scenarios where HOL blocking might have a severe impact. We also have Ethereum-focused tests and analyses on the roadmap.

With QUIC as a transport and multiplexer, we can further reduce the HOL impact.

> For the scenarios we tested, sending IMReceiving significantly increases the probability of mesh peers being able to stop unnecessary message sends since enough IMReceiving go through in time.

Is there more information on the scenarios tested? For example, how many topics were nodes subscribed to, and how many messages per second were being published on those topics?

Author

> While I understand the distinction here of IMReceiving compared with IDONTWANT and having this broadcast earlier, how effective would this be in practice? One issue we have seen is that an actual control message takes a while to be processed by the gossip router even after it has been received due to HOL blocking. So by the time you process the control message, the actual message might already be sent by your mesh peers.

Yes, that is why we still see duplicates, averaging around 1.8 per peer in the network. Proper prioritization of preamble/IDONTWANTs should further lower the number of duplicates.

Author

> Is there more information on the scenarios tested? Ex: How many different topics nodes were subscribed to along with how many messages were being published per second on these topics.

All (1500) peers were subscribed to a single topic. Twelve messages were introduced, each by a different publisher, with each publisher waiting 3 seconds before sending the next message. Messages larger than 600 KB take more time to reach all peers, building outgoing message queues at many peers.

Author

@ufarooqstatus ufarooqstatus Dec 19, 2024

In the results below, we consider 1500 peers (single topic) with an inter-message spacing of 50 ms, which is roughly 20 messages per second. The message size is 50 KB.
[Figure: S3 lat BW graph]

@cortze
Contributor

cortze commented Dec 17, 2024

> Peers are unaware of the msgID during download and may generate many IWANT requests for the same message.

The problem with this is that there is no limit on the number of IWANTs you can send for the same message. Thus, you send an IWANT to each node that sends an IHAVE with a message ID you haven't received yet.

This should be limited by an alpha parameter, as in the Kademlia DHT. It is a simple configuration option that can already remove a spike in bandwidth utilisation in some edge cases. We could even add a second configuration parameter for a grace period: the number of milliseconds to wait before sending those IWANTs.
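
A minimal sketch of the two knobs suggested above, assuming a per-message concurrency cap (`alpha`) and a grace period measured from the first IHAVE. The class and method names are hypothetical and do not correspond to any existing GossipSub implementation.

```python
class IWantLimiter:
    """Hypothetical limiter: at most `alpha` outstanding IWANTs per message,
    and none until `grace_ms` has passed since the first IHAVE."""

    def __init__(self, alpha=1, grace_ms=0):
        self.alpha = alpha
        self.grace_ms = grace_ms
        self.first_ihave_at = {}   # msg_id -> time of first IHAVE (ms)
        self.in_flight = {}        # msg_id -> peers we have asked
        self.done = set()          # msg IDs already received

    def should_request(self, msg_id, peer, now_ms):
        if msg_id in self.done:
            return False
        first = self.first_ihave_at.setdefault(msg_id, now_ms)
        # Grace period: the message may still arrive through the mesh.
        # (A fuller version would remember this peer for a later retry.)
        if now_ms - first < self.grace_ms:
            return False
        asked = self.in_flight.setdefault(msg_id, set())
        if peer in asked or len(asked) >= self.alpha:
            return False
        asked.add(peer)
        return True

    def on_timeout(self, msg_id, peer):
        # The peer did not answer in time; free its slot.
        self.in_flight.get(msg_id, set()).discard(peer)

    def on_received(self, msg_id):
        self.done.add(msg_id)
        self.in_flight.pop(msg_id, None)
        self.first_ihave_at.pop(msg_id, None)
```

With `alpha=1` this matches the proposal's "one outstanding IWANT" rule; larger values trade extra duplicates for faster recovery when a peer fails to reply.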

@kaiserd

kaiserd commented Dec 18, 2024

Let me suggest this alternative using IDONTWANT instead of introducing the new IMRECEIVING message
(just to open this for discussion):

  • Peer A sends a preamble for a large message to B
  • Peer B sends IDONTWANT (instead of IMRECEIVING) as soon as it receives this preamble, outside of the typical heartbeat interval
  • Mesh peers receiving IDONTWANT cannot defer message sending, as they cannot semantically distinguish this IDONTWANT from heartbeat IDONTWANTs, so they will simply stop sending
  • If B does not receive the message A promised, B will downscore A and send an IWANT for the message

This requires another IWANT in case a message is not delivered.
This case should not happen too often, though, and can be handled by peer scoring.
It keeps the implementation simpler and does not introduce another message, but it adds to the semantics of IDONTWANT.

@ufarooqstatus
Author

> The problem with this is that there is no limit on the number of IWANTs you can send for the same message. Thus, you send an IWANT to each of the nodes that send an IHAVE with a message ID that you haven't received (yet).

Yes, that is one big issue!

> This should be limited to an alpha parameter like in the Kademlia DHT. It is a simple configuration that can already remove a spike in bandwidth utilisation in some edge cases. We could even add a second configuration parameter for a wait time or a grace period for the number of milliseconds to wait before sending those IWANTs.

Yes, this is part of the solution, but it also requires that replying to IWANT requests be made mandatory (at least for large messages), and the preamble can further limit IWANT requests!

@cortze
Contributor

cortze commented Dec 19, 2024

> Yes, this is part of the solution, but it also requires that replying to IWANT requests be made mandatory (at least for large messages)

It is already "mandatory": not replying to a received IWANT message penalizes your score.
I'm not against the proposed upgrades; I like the direction. I'm just trying to point out that the current implementation has some low-hanging upgrades that don't drastically change the protocol but can also reduce unnecessary duplicates.

> This should be limited to an alpha parameter like in the Kademlia DHT.

I'd be keen to have some small upgrades like this one before jumping into something bigger.

@ufarooqstatus
Author

> Let me suggest this alternative using IDONTWANT instead of introducing the new IMRECEIVING message: (just to open this for discussion)
>
>   • Peer A sends a preamble for a large message to B
>   • Peer B sends IDONTWANT (instead of IMRECEIVING) asap when receiving this preamble outside of the typical heartbeat interval
>   • mesh peers receiving IDONTWANT cannot defer message sending as they cannot semantically distinguish this IDONTWANT from heartbeat IDONTWANTs, so they will simply stop sending
>   • in case B does not receive the message A promised, B will downscore A and send an IWANT for the message
>
> This requires another IWANT in case a message is not delivered. This case should not happen too often though and can be handled by peer scoring. It keeps the implementation simpler and does not introduce another message, but it adds to the semantics of IDONTWANT.

The fundamental purpose of IDONTWANT messages is:
"Peer X, on receiving an IDONTWANT from Y, knows that Y has already received the message, so sending it to Y is unnecessary."

However, the use of IDONTWANT messages can be tailored to serve either of the following two purposes:

  1. As a message preamble: on receiving an IDONTWANT from Y, X can assume that Y will immediately forward the message to it, eliminating the need for a preamble.

  2. As IMReceiving: on receiving a message preamble, we consider it a definite promise, so IDONTWANT can be issued immediately, serving as IMReceiving. However, in this case, mesh members cannot tell whether the message was successfully received.

@Nashatyrev
Contributor

This is an interesting feature. We were doing experiments in a similar direction with @Menduist (here he is mentioning this potential feature).

For comparison, here are my simulation results for 3 options:

  • No IDONTWANT (called 'choke' message there)
  • IDONTWANT when a message received (OnReceive option) (present spec 1.2)
  • IDONTWANT when a preamble received (OnNotify option)

The results are pretty close to yours and they are really impressive.

However there were some security concerns regarding this feature.

If I remember correctly, the major security concern was a kind of amplification attack: by sending a single preamble message, an attacker would cause many IDONTWANT messages to be sent. But I'm not sure this is a significant problem, as the attacker is restricted from sending the next preamble until it has transferred the promised message.

@ufarooqstatus
Author

ufarooqstatus commented Apr 5, 2025

Thank you @Nashatyrev for your encouraging feedback. It's great to see that both results are similar and indeed promising.

Yes, using "Preamble+IMRECEIVING" messages is similar to using "Notify+IDONTWANT_on_Notify". However, IMRECEIVING can provide additional resilience against amplification attacks. Here's how it works:

  1. We send IMRECEIVING on receiving a preamble for message X.
  2. When a mesh member receives IMRECEIVING, it defers (rather than cancels) sending message X. The defer interval can be calculated from the message size.
  3. The IDONTWANT message confirms the reception of message X; mesh members cancel sending X only after this confirmation.
  4. Additionally, we can limit the number of preamble announcements for unfinished transfers.
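
The defer-then-cancel behaviour in steps 2 and 3 can be sketched from the mesh member's side. This is an illustration under stated assumptions: IMRECEIVING is assumed to carry the message length, the defer interval is assumed to be one RTT plus the transfer time at a network-average bandwidth, and `MeshSender`, `defer_interval_ms`, and both constants are hypothetical names and values.

```python
AVG_BANDWIDTH_BPS = 100_000_000  # assumed network-average bandwidth (bits/s)
AVG_RTT_MS = 100                 # assumed network-average round-trip time


def defer_interval_ms(length_bytes, bandwidth_bps=AVG_BANDWIDTH_BPS,
                      rtt_ms=AVG_RTT_MS):
    # One RTT plus the time to push the payload at the assumed bandwidth.
    return rtt_ms + 8 * length_bytes * 1000 / bandwidth_bps


class MeshSender:
    """Hypothetical per-router state for deferred and cancelled sends."""

    def __init__(self):
        self.deferred = {}      # (peer, msg_id) -> deadline in ms
        self.cancelled = set()  # (peer, msg_id) confirmed by IDONTWANT

    def on_imreceiving(self, peer, msg_id, length_bytes, now_ms):
        key = (peer, msg_id)
        # Only the first IMRECEIVING sets a deadline; repeats cannot extend it.
        if key not in self.cancelled and key not in self.deferred:
            self.deferred[key] = now_ms + defer_interval_ms(length_bytes)

    def on_idontwant(self, peer, msg_id):
        # Confirmation of receipt: the send is cancelled for good.
        self.deferred.pop((peer, msg_id), None)
        self.cancelled.add((peer, msg_id))

    def may_send(self, peer, msg_id, now_ms):
        key = (peer, msg_id)
        if key in self.cancelled:
            return False
        deadline = self.deferred.get(key)
        return deadline is None or now_ms >= deadline
```

If no IDONTWANT arrives before the deadline, `may_send` becomes true again and the router falls back to ordinary v1.2 forwarding for that peer.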

Contributor

@MarcoPolo MarcoPolo left a comment

Thanks for taking the time to draft this out and simulate it. I have reviewed the spec, and I have a couple of concerns:

I think all our existing control messages are about behaviors between two peers. This preamble is the first message that affects the behavior between the remote peer and the remote peer's mesh. This makes me uneasy as it may be a new attack surface.

Here are just two attacks that I could think of that exploit this.
Remember that any peer can push a message to you. This is how flood publishing works.

  • A kind of eclipse attack:

    1. Malicious peer (not necessarily in your mesh) sends you a preamble with an important msg id that is time-sensitive.
      1. The malicious peer can also set a large (max) msg size here as well to force you to delay even more.
    2. You send your peers IMRECEIVING
    3. The malicious peer doesn't give you the data.
    4. Right as your estimated download timer expires, another different malicious peer sends you a preamble with the important msg id.
    5. You repeat from step 2. Each iteration causes a delay until the message has expired.
  • There may also be an amplification attack depending on the exact semantics of IMRECEIVING and IDONTWANT:

    1. Malicious peer (not necessarily in your mesh) sends you a preamble with a large enough msg size
    2. You send your peers IMRECEIVING
    3. The malicious peer lied about the message size, but the id is correct for the given message.
    4. Do you send IDONTWANT to your peers?
      1. You have the message referenced in the message id, so it's true you don't want them to send you that message id.
    5. If you do send IDONTWANT, then this message caused 2 control messages per peer.
    6. And what happens if this message is really small?
      1. I can force you to send (D-1)*2 control messages per message.
    7. Would downscoring help if I lied about the message size?
      1. maybe, but what if there were many malicious peers? This only costs them 1 small message.

You may be able to mitigate these somewhat by only accepting preambles from peers in your mesh.



I think figuring out the proper timeouts might be tricky. Too lenient and you amplify the problems above. Too strict and you may unnecessarily penalize honest peers. The proper timeout for your peers seems trickier. They need to estimate the bandwidth between you and some other unknown peer, and the RTT between them and you.

@ufarooqstatus
Author

ufarooqstatus commented Apr 11, 2025

Hello @MarcoPolo, many thanks for reviewing this draft. Yes, we have already considered and addressed the highlighted concerns in this proposal. The draft remains open for any further suggestions or revisions.

To clarify the workflow outlined in this proposal, the proposed changes apply only to large messages. For smaller messages, we use standard GossipSub v1.2 operation (No preamble/IMRECEIVING needed).

Message forwarding for large messages

  1. A peer X sends preambles (immediately followed by actual message transmission) ONLY to its mesh members. The preamble acts as a promise to deliver the message within a reasonable timeframe.
  2. Similarly, a peer Y ONLY accepts preambles from its mesh members and sends IMRECEIVING notifications. IMRECEIVING means Y's mesh members should defer (not cancel) transmission of the advertised message ID.
  3. Once Y has received the complete message, it sends IDONTWANT. Otherwise, when defer_interval expires, Y's mesh members relay the message to Y.
  4. To control negative behavior, we limit the maximum number of outstanding preambles and descore peers for breaking promises.
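
The acceptance rules in steps 2 and 4 might look roughly like the sketch below. `PreambleGate` and its methods are hypothetical, `max_outstanding` plays the role of the draft's `peer_preamble_announcements` parameter, and the integer `penalty` counter merely stands in for whatever behaviour penalty an implementation's scoring function applies.

```python
class PreambleGate:
    """Hypothetical policy: accept preambles only from mesh peers, and only
    up to a per-peer limit of unfinished (promised) transfers."""

    def __init__(self, mesh, max_outstanding=1):
        self.mesh = set(mesh)
        self.max_outstanding = max_outstanding
        self.outstanding = {}   # peer -> set of promised msg IDs
        self.penalty = {}       # peer -> count of broken promises

    def accept(self, peer, msg_id):
        if peer not in self.mesh:
            return False        # non-mesh peers cannot trigger IMRECEIVING
        promised = self.outstanding.setdefault(peer, set())
        if len(promised) >= self.max_outstanding:
            return False        # too many unfinished transfers from this peer
        promised.add(msg_id)
        return True

    def fulfilled(self, peer, msg_id):
        # The promised message arrived in full.
        self.outstanding.get(peer, set()).discard(msg_id)

    def broken(self, peer, msg_id):
        # The promise timed out or the length did not match: penalise.
        self.outstanding.get(peer, set()).discard(msg_id)
        self.penalty[peer] = self.penalty.get(peer, 0) + 1
```

A rejected preamble simply means the receiver behaves as in v1.2 for that message: no IMRECEIVING is sent and the mesh keeps forwarding normally.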

IHAVE/IWANT processing for large messages

  1. Peer X sends a preamble before replying to an IWANT request from peer Z. Peer Z uses this preamble solely to decide when to issue the next IWANT request if X fails to keep its promise.
    This is done because under the current arrangement (starting from v1.1), peers send IWANT requests whenever they receive an IHAVE announcement for unseen messages. This overwhelms early message receivers (optimal-path peers), as they must share their bandwidth between mesh transfers and many IWANT replies, which noticeably increases message dissemination time in the network. Other measures are also possible for fixing this problem. The advantage of the preamble in this case is that the receiver can estimate the message download time: IWANT replies travel over less frequently used (low $C_{Wnd}$) paths, so they usually take longer.

Regarding floodpublish for large messages
We don't use the preamble for floodpublish!
Floodpublish is good for small messages. However, for large messages, the sender's bandwidth is shared among many simultaneous transfers, which indirectly delays retransmissions from mesh members. AFAIK, floodpublish is disabled by default in many implementations.

> You may be able to mitigate these somewhat by only accepting preambles from peers in your mesh.

Yes, this, along with negative scoring, outstanding_preamble_limit, and defer_interval, mitigates most of the problems.

@ufarooqstatus
Author

ufarooqstatus commented Apr 11, 2025

> 5. If you do send IDONTWANT, then this message caused 2 control messages per peer.
> 6. And what happens if this message is really small?
>     1. I can force you to send (D-1)*2 control messages per message.

Talking about message count, we have approximately $\frac{D}{2}$ to $D$ duplicates per peer for each message. The number of duplicates grows with the message size, as sending a large message may take considerable time (contention time), which leads to two fundamental problems:

  1. A receiver can send IDONTWANTs only after downloading the entire message. For a large download time, the receiver will likely start receiving the same message from multiple mesh members (this compromises the true effectiveness of IDONTWANT messages).
  2. During this period, peers already receiving a message also send IWANT requests (multiple IWANTS in v1.1??) for the same message.

For a 1MB message, we noticed up to D duplicates (with approximately two duplicates coming from IWANT replies). The proposed approach reduces duplicates to under 2, as the IMRECEIVING message almost eliminates message contention time.

Talking about preamble/IMReceiving transmission counts, we have approximately $N \times \frac{D}{2}$ transmissions in the network for each of the preamble and IMReceiving announcements. The presented bandwidth graph includes the IMReceiving/preamble announcement volume.
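
As a rough worked example of this estimate, take the 1500-peer network from the experiments and assume a mesh degree of $D = 6$ (an assumption for illustration only; the thread does not state the value of $D$ used):

$$
N \times \frac{D}{2} = 1500 \times \frac{6}{2} = 4500
$$

Under that assumption, each published large message would cost about 4500 preamble transmissions and about 4500 IMReceiving transmissions network-wide, roughly 9000 small control messages in total.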

We can even use IDONTWANT messages as a preamble (provided that IDONTWANT announcement is immediately followed by the message transmission).

More context here
PoC implementation here

@ufarooqstatus
Author

> I think figuring out the proper timeouts might be tricky. Too lenient and you amplify the problems above. Too strict and you may unnecessarily penalize honest peers. The proper timeout for your peers seems trickier. They need to estimate the bandwidth between you and some other unknown peer, and the RTT between them and you.

Thanks for pointing this out. IMO, the initial preamble announcements are typically made by high-resource peers. So, adjusting timeouts based on the usual network averages can be sufficient. A per-topic lenient/aggressive strategy could also be a viable option.
This part is definitely open to feedback to ensure optimal protocol performance.

@Nashatyrev
Contributor

> We can even use IDONTWANT messages as a preamble (provided that IDONTWANT announcement is immediately followed by the message transmission).

IDONTWANT has different semantics: there is no promise that the message will follow.

@ufarooqstatus
Author

> IDONTWANT has a different semantics: there is no promise for the message to follow

Yes, that would require changing the IDONTWANT semantics to a promise. That is why a separate preamble is used here.

@Nashatyrev
Contributor

> A kind of eclipse attack:
> [....]
>
>   • Right as your estimated download timer expires, another different malicious peer sends you a preamble with the important msg id.

IMRECEIVING should probably be a strictly one-shot message per messageId. That is, we save throughput in the happy case but quickly fall back to the regular flow if anything goes wrong.
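
A minimal sketch of the one-shot rule, assuming the receiver simply remembers which message IDs it has already announced. `OneShotNotifier` is a hypothetical name, and a real implementation would expire old entries along with its seen-message cache.

```python
class OneShotNotifier:
    """Hypothetical guard: IMRECEIVING is sent at most once per message ID."""

    def __init__(self):
        self.announced = set()  # msg IDs for which IMRECEIVING was already sent

    def should_send_imreceiving(self, msg_id):
        if msg_id in self.announced:
            # A second preamble for the same ID (for example from another
            # malicious peer) cannot make the mesh defer again.
            return False
        self.announced.add(msg_id)
        return True
```

Under this rule the repeated-preamble loop described in the eclipse scenario can delay a message by at most one defer interval.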

@Nashatyrev
Contributor

Another concern which came to my mind is that slowing down message receiving for a remote peer could basically be done without score penalization: after sending a preamble the adversary may just slow down further message transfer which is basically acceptable (as remote peer outbound bandwidth may just be saturated).

Comment on lines 62 to 63
| `peer_preamble_announcements` | The maximum number of preamble announcements for unfinished transfers per peer | 1??? |
| `mesh_preamble_announcements` | The maximum number of preamble announcements to accept for unfinished transfers per topic per heartbeat interval | 3??? |
Contributor

Do you think we need separate parameters for these limits? I would treat skipping a message body or supplying a message of a different length as a regular protocol violation (tracked with behaviourPenaltyWeight). That should downscore a bad peer pretty quickly.

| `peer_preamble_announcements` | The maximum number of preamble announcements for unfinished transfers per peer | 1??? |
| `mesh_preamble_announcements` | The maximum number of preamble announcements to accept for unfinished transfers per topic per heartbeat interval | 3??? |
| `max_iwant_requests` | The maximum number of simultaneous IWANT requests for a message | 1??? |
| `preamble_threshold` | The smallest message size to use message preamble | 200KB??? |
Contributor

Do we really need this to be parametrized in the spec? Could it be just an implementation decision?

Author

> Do you think we need separate parameters for these limits? I would treat skipping a message body or supplying a message of a different length as a regular protocol violation (tracked with behaviourPenaltyWeight). That should downscore a bad peer pretty quickly.

Yes, behavior penalties suffice.
Limiting the maximum number of unfinished (ongoing) transfers by one sender (`peer_preamble_announcements`) can provide added protection in some cases, especially when the message count per heartbeat_interval is high. But such peers will get swapped out soon anyway.
`mesh_preamble_announcements` is probably not needed (will remove in next commit).

> Do we really need this to be parametrized in the spec? Could it be just an implementation decision?

Just for feedback on reasonable defaults.

@MarcoPolo
Contributor

In general, this idea makes a lot of sense. You can send a small control message to let peers know that you're in the process of receiving a large message and that they should not send you more copies. This might even benefit the "wait" strategies discussed here: https://ethresear.ch/t/pppt-fighting-the-gossipsub-overhead-with-push-pull-phase-transition/22118. You would only need to wait a much shorter time to see whether a peer sends you an IMRECEIVING message.

The biggest problems I see are:

  1. A peer needs to handle timers whose timeout is a function of a remote peer's RTT and remote peer's bandwidth with another unknown peer.
    1. A well behaved node needs to clear a peer's timer by sending IDONTWANTs.
  2. Message length can be lied about leading to possible attacks.
    1. e.g. You need to receive a small message within 1s. A malicious node in your mesh sends you a preamble with that message id, but lies about the size. You set large timeouts and miss the message.
    2. Mitigations are possible, but tricky.

But here are the things I like about this proposal:

  1. Preamble to let a node know what message ID you are about to send.
  2. A mechanism to let your mesh peers know what message ID you are in the process of receiving.

In an effort to improve the proposal and not just naysay, here is a rough outline of some small modifications that could fix the problems while maintaining the benefits.

On simplifying IMRECEIVING handling:

  1. A mesh peer sends a preamble with just the message ID followed by the message. No length, we'll cover this later.
  2. Upon receiving a preamble from a mesh peer, the peer sends IMRECEIVING with just the message ID to its mesh peers.
  3. IMRECEIVING does not trigger a receiving peer to set timeouts or defer work. Instead, it tells the receiving peer to switch to pull-based dissemination for the given message ID. In other words, when a peer receives IMRECEIVING from a mesh peer, it should send that mesh peer an IHAVE message (or semantically equivalent, but let us not get off-topic here) instead of forwarding the message directly.
  4. The sender of IMRECEIVING is solely responsible for setting up appropriate timeouts. If the first sender of the message is too slow or fails to deliver for whatever reason, the sender of IMRECEIVING should consult its state and request the message from another peer that has the message.
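
The pull-based variant in step 3 can be sketched as a per-peer forwarding decision. The `Forwarder` class and the action labels are hypothetical, and the sketch covers only the forwarding side; the timeout and re-request logic from step 4 stays with the peer that sent IMRECEIVING.

```python
class Forwarder:
    """Hypothetical forwarding decision for the pull-based variant."""

    def __init__(self, mesh):
        self.mesh = list(mesh)
        self.pull_only = set()  # (peer, msg_id) pairs that sent IMRECEIVING

    def on_imreceiving(self, peer, msg_id):
        # No timers here: just remember to announce instead of push.
        self.pull_only.add((peer, msg_id))

    def forward(self, msg_id, source):
        actions = []
        for peer in self.mesh:
            if peer == source:
                continue
            if (peer, msg_id) in self.pull_only:
                # The peer is already downloading: offer the message lazily.
                actions.append((peer, "IHAVE", msg_id))
            else:
                actions.append((peer, "PREAMBLE+MESSAGE", msg_id))
        return actions
```

The design choice here is that the forwarding peer keeps no deadline state at all, which removes the need to estimate a remote link's bandwidth or RTT.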

On setting timeouts and message length:

There's an easy way to do this today without any spec changes. You simply encode the message length as part of the message ID. This prevents peers lying about the size of a message.

Implementations SHOULD provide this message ID to a function that returns the timeout for receiving a given message. Implementations SHOULD also provide the remote peer's ID to this function. This timeout function MAY be implemented by the user and MAY use a node's estimated bandwidth along with other connection information related to the peer. Importantly, this does not have to be a part of the protocol spec. It is just a recommendation to implementors.
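
One way this could look, as a sketch only: the message ID is assumed to be an 8-byte big-endian length followed by a SHA-256 digest of the payload, and the timeout is assumed to be one RTT plus the transfer time at an estimated bandwidth. The ID layout, function names, and default values are all illustrative assumptions, not part of any spec.

```python
import hashlib
import struct


def make_msg_id(payload: bytes) -> bytes:
    # Assumed layout: 8-byte big-endian length, then SHA-256 of the payload.
    return struct.pack(">Q", len(payload)) + hashlib.sha256(payload).digest()


def length_from_msg_id(msg_id: bytes) -> int:
    return struct.unpack(">Q", msg_id[:8])[0]


def verify(msg_id: bytes, payload: bytes) -> bool:
    # A sender that lied about the length produces an ID that fails here.
    return msg_id == make_msg_id(payload)


def receive_timeout_ms(msg_id, peer_id, peer_bandwidth_bps=None,
                       default_bandwidth_bps=50_000_000, rtt_ms=100):
    """User-replaceable timeout: RTT plus transfer time for the encoded length.

    peer_bandwidth_bps is an optional dict of per-peer bandwidth estimates.
    """
    bandwidth = (peer_bandwidth_bps or {}).get(peer_id, default_bandwidth_bps)
    return rtt_ms + 8 * length_from_msg_id(msg_id) * 1000 / bandwidth
```

Because the length is bound into the ID, a receiver that validates the ID after download detects any mismatch between the announced and actual size, and can apply its usual penalty for invalid messages.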

@ufarooqstatus
Author

ufarooqstatus commented Apr 15, 2025

> In general, this idea makes a lot of sense. You can send a small control message to let peers know you're in the process of receiving a large message, and please don't send me more copies. This might even help benefit the "wait" strategies discussed here: https://ethresear.ch/t/pppt-fighting-the-gossipsub-overhead-with-push-pull-phase-transition/22118. As you can wait a much smaller amount to see if a peer sends you a IMRECEIVING message.

Thank you @MarcoPolo, for bringing this up. I believe preamble/IMReceiving is a perfect fit for DAS. We are already doing some experiments on this.

The transmission time for preamble/IMReceiving is nearly negligible. So, link latency is the only contention time we face. This gives us two distinct advantages:

  1. As the message size increases, the variance in message arrival times at different peers also increases. When this variance exceeds the average link latency, very few duplicates happen. In fact, the number of duplicates decreases as the message size increases.
  2. A minimal wait time can help overcome this contention time (this requires a small modification, though).

So, we can achieve extremely low duplicate counts for huge messages without compromising latency.

@ufarooqstatus
Author

> The biggest problems I see are:
>
>   1. A peer needs to handle timers whose timeout is a function of a remote peer's RTT and remote peer's bandwidth with another unknown peer.
>     1. A well behaved node needs to clear a peer's timer by sending IDONTWANTs.

Yes, this is the most challenging part to handle.
We only need one timer for each mesh member (for the oldest promised message).
In my opinion, it's not necessary to be precise with the timeout duration. The timeout only serves as a temporary fallback mechanism that reverts to GossipSub v1.2 behavior when peers misbehave (such peers will soon get pruned). In many scenarios, computing the timeout from the average peer profile (bandwidth/latency) is sufficient. That said, fine-tuning this balance is tricky and can be left to implementation decisions.

  1. Message length can be lied about leading to possible attacks.

    1. e.g. You need to receive a small message within 1s. A malicious node in your mesh sends you a preamble with that message id, but lies about the size. You set large timeouts and miss the message.
    2. Mitigations are possible, but tricky.

I guess we can mitigate this issue. For instance, other mesh members may notice false-length advertisements from the IMReceiving announcement and fall back to GossipSub v1.2 for that message.

@ufarooqstatus
Author

In an effort to improve the proposal and not just naysay, here is a rough outline of some small modifications that could fix the problems while maintaining the benefits.

On simplifying IMRECEIVING handling:

  1. A mesh peer sends a preamble with just the message ID followed by the message. No length, we'll cover this later.
  2. Upon receiving a preamble from a mesh peer, the peer sends IMRECEIVING with just the message ID to its mesh peers.
  3. IMRECEIVING does not trigger a receiving peer to set timeouts or defer work. Instead, it tells the receiving peer to switch to pull-based dissemination for the given message ID. In other words, when a peer receives IMRECEIVING from a mesh peer, it should send that mesh peer an IHAVE message (or semantically equivalent, but let us not get off-topic here) instead of forwarding the message directly.
  4. The sender of IMRECEIVING is solely responsible for setting up appropriate timeouts. If the first sender of the message is too slow or fails to deliver for whatever reason, the sender of IMRECEIVING should consult its state and request the message from another peer that has the message.

This is a good suggestion. Thank you for taking the time to propose this improvement.
In ideal conditions where all peers behave normally, push and pull modes are functionally similar. Also, the peers have all the information available to enable any of these modes.

when a peer receives IMRECEIVING from a mesh peer, it should send that mesh peer an IHAVE message

We still receive IDONTWANT messages from peers, so there's no strict need to send IHAVEs in response to IMReceiving. In this proposal, we can clarify that peers can send IWANT directly to mesh members.

  1. IMRECEIVING does not trigger a receiving peer to set timeouts or defer work. Instead, it tells the receiving peer to switch to pull-based dissemination

We can leave the push-pull transition choice to the application. One possible approach is to include a flag in the IMReceiving message to indicate mode selection (skipping the length field can also indicate pull mode selection).

@MarcoPolo
Contributor

We still receive IDONTWANT messages from peers, so there's no strict need to send IHAVEs in response to IMReceiving. In this proposal, we can clarify that peers can send IWANT directly to mesh members.

Just to clarify my intent, a peer does not send an IHAVE in response to IMRECEIVING. It sends an IHAVE instead of pushing the message directly when forwarding a message to a peer that has sent IMRECEIVING. The difference versus what is in this PR is that the peer sends an IHAVE rather than sending nothing, setting a timer to defer work, and expecting an IDONTWANT.
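The forwarding decision described here could be sketched as follows. This is only an illustration under assumptions: `forward_actions` and the two lookup tables are hypothetical, and a real router would track this state per topic and expire it.

```python
def forward_actions(msg_id, mesh_peers, imreceiving, idontwant):
    """Decide what to send each mesh peer when forwarding msg_id.

    imreceiving and idontwant map a message ID to the set of peers that
    sent IMRECEIVING or IDONTWANT for it (hypothetical bookkeeping).
    """
    receiving = imreceiving.get(msg_id, set())
    unwanted = idontwant.get(msg_id, set())
    actions = {}
    for peer in mesh_peers:
        if peer in unwanted:
            continue  # peer already has the message: send nothing
        # Peers that announced IMRECEIVING get an IHAVE (pull) instead of a push.
        actions[peer] = "IHAVE" if peer in receiving else "PUSH"
    return actions
```

With this shape, the forwarding peer keeps no timer; the peer that sent IMRECEIVING decides when to follow up with an IWANT.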

@Nashatyrev
Contributor

IMRECEIVING does not trigger a receiving peer to set timeouts or defer work. Instead, it tells the receiving peer to switch to pull-based dissemination for the given message ID.

Looks like a good option to me! Another option is to add the timeout parameter to IMRECEIVING message to make the sender responsible for estimations.

There's an easy way to do this today without any spec changes. You simply encode the message length as part of the message ID. This prevents peers lying about the size of a message.

This sounds overcomplicated to me, and I don't see how it prevents sending an invalid size. An adversary may still send an ID with an invalid size encoded.

@MarcoPolo
Contributor

This sounds overcomplicated to me, and I don't see how it prevents sending an invalid size. An adversary may still send an ID with an invalid size encoded.

Yes, but they will be giving you an ID for something else. They can never give you an ID for a real message with a wrong size.

For example, say your ID function is simply msg_len_u64 followed by the 32-byte sha256 hash of the message (in reality you may want to use varint encoding for the length). A message of length 512 bytes that hashes to hashA will have an ID of 512hashA.

A malicious peer can lie about the message size, but that will result in a different ID. If they lie and send a preamble for 1024hashA that won't prevent an honest peer from sending the node the correct message for 512hashA (because the node only sent IMRECEIVING for 1024hashA).

Contrast this with not including the message length as part of the ID. A malicious peer can lie about the message size for hashA, the node will send IMRECEIVING to honest peers for hashA, and the node will fail to get the message with id hashA in a timely manner. Honest peers will respect the IMRECEIVING.
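To make the example concrete, here is a minimal sketch of such an ID function. The fixed 8-byte big-endian length prefix is an assumption for illustration (a varint would also work), and `message_id` and `claimed_id` are hypothetical helper names.

```python
import hashlib
import struct


def message_id(payload: bytes) -> bytes:
    """ID = 8-byte big-endian payload length followed by sha256(payload)."""
    return struct.pack(">Q", len(payload)) + hashlib.sha256(payload).digest()


def claimed_id(claimed_len: int, payload_hash: bytes) -> bytes:
    """ID a peer would have to announce if it claims a given length for a hash."""
    return struct.pack(">Q", claimed_len) + payload_hash


payload = b"x" * 512
hash_a = hashlib.sha256(payload).digest()

honest = message_id(payload)        # corresponds to 512hashA
forged = claimed_id(1024, hash_a)   # corresponds to 1024hashA

# Lying about the size yields a different ID, so an IMRECEIVING sent for the
# forged ID does not stop honest peers from delivering the real message.
assert honest != forged
```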

@Nashatyrev
Contributor

@MarcoPolo I think I got you
However, we don't take message size into account for other gossip operations, so modifying the messageId still sounds like overkill to me.
For our IMRECEIVING business, we may just treat the tuple (messageId, size) as our virtual 'message ID'.
Whenever a node receives IMRECEIVING(messageId, size) and has already seen a message with ID messageId but of a different size, it should just ignore that particular IMRECEIVING message.
Conversely, if a node receives PREAMBLE(messageId, size) followed by a message with messageId but of a different size, this should be treated as protocol misbehavior and the lying peer should be significantly downscored.
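A minimal sketch of these two rules, assuming simple in-memory bookkeeping; `SizeTracker` and its method names are hypothetical and not taken from any implementation:

```python
from dataclasses import dataclass, field


@dataclass
class SizeTracker:
    # msg_id -> actual size of a message we have fully received
    seen_sizes: dict = field(default_factory=dict)
    # (peer, msg_id) -> size the peer claimed in its PREAMBLE
    announced: dict = field(default_factory=dict)

    def on_preamble(self, peer, msg_id, size):
        self.announced[(peer, msg_id)] = size

    def on_message(self, peer, msg_id, payload) -> bool:
        """Record the real size; return True if the peer's PREAMBLE lied."""
        actual = len(payload)
        self.seen_sizes[msg_id] = actual
        claimed = self.announced.pop((peer, msg_id), None)
        return claimed is not None and claimed != actual

    def accept_imreceiving(self, msg_id, size) -> bool:
        """Ignore IMRECEIVING whose size contradicts a message we already hold."""
        known = self.seen_sizes.get(msg_id)
        return known is None or known == size
```

A caller would downscore the peer whenever `on_message` returns True; how heavily is left open here.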

@Nashatyrev
Contributor

I'm more concerned about an attack where a malicious node sends a correct preamble and then slows down the actual message transfer.

It's enough to have just a single malicious node in your mesh to slow down every message. And it's not that complicated to 'infect' the meshes of the majority of nodes in the network and significantly slow down message propagation globally.

@ufarooqstatus ufarooqstatus force-pushed the gossipsubv1_4_specs branch from 1b537be to 22d4da8 Compare May 1, 2025 15:36
@ufarooqstatus
Author

ufarooqstatus commented May 1, 2025

@Nashatyrev , @MarcoPolo

I have updated the PR based on the feedback received. Key improvements include:

  1. Added a configurable fallback_mode parameter that allows selecting between pull-based and push-based message recovery strategies when a message transfer fails.
  2. Included a safety strategy:
    • PREAMBLE/IMRECEIVING are accepted only from mesh members.
    • Peers ignore IMRECEIVING if the message length is misrepresented.
    • PREAMBLE is accepted only once for any message.
    • The number of unfinished transfers per peer is limited (peer_preamble_announcements).
  3. Elaborated more about PREAMBLE and IMRECEIVING processing, and behavioral penalties.
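The acceptance checks in the safety strategy could look roughly like the sketch below. `PreambleGuard` and `max_unfinished_per_peer` are hypothetical names; the latter stands in for the peer_preamble_announcements limit, whose exact semantics are defined in the PR text rather than here.

```python
class PreambleGuard:
    def __init__(self, mesh_peers, max_unfinished_per_peer):
        self.mesh_peers = set(mesh_peers)
        self.max_unfinished = max_unfinished_per_peer
        self.preambled = set()   # message IDs for which a PREAMBLE was already accepted
        self.unfinished = {}     # peer -> set of message IDs still being transferred

    def accept_preamble(self, peer, msg_id) -> bool:
        if peer not in self.mesh_peers:
            return False         # only mesh members may send PREAMBLE
        if msg_id in self.preambled:
            return False         # PREAMBLE is accepted only once per message
        pending = self.unfinished.setdefault(peer, set())
        if len(pending) >= self.max_unfinished:
            return False         # too many unfinished transfers from this peer
        self.preambled.add(msg_id)
        pending.add(msg_id)
        return True

    def transfer_finished(self, peer, msg_id):
        # Frees one slot in the per-peer limit once the full message arrives.
        self.unfinished.get(peer, set()).discard(msg_id)
```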

@ufarooqstatus
Author

I'm more concerned about an attack where a malicious node sends a correct preamble and then slows down the actual message transfer.

It's enough to have just a single malicious node in your mesh to slow down every message. And it's not that complicated to 'infect' the meshes of the majority of nodes in the network and significantly slow down message propagation globally.

Many thanks for highlighting this important concern. We have tried to address it in the revised draft, and additional due diligence can further strengthen the defense mechanism.

For example, a PREAMBLE comes from early message recipients. Typically, these are faster peers or, at least, peers that are better than the mesh average. Since there is no benefit in processing a PREAMBLE received from a slower peer, we can use observed peer performance to accept PREAMBLE only from peers that are better than the mesh average (profiling the mesh average is trivial with PREAMBLE).
We also use the mesh average to estimate message transfer duration. Penalizing slow senders will lower their score and eventually get them pruned.

@Nashatyrev
Contributor

@ufarooqstatus tbh I still see no reliable mechanism to address a slow transfer after a preamble. All solutions look pretty fuzzy to me. You would either stay vulnerable or penalize good peers whose outbound bandwidth is occasionally saturated.

To push the changes in this PR forward, it may be better to:

  • split this PR into smaller changes, e.g. start with PREAMBLE only
  • make no strict assumptions about the exact usage of the new messages, and give only potential implementation recommendations
  • for IMRECEIVING, leave it to the sender to decide how to handle this message: the exact timeout and/or switching to pull mode
