Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/tcp(buffered): retransmit only one the earliest not acknowledged segment #5589

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

a-lunev
Copy link
Contributor

@a-lunev a-lunev commented Feb 22, 2022

Summary

The existing tcp_send_buffered implementation does a full rewind from the most recent sent segment back to the earliest not acknowledged one, thus many TCP segments are re-sent every time when TCP retrasmission timeout occurs.

According to RFC 6298 (5.4) only one the earliest not acknowledged segment should be retransmitted instead.

This PR implements TCP retrasmission according to RFC 6298 (5.4).
It is the same issue as it was in tcp_send_unbuffered.c (PR #4659) and tcp_sendfile.c (PR #5272).

Impact

TCP

Testing

Activate emulating packet loss on Linux host:
$ sudo iptables -A INPUT -p tcp --dport 31337 -m statistic --mode random --probability 0.01 -j DROP

Build NuttX:

$ ./tools/configure.sh -l sim:tcpblaster
$ make menuconfig
(disable CONFIG_NETUTILS_NETCAT_SENDFILE,
enable CONFIG_NET_TCP_WRITE_BUFFERS)
$ make

Enable TUN/TAP on Linux host:

$ sudo setcap cap_net_admin+ep ./nuttx
$ sudo ./tools/simhostroute.sh wlan0 on

Run netcat server on Linux host:
$ netcat -l -p 31337

Run NuttX on Linux host:

$ ./nuttx
NuttShell (NSH) NuttX-10.2.0
nsh> ifconfig eth0 10.0.1.2
nsh> ifup eth0
ifup eth0...OK

Start Wireshark (or tcpdump) and capture appeared tap0 interface.

Run in NuttX:

nsh> dd if=/dev/zero of=/tmp/test.bin count=1000
nsh> netcat LINUX_HOST_IP_ADDRESS 31337 /tmp/test.bin

Observe packet loss -> TCP retransmissions in TCP dump.

Shutdown NuttX:
nsh> poweroff

Disable TUN/TAP on Linux host:
$ sudo ./tools/simhostroute.sh wlan0 off

Deactivate emulating packet loss on Linux host:
$ sudo iptables -D INPUT -p tcp --dport 31337 -m statistic --mode random --probability 0.01 -j DROP

@a-lunev a-lunev force-pushed the tcp_send_buffered_rexmit_fix branch from 569f1e8 to 66b7e61 Compare February 22, 2022 22:20
…segment

(according to RFC 6298 (5.4)). The issue is the same as it was in tcp_send_unbuffered.c and tcp_sendfile.c.
@a-lunev a-lunev force-pushed the tcp_send_buffered_rexmit_fix branch from 66b7e61 to 02c2b60 Compare February 23, 2022 17:30
{
conn->sent = 0;
}
return flags;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain why it should be removed here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@a-lunev a-lunev Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flags will return on line 732:

Then tcp_sendbuffer_notify(conn) will be invoked. Why its invocation is necessary if we are retransmitting a segment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on whether tcp_wrbuffer_release is called before tcp_sendbuffer_notify, it seems that you return flags early after release, which is different from the original code logic

Copy link
Contributor Author

@a-lunev a-lunev Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compared to the old code, the new code does not modify anything in write_q. Thus tcp_sendbuffer_notify(conn) invocation is not reasonable anymore if we are retransmitting a segment.
Except (++TCP_WBNRTX(wrb) >= TCP_MAXRTX) expiration case which logic is preserved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*/

wrb = (FAR struct tcp_wrbuffer_s *)sq_peek(&conn->write_q);
ninfo("REXMIT: wrb=%p sent=%u\n", wrb, wrb ? TCP_WBSENT(wrb) : 0);
wrb = (FAR struct tcp_wrbuffer_s *)sq_peek(&conn->unacked_q);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested the situation of continuous packet loss? I'm not sure if such a change can cover all scenarios. In the case of multiple packets being continuously discarded, it is more efficient to retransmit all of them, especially for wired networks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not tested continuous packet loss for this. As I can understand, a continuous packet loss is not a normal situation and it is subject to break the TCP connection.
Could you provide a link to RFC for retransmitting all the segments?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuous packet loss (1~3) is common case on wireless device. The current stack does not support selective-ack, so I suggest that similar changes should be as cautious as possible,
https://datatracker.ietf.org/doc/html/rfc5681 3.2.3

Copy link
Contributor Author

@a-lunev a-lunev Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC 5681 3.2.3 says "The lost segment starting at SND.UNA MUST be retransmitted ...". Only one segment must be retransmitted.
I was thinking you meant continuous packet loss say 30 seconds (all segments are lost inside of say 30 seconds).
If you mean only 1 to 3 lost segments, then yes, when I was testing the patch using packet loss simulation, there were cases with several lost segments (..., good, lost, about 5 good, lost, ...), and the new algorithm worked just fine (compared to the existing implementation that floods with excessive retransmissions).
However, I did not see cases (during the packet loss simulation) with 2 or 3 lost segments in series such as: ..., good, lost, lost, good, .... Are you concerned namely with the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I don't deny the benefits of retransmitting the earliest packets, but if you can keep all retransmissions during the timer callback, I think it's more reasonable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good. I will try to emulate namely that pattern and see how the stack behaves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to emulate a loss of two segments in a row as follows:

$ sudo iptables -A INPUT -p tcp --dport 5471 -m statistic --mode nth --packet 10 --every 1000000 -j DROP
$ sudo iptables -A INPUT -p tcp --dport 5471 -m statistic --mode nth --packet 10 --every 1000000 -j DROP

I have tested tcp_send_unbuffered and tcp_sendfile for the case. These both modes perform fine:
tcp_send_unbuffered

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I have tested tcp_send_buffered in case of two lost segments in a row.
Before PR #5589 the result is the following:
tcp_send_buffered_rfc_2001_flooding

With applied PR #5589 the result is the following:
tcp_send_buffered_PR5589_rfc_2001

Copy link
Contributor Author

@a-lunev a-lunev Feb 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, but your change will makes the situation worse in the case of continuous packet loss

As you can see on the screenshot above, the root cause of the exposed issue is that the existing Fast Retransmit algorithm (#2414) normally detects only the first lost segment, however it does not detect the second lost segment (the detector does not recover to the initial state).
I have fixed the issue of the Fast Retransmit algorithm and upgraded it from RFC 2001 to RFC 5681 (PR #5598).
Now if both PRs #5589 and #5598 are applied, everything properly works. The result is the following (for the case of two lost segments in a row):

tcp_send_buffered_PR5589_rfc_5681

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my comment on #5598

@@ -528,7 +527,7 @@ static uint16_t psock_send_eventhandler(FAR struct net_driver_s *dev,
{
/* Do fast retransmit */

rexmit = true;
flags |= TCP_REXMIT;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the retransmission of the first unacked fragment should only be triggered in the case of fast retransmission. If the device calls back to notify TCP_REXMIT, then we keep the logic of all retransmissions, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning TCP protocol I am relying on RFCs only. Could you provide a link to RFC for retransmitting all the segments?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rfc5681 does not explicitly state that only first unacked segments are retransmitted. I think it is reasonable to retransmit only the earliest segment under fast retransmission. However, if the retransmission is triggered by tcp timer, I think the original logic can be maintained.
https://github.com/apache/incubator-nuttx/blob/master/net/tcp/tcp_timer.c#L352

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. RFC 5681 3.2.3 that you have referred says "The lost segment starting at SND.UNA MUST be retransmitted ...".
The word "segment" is not plural. That means only one segment.

RFC 6298 5.4 (that I have referred in the PR description) says:

When the retransmission timer expires, do the following:

(5.4) Retransmit the earliest segment that has not been acknowledged
by the TCP receiver.

All is written explicitly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But only retransmitting the earliest unacked packet is not fully applicable to the current stack implementation, especially when several consecutive segments are dropped.
Do you have a real device with a wireless module? I think you can test your commit with different decays

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But only retransmitting the earliest unacked packet is not fully applicable to the current stack implementation, especially when several consecutive segments are dropped.

Why?

Do you have a real device with a wireless module? I think you can test your commit with different decays

Yes, I do. I have tested tcp_send_buffered (this PR) and tcp_send_unbuffered (#4659 where I have fixed the same issue the same way) on real hardware. It worked just fine concerning TCP retransmissions. Before the applied patches streaming the data over TCP had serious lags (about 1 second) in case of TCP retransmissions, especially at high network traffic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my suggestion is:

  1. For the retransmission triggered by RTO, maintain the logic of all retransmissions:
    https://github.com/apache/incubator-nuttx/blob/master/net/tcp/tcp_timer.c#L352

This is a safeguard to guarantee that all data will definitely be retransmitted.

  1. If retransmission triggered by fast retransmission, follow RFC 6298 (5.4) only one of the earliest not acknowledged segment should be retransmitted

Improvement of this PR (#5589) , your test will most likely be in this case, if there is a problem with fast retransmission, then RTO will guarantee that all segments are retransmitted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants