net/tcp(buffered): retransmit only one the earliest not acknowledged segment #5589

a-lunev · 2022-02-22T22:16:41Z

Summary

The existing tcp_send_buffered implementation does a full rewind from the most recent sent segment back to the earliest not acknowledged one, thus many TCP segments are re-sent every time when TCP retrasmission timeout occurs.

According to RFC 6298 (5.4) only one the earliest not acknowledged segment should be retransmitted instead.

This PR implements TCP retrasmission according to RFC 6298 (5.4).
It is the same issue as it was in tcp_send_unbuffered.c (PR #4659) and tcp_sendfile.c (PR #5272).

Impact

TCP

Testing

Activate emulating packet loss on Linux host:
$ sudo iptables -A INPUT -p tcp --dport 31337 -m statistic --mode random --probability 0.01 -j DROP

Build NuttX:

$ ./tools/configure.sh -l sim:tcpblaster
$ make menuconfig
(disable CONFIG_NETUTILS_NETCAT_SENDFILE,
enable CONFIG_NET_TCP_WRITE_BUFFERS)
$ make

Enable TUN/TAP on Linux host:

$ sudo setcap cap_net_admin+ep ./nuttx
$ sudo ./tools/simhostroute.sh wlan0 on

Run netcat server on Linux host:
$ netcat -l -p 31337

Run NuttX on Linux host:

$ ./nuttx
NuttShell (NSH) NuttX-10.2.0
nsh> ifconfig eth0 10.0.1.2
nsh> ifup eth0
ifup eth0...OK

Start Wireshark (or tcpdump) and capture appeared tap0 interface.

Run in NuttX:

nsh> dd if=/dev/zero of=/tmp/test.bin count=1000
nsh> netcat LINUX_HOST_IP_ADDRESS 31337 /tmp/test.bin

Observe packet loss -> TCP retransmissions in TCP dump.

Shutdown NuttX:
nsh> poweroff

Disable TUN/TAP on Linux host:
$ sudo ./tools/simhostroute.sh wlan0 off

Deactivate emulating packet loss on Linux host:
$ sudo iptables -D INPUT -p tcp --dport 31337 -m statistic --mode random --probability 0.01 -j DROP

net/tcp/tcp_send_buffered.c

…segment (according to RFC 6298 (5.4)). The issue is the same as it was in tcp_send_unbuffered.c and tcp_sendfile.c.

anchao · 2022-02-23T17:40:46Z

net/tcp/tcp_send_buffered.c

-            {
-              conn->sent = 0;
-            }
+          return flags;


Could you please explain why it should be removed here?

emm... I am comment on the wrong line, remove line 715:
https://github.com/apache/incubator-nuttx/pull/5589/files/02c2b602c055214b111969e5ab207b77325f77c5#diff-ac1dd416b803b7f5b3118a05af87bdcef81e8b77110411711ec801bac928d274R715

flags will return on line 732:
https://github.com/apache/incubator-nuttx/pull/5589/files/02c2b602c055214b111969e5ab207b77325f77c5#diff-ac1dd416b803b7f5b3118a05af87bdcef81e8b77110411711ec801bac928d274R732

flags will return on line 732:

Then tcp_sendbuffer_notify(conn) will be invoked. Why its invocation is necessary if we are retransmitting a segment?

It depends on whether tcp_wrbuffer_release is called before tcp_sendbuffer_notify, it seems that you return flags early after release, which is different from the original code logic

Compared to the old code, the new code does not modify anything in write_q. Thus tcp_sendbuffer_notify(conn) invocation is not reasonable anymore if we are retransmitting a segment.
Except (++TCP_WBNRTX(wrb) >= TCP_MAXRTX) expiration case which logic is preserved.

Yes, you released the wrbuffer here without tcp_sendbuffer_notify on line 659

https://github.com/apache/incubator-nuttx/pull/5589/files/02c2b602c055214b111969e5ab207b77325f77c5#diff-ac1dd416b803b7f5b3118a05af87bdcef81e8b77110411711ec801bac928d274R659

anchao · 2022-02-23T17:46:59Z

net/tcp/tcp_send_buffered.c

       */

-      wrb = (FAR struct tcp_wrbuffer_s *)sq_peek(&conn->write_q);
-      ninfo("REXMIT: wrb=%p sent=%u\n", wrb, wrb ? TCP_WBSENT(wrb) : 0);
+      wrb = (FAR struct tcp_wrbuffer_s *)sq_peek(&conn->unacked_q);


Have you tested the situation of continuous packet loss? I'm not sure if such a change can cover all scenarios. In the case of multiple packets being continuously discarded, it is more efficient to retransmit all of them, especially for wired networks.

I have not tested continuous packet loss for this. As I can understand, a continuous packet loss is not a normal situation and it is subject to break the TCP connection.
Could you provide a link to RFC for retransmitting all the segments?

Continuous packet loss (1~3) is common case on wireless device. The current stack does not support selective-ack, so I suggest that similar changes should be as cautious as possible,
https://datatracker.ietf.org/doc/html/rfc5681 3.2.3

RFC 5681 3.2.3 says "The lost segment starting at SND.UNA MUST be retransmitted ...". Only one segment must be retransmitted.
I was thinking you meant continuous packet loss say 30 seconds (all segments are lost inside of say 30 seconds).
If you mean only 1 to 3 lost segments, then yes, when I was testing the patch using packet loss simulation, there were cases with several lost segments (..., good, lost, about 5 good, lost, ...), and the new algorithm worked just fine (compared to the existing implementation that floods with excessive retransmissions).
However, I did not see cases (during the packet loss simulation) with 2 or 3 lost segments in series such as: ..., good, lost, lost, good, .... Are you concerned namely with the case?

Yes, I don't deny the benefits of retransmitting the earliest packets, but if you can keep all retransmissions during the timer callback, I think it's more reasonable

Good. I will try to emulate namely that pattern and see how the stack behaves.

I was able to emulate a loss of two segments in a row as follows:

$ sudo iptables -A INPUT -p tcp --dport 5471 -m statistic --mode nth --packet 10 --every 1000000 -j DROP $ sudo iptables -A INPUT -p tcp --dport 5471 -m statistic --mode nth --packet 10 --every 1000000 -j DROP

I have tested tcp_send_unbuffered and tcp_sendfile for the case. These both modes perform fine:

Also I have tested tcp_send_buffered in case of two lost segments in a row.
Before PR #5589 the result is the following:

With applied PR #5589 the result is the following:

I agree with you, but your change will makes the situation worse in the case of continuous packet loss

As you can see on the screenshot above, the root cause of the exposed issue is that the existing Fast Retransmit algorithm (#2414) normally detects only the first lost segment, however it does not detect the second lost segment (the detector does not recover to the initial state).
I have fixed the issue of the Fast Retransmit algorithm and upgraded it from RFC 2001 to RFC 5681 (PR #5598).
Now if both PRs #5589 and #5598 are applied, everything properly works. The result is the following (for the case of two lost segments in a row):

see my comment on #5598

anchao · 2022-02-23T18:04:22Z

net/tcp/tcp_send_buffered.c

@@ -528,7 +527,7 @@ static uint16_t psock_send_eventhandler(FAR struct net_driver_s *dev,
                {
                  /* Do fast retransmit */

-                  rexmit = true;
+                  flags |= TCP_REXMIT;


I think the retransmission of the first unacked fragment should only be triggered in the case of fast retransmission. If the device calls back to notify TCP_REXMIT, then we keep the logic of all retransmissions, what do you think?

Concerning TCP protocol I am relying on RFCs only. Could you provide a link to RFC for retransmitting all the segments?

rfc5681 does not explicitly state that only first unacked segments are retransmitted. I think it is reasonable to retransmit only the earliest segment under fast retransmission. However, if the retransmission is triggered by tcp timer, I think the original logic can be maintained.
https://github.com/apache/incubator-nuttx/blob/master/net/tcp/tcp_timer.c#L352

E.g. RFC 5681 3.2.3 that you have referred says "The lost segment starting at SND.UNA MUST be retransmitted ...".
The word "segment" is not plural. That means only one segment.

RFC 6298 5.4 (that I have referred in the PR description) says:

When the retransmission timer expires, do the following:

(5.4) Retransmit the earliest segment that has not been acknowledged
by the TCP receiver.

All is written explicitly.

But only retransmitting the earliest unacked packet is not fully applicable to the current stack implementation, especially when several consecutive segments are dropped.
Do you have a real device with a wireless module? I think you can test your commit with different decays

But only retransmitting the earliest unacked packet is not fully applicable to the current stack implementation, especially when several consecutive segments are dropped.

Why?

Do you have a real device with a wireless module? I think you can test your commit with different decays

Yes, I do. I have tested tcp_send_buffered (this PR) and tcp_send_unbuffered (#4659 where I have fixed the same issue the same way) on real hardware. It worked just fine concerning TCP retransmissions. Before the applied patches streaming the data over TCP had serious lags (about 1 second) in case of TCP retransmissions, especially at high network traffic.

my suggestion is:

For the retransmission triggered by RTO, maintain the logic of all retransmissions:
https://github.com/apache/incubator-nuttx/blob/master/net/tcp/tcp_timer.c#L352

This is a safeguard to guarantee that all data will definitely be retransmitted.

If retransmission triggered by fast retransmission, follow RFC 6298 (5.4) only one of the earliest not acknowledged segment should be retransmitted

Improvement of this PR (#5589) , your test will most likely be in this case, if there is a problem with fast retransmission, then RTO will guarantee that all segments are retransmitted.

a-lunev force-pushed the tcp_send_buffered_rexmit_fix branch from 569f1e8 to 66b7e61 Compare February 22, 2022 22:20

xiaoxiang781216 requested a review from anchao February 23, 2022 03:21

acassis approved these changes Feb 23, 2022

View reviewed changes

pkarashchenko reviewed Feb 23, 2022

View reviewed changes

net/tcp/tcp_send_buffered.c Outdated Show resolved Hide resolved

net/tcp/tcp_send_buffered.c Outdated Show resolved Hide resolved

pkarashchenko reviewed Feb 23, 2022

View reviewed changes

net/tcp/tcp_send_buffered.c Outdated Show resolved Hide resolved

net/tcp(buffered): retransmit only one the earliest not acknowledged …

02c2b60

…segment (according to RFC 6298 (5.4)). The issue is the same as it was in tcp_send_unbuffered.c and tcp_sendfile.c.

a-lunev force-pushed the tcp_send_buffered_rexmit_fix branch from 66b7e61 to 02c2b60 Compare February 23, 2022 17:30

anchao reviewed Feb 23, 2022

View reviewed changes

anchao mentioned this pull request Jun 16, 2022

net/tcp(buffered): retransmit only one the earliest not acknowledged segment #6451

Merged

a-lunev mentioned this pull request Jul 4, 2022

net/tcp: fix regression of invalid update the rexmit_seq in buffer mode #6562

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

net/tcp(buffered): retransmit only one the earliest not acknowledged segment #5589

net/tcp(buffered): retransmit only one the earliest not acknowledged segment #5589

a-lunev commented Feb 22, 2022 •

edited

Loading

anchao Feb 23, 2022

a-lunev Feb 23, 2022

anchao Feb 23, 2022

a-lunev Feb 23, 2022 •

edited

Loading

anchao Feb 23, 2022

a-lunev Feb 23, 2022 •

edited

Loading

anchao Feb 23, 2022

anchao Feb 23, 2022

a-lunev Feb 23, 2022

anchao Feb 23, 2022

a-lunev Feb 23, 2022 •

edited

Loading

anchao Feb 23, 2022

a-lunev Feb 23, 2022

a-lunev Feb 24, 2022

a-lunev Feb 24, 2022

a-lunev Feb 24, 2022 •

edited

Loading

anchao Feb 24, 2022

anchao Feb 23, 2022

a-lunev Feb 23, 2022

anchao Feb 23, 2022

a-lunev Feb 23, 2022

anchao Feb 23, 2022

a-lunev Feb 23, 2022

anchao Feb 24, 2022

net/tcp(buffered): retransmit only one the earliest not acknowledged segment #5589

Are you sure you want to change the base?

net/tcp(buffered): retransmit only one the earliest not acknowledged segment #5589

Conversation

a-lunev commented Feb 22, 2022 • edited Loading

Summary

Impact

Testing

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

a-lunev Feb 23, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

a-lunev Feb 23, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

a-lunev Feb 23, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

a-lunev Feb 24, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

a-lunev commented Feb 22, 2022 •

edited

Loading

a-lunev Feb 23, 2022 •

edited

Loading

a-lunev Feb 23, 2022 •

edited

Loading

a-lunev Feb 23, 2022 •

edited

Loading

a-lunev Feb 24, 2022 •

edited

Loading