"Connected state : timeout" and 107/125 errors delay 4-10 minutes if client and server are on different machines. #4

Closed
RobertoMalatesta opened this issue Oct 22, 2015 · 5 comments

Comments

@RobertoMalatesta

UDT Version : latest from GIT
OS: Linux Kubuntu 14.04 LTS
GCC: gcc version 4.9.2 (Ubuntu 4.9.2-0ubuntu1~14.04)

Connected state timeouts are generated only when the udt_client(s) and udt_server run on the same machine.
If udt_client is on another machine, data transfer works, but if the client or server is shut down (^C),
the expected timeout/connection-closed messages are delayed by up to 10 minutes:

e.g.:
[2015-10-22 14:01:28.180869] [0x00007f2436342700] [trace] Accepted
^C is pressed on the client machine, and the receive error appears about 6 minutes later:
[2015-10-22 14:07:14.769052] [0x00007f2436342700] [trace] Connected state : timeout
[2015-10-22 14:07:14.769194] [0x00007f2436342700] [trace] Error on receive ec : 125 operation canceled

--R

@RobertoMalatesta changed the title on Oct 22, 2015
from: "Connected state : timeout" and 107/125 errors not present if client and server are on different machines.
to: "Connected state : timeout" and 107/125 errors delay 4-10 minutes if client and server are on different machines.

@securesocketfunneling
Owner

Indeed, the delay seems unusually long.
We will run some tests on our side to see where the problem could be.

Thanks for the feedback.
We will keep you informed.

@securesocketfunneling
Owner

The timeout is based on the estimation of the RTT.

The RTT estimate depends heavily on the sending period, and the current timer implementation (the standard Boost.Asio timer) is not as precise as we need (see issue #3).
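
To illustrate why timer jitter matters for the timeout, here is a minimal sketch that assumes the classic exponentially weighted RTT smoothing (weights 1/8 and 1/4, as in TCP's RFC 6298). The names `RttEstimator`, `add_sample` and `expiry_period`, the initial values and the 10 ms floor are hypothetical and are not taken from this project's code.

```cpp
#include <chrono>

// Hypothetical sketch, not this project's implementation.
struct RttEstimator {
    using usec = std::chrono::microseconds;

    usec rtt{100000};     // smoothed RTT; 100 ms starting value is an assumption
    usec rtt_var{50000};  // smoothed RTT deviation; starting value is an assumption

    // Fold one measured RTT sample into the smoothed values.
    // The deviation is updated first, against the previous smoothed RTT.
    void add_sample(usec sample) {
        const usec diff = sample > rtt ? sample - rtt : rtt - sample;
        rtt_var = (rtt_var * 3 + diff) / 4;
        rtt = (rtt * 7 + sample) / 8;
    }

    // Timeout period derived from the estimate, never below `floor`.
    usec expiry_period(usec floor = usec{10000}) const {
        const usec period = rtt + 4 * rtt_var;
        return period < floor ? floor : period;
    }
};
```

With these assumed weights, a single sample that arrives late because a timer fired late raises both the smoothed RTT and its deviation, so the derived timeout grows much faster than the RTT itself.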

A workaround for this issue is to implement a custom timer for sending data.
The branch feature/custom-timer includes such an implementation, which should improve the overall behaviour.

Would you mind compiling and testing the executables from this branch and giving us some feedback?

@RobertoMalatesta
Author

Negative.
Problem is still there.
[2015-10-26 18:39:21.188359] [0x00007f2248981700] [trace] Accepted
[2015-10-26 18:41:39.542833] [0x00007f2248981700] [trace] Connected state : timeout
[2015-10-26 18:41:39.542975] [0x00007f2248981700] [trace] Error on receive ec : 125 operation canceled

[2015-10-26 18:42:40.031357] [0x00007f224797f700] [trace] Accepted
[2015-10-26 18:50:38.625228] [0x00007f2248981700] [trace] Connected state : timeout
[2015-10-26 18:50:38.625372] [0x00007f2248981700] [trace] Error on receive ec : 125 operation canceled

Same problem is visible if client and server are on the same machine: each time one breaks the other detects the fault after about 20 seconds.

One other thing (maybe it's me not reading the specs): if I shut down the server and then restart it, shouldn't the connection silently resume instead of having to restart the client?

HTH

--R

@securesocketfunneling
Owner

> Negative.
> Problem is still there.

Thanks for testing, and sorry that this patch did not fix your issue.

> Same problem is visible if client and server are on the same machine: each time one breaks the other detects the fault after about 20 seconds.

The timeout should not fire in under 16 seconds, so 20 seconds seems reasonable.
For the remote case, we need more data. Logging can be enabled on the client or the server by changing a template parameter of the protocol:

using udt_protocol = ip::udt<connected_protocol::logger::FileLog<1000>>;

This logs the internal protocol variables every second to a file named session_log*.log; the result can be parsed with a Python script available in the tools directory.
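
For readers unfamiliar with switching behaviour through a template parameter, here is a minimal sketch of that general policy pattern. It assumes nothing about this project's internals: `NoLog`, `MemoryLog`, `SketchProtocol`, `record` and `on_tick` are hypothetical names, and reading the numeric parameter as an interval in milliseconds is only a guess based on the once-per-second behaviour described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical policy that disables logging entirely.
struct NoLog {
    static constexpr bool enabled = false;
    void record(const std::string&) {}
};

// Hypothetical policy that keeps log lines in memory.
// IntervalMs is assumed to be a sampling interval in milliseconds.
template <std::size_t IntervalMs>
struct MemoryLog {
    static constexpr bool enabled = true;
    static constexpr std::size_t interval_ms = IntervalMs;
    std::vector<std::string> lines;
    void record(const std::string& line) { lines.push_back(line); }
};

// The protocol type takes the logging policy as a template parameter,
// so logging is chosen at compile time and costs nothing when disabled.
template <class Logger = NoLog>
struct SketchProtocol {
    Logger logger;

    // Called periodically with the current RTT estimate.
    void on_tick(double rtt_ms) {
        if (Logger::enabled) {
            logger.record("rtt_ms=" + std::to_string(rtt_ms));
        }
    }
};
```

Under these assumptions, `SketchProtocol<MemoryLog<1000>>` records one line per tick, while `SketchProtocol<>` records nothing.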

Would you mind sending us the session_log files for analysis?

> One other thing (maybe it's me not reading the specs): if I shut down the server and then restart it, shouldn't the connection silently resume instead of having to restart the client?

Well, stopping the server closes and resets all sessions bound to it. This reconnection behaviour should reside in the application layer and not in the protocol layer. A UDT socket is like a TCP socket: if you close the socket, the channel is down on both ends and you need to initiate a new connection to transfer data.
On the other hand, if this is a network issue (network down or congestion), the connection should resume automatically (as the session should still be alive until the timeout).
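
As an illustration of reconnection handled in the application layer, here is a minimal sketch of a retry loop with capped exponential backoff. The helper `connect_with_retry` and its parameters are hypothetical. `try_connect` stands for whatever the application does to open a fresh socket and connect it, and is assumed to return true on success.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper, not part of this project's API.
// Calls try_connect until it succeeds or max_attempts is reached,
// sleeping between attempts with a delay that doubles up to max_delay.
inline bool connect_with_retry(const std::function<bool()>& try_connect,
                               int max_attempts,
                               std::chrono::milliseconds initial_delay,
                               std::chrono::milliseconds max_delay) {
    auto delay = initial_delay;
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (try_connect()) {
            return true;  // connected; the caller restarts its transfer
        }
        if (attempt == max_attempts) {
            break;  // no sleep after the final failed attempt
        }
        std::this_thread::sleep_for(delay);
        delay = std::min(delay * 2, max_delay);
    }
    return false;
}
```

A client could call this after it sees the receive error, passing a lambda that creates a new socket and connects it to the server. The old socket is not reused, in line with the TCP-like semantics described above.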

@RobertoMalatesta
Author

Closing this issue since I have not been following it for a long time.

--R
