-
Notifications
You must be signed in to change notification settings - Fork 28
Increased the connect and read timeouts. #1286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Increased the connect and read timeouts. #1286
Conversation
BenjaminPelletier
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These settings can have their default values overridden by a particular test configuration using a QueryBehaviorResource (just define an instance of that resource in the top-level pool and it will mutate the global settings for the duration of the test run). Making these timeouts longer often makes tests take longer so we ideally want the default values as low as possible without having appreciable timeouts when the request was legitimately still processing and would have returned successfully given a bit more time. Could you provide examples of connections taking 3-10 seconds and reads taking 6-20 seconds?
|
This PR is proposing to change the default connect and read timeouts -- those are the durations of time allowed for all HTTP(S) connection attempts to connect to a remote host and to receive a complete response from the remote host. I'm not totally sure, but it sounds like the issue may be that injecting RID flight data takes longer than this for your particular USS because other workflows (flight auth) have to be completed since this is effectively a flight planning activity. If that interpretation is correct, the underlying problem is essentially that the RID injection API effectively makes injection a synchronous activity rather than asynchronous, and therefore the synchronous activity is subject to single-HTTP(S)-call timing limits. I think the best long-term fix to this underlying problem will be to use the flight_planning API to inject flight data (deprecating the RID injection API) and make the flight_planning API asynchronous. That is not expected in the near term, so we still also have to deal with the current situation in the near term. The best way to fix that underlying problem currently is to provide a QueryBehaviorResource in the test configuration that changes these values for just test runs using that test configuration (per comment above). (and, actually, only the read timeout should need to be adjusted since taking a while to complete the task shouldn't affect how long it takes to connect to the remote server) Providing that resource in a test configuration produces the same behavior as the changes requested in this PR for everyone using the test configuration in question. Making the changes requested in this PR, however, extends far beyond people using that test configuration -- it extends the timeouts for everyone using uss_qualifier for all purposes in all use cases for all calls to all servers. Choosing good timeout values is a tricky balance since values that are too short lead to spurious timeouts, but values that are too long lead to unacceptably-long test runs. Adjusting the timeouts via a QueryBehaviorResource means that these tradeoffs only need to be discussed amongst the people using the test configuration. Adjusting the default timeouts in the repo (as in this PR) means that these tradeoffs need to be discussed amongst all the people using uss_qualifier for all use cases and situations. InterUSS has already received feedback that our previous values were too long for many use cases, so it would be difficult to decide on a substantially longer default value for these timeouts (as requested in this PR). Is there a reason use of a QueryBehaviorResource wouldn't solve the problem? If so, is the above understanding correct? If correct, could you elaborate on why the connection timeout also needs to be longer (in addition to the read timeout)? If a QueryBehaviorResource wouldn't solve the problem and/or the above understanding isn't correct, could you provide an example sequence view artifact showing the false timeout to make sure we're on the same page regarding the problem? |

We have increase the connect and read timeouts to address out timeout issues.