-
Notifications
You must be signed in to change notification settings - Fork 6.3k
8368625: com/sun/net/httpserver/ServerStopTerminationTest.java fails intermittently in tier7 #27670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…ils intermittently in tier7
👋 Welcome back myankelevich! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
@myankelev The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
The extending of the stop duration may not have the desired effect of eliminating what appears to be a race condition in the test One of the recorded failures is for the temporal condition
This is a somewhat dubious constraint and can’t always be met The test assumes that the participating threads are all actively executing simultaneously, which may not be true Restating the objective of the test Potential race scenario: Server started creates exchange which waits for a signal to complete Exchange completion thread starts waits for 1 seconds before it can signal exchange to compete — the complete signal will be thus invoked sometime after 1 second depending on OS thread scheduling Main thread continues and invokes a stop which has a max wait time of 5 seconds for extant exchanges to complete — so max completion time of stop is 5 seconds ++ i.e. could be slightly longer than 5 seconds again subject to OS scheduling Temporal condition imposed by test
Second condition is that the elapsed time of the stop should be less than or equal to the stop delay. BUT if the full timeout for the server.stop will wait a max of N (5) seconds before terminating as per
Then this infers that the extant exchanges have taken longer than the expected or allowed time. This in turn infers that All in all the temporal conditions are not exact and they are subject to variability depending on the scheduling of threads by the OS. BUT extending of the delayDuration doesn’t necessarily impact the first condition, which has failed, because the elapsed time may be less than the exchange duration due to OS scheduling |
I believe Mark is right - there is an issue with the timing logic in this method. The virtual thread that sleeps before calling complete.countDown() may finish sleeping and call countDown() before server.stop() is called. Good analysis @msheppar ! |
|
||
// Complete the exchange one second into the future | ||
final Duration exchangeDuration = Duration.ofSeconds(1); | ||
final long startTime = System.nanoTime(); // taking custom start time just in case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final long startTime = System.nanoTime(); // taking custom start time just in case | |
// taking start time before entering completeExchange to account for possible | |
// delays in reaching server.stop(). | |
final long startTime = System.nanoTime(); |
|
||
// Complete the exchange 10 second into the future. | ||
// Runs in parallel, so won't block the server stop | ||
final Duration exchangeDuration = Duration.ofSeconds(Utils.adjustTimeout(10)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but if this test fail we could envisage bumping that delay though. A better implementation could be to complete the exchange after exiting from server.stop() - without using any virtual thread, and just verifying that the server.stop() waited at least for 1s.
I believe that increasing the timeout might help, as it seems to be happening due to the machine load. I'm going to make a pr increasing the timeout to 20 from 5 (similar to what it was when timeout factor was 4).
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27670/head:pull/27670
$ git checkout pull/27670
Update a local copy of the PR:
$ git checkout pull/27670
$ git pull https://git.openjdk.org/jdk.git pull/27670/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27670
View PR using the GUI difftool:
$ git pr show -t 27670
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27670.diff
Using Webrev
Link to Webrev Comment