Skip to content

8351357: Add canary system test checking if Stage receives focus on show #1804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

lukostyra
Copy link
Contributor

@lukostyra lukostyra commented May 13, 2025

Originally this issue was supposed to resolve problems with some system tests (MenuDoubleShortcutTest, TextAreaBehaviorTest and others) failing on my Windows machine. In the process of figuring this out I found out the problem is Windows ::SetForegroundWindow() API refusing to give focus to JFX Stage upon calling Stage.show().

The problem happened only when running system tests via Gradle, and with more investigation it turned out the culprit is actually running tests via a Gradle Daemon, which is the default behavior. According to SetForegroundWindow API remarks there is a list of conditions a process must meet to be granted a privilege of receiving focus, which is supposed to prevent focus stealing. While we do meet the required conditions, we don't meet "one of" additional conditions listed in the reference:

  • Gradle daemon is a background process, so tests started by it do not meet The calling process is the foreground process. and The calling process was started by the foreground process. conditions
  • We most probably run the tests from the terminal, so There is currently no foreground window, and thus no foreground process. condition fails - the foreground window is the Terminal itself.
  • Each test has fresh-started JFX stage so The calling process received the last input event. condition cannot be met and would require either Robot workarounds or manual interaction before each test case.
  • There is no debugger involved in the process (at least most of the time) so Either the foreground process or the calling process is being debugged. is also not met.

As such, Windows refuses to grant JFX Stage focus, which fails some system tests relying on it.

While we cannot remedy these conditions in-code (outside of hacky solutions I found with AttachThreadInput API which I am not a fan of) the only solution seems to be running the tests on Windows via either gradle --no-daemon or by setting org.gradle.daemon=false property somewhere in gradle.properties.

In the process of debugging this problem I wrote a canary test to detect whether a Stage receives focus right after calling show(). I ran this test on all (accessible to me) platforms (Windows, Linux, macOS) - on both Linux and macOS the test passes regardless of whether the Gradle deamon is used or not. On my Windows machine (Win 11 24H2) it fails when testing through Gradle Daemon (among tests mentioned in the first paragraph) and succeeds when --no-daemon flag is used.

I figured the test could be added to the repository and serve as a helper if someone else encounters similar problems.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8351357: Add canary system test checking if Stage receives focus on show (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1804/head:pull/1804
$ git checkout pull/1804

Update a local copy of the PR:
$ git checkout pull/1804
$ git pull https://git.openjdk.org/jfx.git pull/1804/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1804

View PR using the GUI difftool:
$ git pr show -t 1804

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1804.diff

Using Webrev

Link to Webrev Comment

This test is used as a canary to make sure Stage has focus after showing on
screen. Failing this test means other tests might also fail.
@bridgekeeper
Copy link

bridgekeeper bot commented May 13, 2025

👋 Welcome back lkostyra! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 13, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot added the rfr Ready for review label May 13, 2025
@mlbridge
Copy link

mlbridge bot commented May 13, 2025

@kevinrushforth kevinrushforth self-requested a review May 19, 2025 13:48
@lukostyra lukostyra changed the title 8351357: Resolve system test focus issues 8351357: Add canary system test checking if Stage receives focus on show May 23, 2025
@lukostyra
Copy link
Contributor Author

Renamed the issue title to better reflect what the change actually does

Copy link
Member

@kevinrushforth kevinrushforth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test looks good and runs as expected on my Windows 11 system. I left two related questions about the delivery of the event. If the key press event might not be delivered synchronously from the robot key press call, then the test could fail spuriously. You might consider a couple changes to make the test more robust.

Util.runAndWait(() -> {
robot.keyPress(KeyCode.A);
});
assertTrue(receivedEvent, "Expected key press has NOT been received! Stage did not have focus after showing. Some tests might fail because of this." +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to sleep before the assertion to ensure that the event has been delivered?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we ought to have a utility (latch?) to ensure the sequence?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A delay of some sort is necessary to give the event loop time to process the event.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I think Andy was suggesting a latch (with a timeout) that would be triggered (counted down to 0) when the event was received. This would also remove the concern I raised about the flag not being volatile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With a latch I think we don't even need the boolean value (we can simply wait for latch to trigger) so I removed it

@Test
public void testStageHasFocusAfterShow() {
Util.sleep(250);
Util.runAndWait(() -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the stage isn't focused after the sleep the robot's key press might go to some other app entirely. You might want to add an assert after the sleep checking that stage.isFocused() returns true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this and it actually is a bug, isFocused() returns true in this case (when the Stage is not focused). We don't capture the failure of SetForegroundWindow(), so JFX assumes we are in focus. It doesn't necessarily change the outcome of the test, the key press check will then fail, but it would be good for the flag to also reflect that.

I'll add the assertion and file a separate issue for that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could paint the window in some unusual color and check the screen pixels to make sure the window is on top, in addition to being focused?

// check if window is on top
Util.runAndWait(() -> {
Color color = getColor(STAGE_SIZE / 2, STAGE_SIZE / 2);
assertColorEquals(Color.LIGHTGREEN, color, TOLERANCE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this implementation is a reliable test: the stage in question may be overlapped by another window somewhere in the corner, right?

What would be a reliable test? Strictly speaking, we must check every pixel in the scene, though I wonder if checking each pixel in a grid (maybe 20 x 20, since we don't expect any reasonable window to be less than that) should be enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most reliable way would be to probably scan the whole grid, I agree... I could change this - the stage is not that big after all. Maybe even turn the stage to undecorated to make this more reliable across platforms. I expect the very edge of the Stage to be not exactly the same color, but if we leave ex. 1px margin around the edges it should still be fine. I'll play around with this and update soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that too - but why wouldn't the client area, the scene itself, be of different color? The decorations and possible effects should be outside of the stage, shouldn't they?

Copy link
Contributor Author

@lukostyra lukostyra Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turning the scene to undecorated would just make the math easier, without having to count in the window decorations which almost certainly would be of different color.

As for the margin thing, I had to do a bit of history searching and I came back with this old PR: #1242

I think when writing the above my blurry memory recalled one of the attempts at solving this which involved just skipping the 1px margin of the Stage on HiDPI systems. I agree, it should be fine just going through the entire Stage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually a small correction, using VisualTestBase already assumes the Stage we get is undecorated. So no changes need to be made there.

Copy link
Contributor

@andy-goryachev-oracle andy-goryachev-oracle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Left some minor suggestions.

theStage.addEventHandler(WindowEvent.WINDOW_SHOWN, e -> {
Platform.runLater(() -> {
theStage.setX(STAGE_X);
theStage.setY(STAGE_Y);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting Stage (x,y) coordinates in runLater makes the window jump.
it's probably better to move LL98-99 before runLater, to L96

with the robot disabled, the test fails with expected

org.opentest4j.AssertionFailedError: Event received latch timed out! Stage most probably did not have focus after showing. Some tests might fail because of this. If that is the case, try re-running the tests with '--no-daemon' flag in Gradle. ==> expected: <true> but was: <false>
	at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)

@beldenfox
Copy link
Contributor

I'm having an issue with this test. I can't get it to fail on my Windows 11 box but I can get other tests to fail due to focus issues.

I'm working on a system keyboard test (see attached patch for a pared down version). On my Windows 11 box the focus does not go to the stage unless I invoke the Robot mouse-click workaround. If I don't use that workaround the key event is going to the terminal window instead of the stage.

In that same environment the StageFocusTest in this PR always succeeds. I've never seen it fail. I've run these tests from both a Cygwin terminal and a MSYS2 MINGW terminal using bash and sh and StageFocusTest always passes.

I also notice that even when my test case fails the window is still on top, it just doesn't have focus. So the color testing in this PR seems unnecessary (?) Are you seeing cases where the stage is below other windows when it shows up?

ShortcutKeyboardTest.patch

@lukostyra
Copy link
Contributor Author

I'm having an issue with this test. I can't get it to fail on my Windows 11 box but I can get other tests to fail due to focus issues.

I started to see this problem as well while working on JDK-8359899 (the Stage.isFocused discrepancy). I have a suspicion that switching to VisualTestBase caused some sort of difference - VisualTestBase creates new Stages separately and doesn't take into account the primary Stage that you would get via overriding Application.start(). I noticed your test does not use VisualTestBase so it might be it.

I have not looked into a solution yet outside of the obvious - reverting to not using VisualTestBase. I will get that figured out very soon with Andy's review suggestions.

I also notice that even when my test case fails the window is still on top, it just doesn't have focus. So the color testing in this PR seems unnecessary (?) Are you seeing cases where the stage is below other windows when it shows up?

I did have situations where the Stage would appear under currently active window, sometimes even under the terminal window that runs the test.

lukostyra added 2 commits July 3, 2025 16:02
This is to check if Stage showing works also for Stages other than the
primary Stage provided by Application.start()
@lukostyra
Copy link
Contributor Author

I did a couple of extra changes in addition to the regular fixes:

  • extend VisualTestBase was removed, it was indeed the cause of the test always passing. The "Windows forbids granting focus" scenario seems to only happen when we try and show the Stage provided by Application.start()
  • Tests are now parameterized and check both the primaryStage delivered by Application.start() as well as a separately created Stage. Figured it might be worth testing as well, since it seems to behave slightly differently and can affect other test.

I did notice this morning that on a fresh system start, the test can pass even with --daemon gradle flag. This happens when it is the first invocation of Gradle that run - then the used Gradle daemon instance is a child of current Terminal process (which is a foreground process) which makes Windows treat it as a foreground process as well and grant focus to its children. The most reliable way I found of reproducing this and getting the test to fail on my Windows 11 box (also has to be done after calling gradle --stop):

  • Run one build using gradle --daemon
  • Close the Terminal window - that will cut the daemon's association with the Terminal and make Windows treat it as a background process
  • Re-open the terminal and re-run the tests using gradle --daemon

For me the test then fails when checking the primary Stage delivered by Application.start() at the keystroke part of the test.

Copy link
Contributor

@beldenfox beldenfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine. I left a comment about providing a help message when the color sampling test fails.

What does it mean to integrate this? We would be adding a test designed to fail with the default testing environment. After failure a developer would have to re-run with the no daemon flag set. Shouldn't that become the default? If not, can we ensure this test runs early so it truly serves as a canary test? Should folks developing new tests assume that they don't have to add the usual workarounds like sending a mouse click to the window?

FWIW, adding the no daemon flag ensures that this test succeeds consistently with Cygwin. With the Mingw shell turning off the daemon increases the chance that it will succeed but doesn't guarantee it.

@lukostyra
Copy link
Contributor Author

What does it mean to integrate this? We would be adding a test designed to fail with the default testing environment. After failure a developer would have to re-run with the no daemon flag set. Shouldn't that become the default?

I agree that this way of running the tests should be the default but there is no way (as far as I know) to do this from the project level. The only path would be to add a gradle.properties file in the project root but we don't have one and it is ignored via gitignore. We only have a template which you need to copy-paste and rename to gradle.properties. As I understand it, different developers might need different ways of configuring gradle for their own needs, which should not be pushed to the repository or tracked by Git as meaningful changes.

According to Gradle manual other ways to disable the daemon are user-specific (--no-daemon flag, user-wide or system-wide property).

If not, can we ensure this test runs early so it truly serves as a canary test?

I am not sure if that is possible, the only thing coming to my mind is executing that test specifically via build.gradle before running the whole suite. I'd be happy to get suggestions if anyone else knows something (ex. from JUnit) that could help it.

Should folks developing new tests assume that they don't have to add the usual workarounds like sending a mouse click to the window?

Some tests did that, some tests did not. This whole ordeal started because MenuDoubleShortcutTest seemingly crashed mid-execution, later I found out the problem is Windows' focus stealing policy when you run the tests - test window was not in focus, the terminal was, so when the test program sent Ctrl+C for test purposes it was delivered to the terminal with obvious results. Before that was the case (and judging by how native calls like _setVisible behave) the assumption was that showing the Stage means it gets focus.

I have a patch queued up after this gets introduced which fixes the problem we talked about earlier with Window.isFocused() API being incorrect. Ideally I would hope that tests requiring sending some inputs to the Stage will call that function and fail early (like some other tests already do).

@beldenfox
Copy link
Contributor

Some tests did that, some tests did not. This whole ordeal started because MenuDoubleShortcutTest seemingly crashed mid-execution […]

Ack! I wrote that test! And I should know better. I'll make sure it gets focus and stops using Ctrl+C as a test case.

I'm sure I ran MenuDoubleShortcutTest on Windows before submitting it. I just tried it again and it worked fine. Once. The second time it failed. I wish we could find a formula which guarantees these tests always work OR that they always fail.

@lukostyra
Copy link
Contributor Author

I'm sure I ran MenuDoubleShortcutTest on Windows before submitting it.

This also started being a problem at some later point in time; on my machine tests used to do a full run with no failures with the Gradle daemon being enabled. I can't unfortunately pinpoint when the problems started happening, I only found out after the fact that SetForegroundWindow() enforces conditions mentioned in the PR description.

AFAIK --no-daemon consistently helps (I use MinGW on my Windows box). @kevinrushforth also mentioned offline that he disables the Gradle daemon in his machine-wide gradle.properties so he never encountered these problems.

I also submitted a (for now draft) PR for the isFocused() API fix - #1849. It is independent from this test but the test was helpful with checking the behavior while writing the test. Feedback is welcome!

@kevinrushforth
Copy link
Member

Yes, I always disable the gradle daemon in my $HOME/.gradle/gradle.properties. Our CI builds and test runs also do this (our Jenkins scripts pass --no-daemon to all invocations of gradle). We've had way too many problems over the years, and this is just another example.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfr Ready for review
Development

Successfully merging this pull request may close these issues.

4 participants