Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows: Fix an issue where stopping the service immediately after startup could leave the processes #4782

Merged
merged 3 commits into from
Jan 28, 2025

Conversation

daipom
Copy link
Contributor

@daipom daipom commented Jan 27, 2025

Which issue(s) this PR fixes:

What this PR does / why we need it:
Add retry for stop event for Windows Service to fix #3937.

If Event.open() (OpenEvent) is called before the Event.new() (CreateEvent), Event.open() raises Errno::ENOENT.
This causes the service to be stopped while the supervisor and worker process remains.
It causes #3937.
This PR fixes it by adding retry.

ev = Win32::Event.open(event_name)

{win32_event: Win32::Event.new("#{@signame}"), action: :stop},

Docs Changes:
Not needed.

Release Note:
It would be good to have both of the following.

  • Windows: Fixed an issue where stopping the service immediately after startup could leave the processes.
  • Windows: Fixed an issue where stopping service sometimes can not be completed forever.

Signed-off-by: Daijiro Fukuda <[email protected]>
@daipom
Copy link
Contributor Author

daipom commented Jan 28, 2025

Test

  • fluent-package v5.0.2
  • Windows 10 Home

Before

The supervisor and worker processes remain if stopping the service immediately after startup.

Powershell

$ Start-Service fluentdwinsvc; Stop-Service fluentdwinsvc
$ (ps -Name ruby).length
2

After

The supervisor and worker processes stop correctly if stopping the service immediately after startup.
I confirmed there are no anomalies in Fluentd logs.

Powershell

$ Start-Service fluentdwinsvc; Stop-Service fluentdwinsvc
$ (ps -Name ruby).length
0

@daipom
Copy link
Contributor Author

daipom commented Jan 28, 2025

Test: If max_retries(30) is exceeded, give up stopping the supervisor and stop the service.

Edit the code so that it takes 30 seconds to create the Event.

--- a/lib/fluent/supervisor.rb
+++ b/lib/fluent/supervisor.rb
@@ -59,6 +59,7 @@ module Fluent
       end

       if Fluent.windows?
+        sleep 30
         install_windows_event_handler
       else
         install_supervisor_signal_handlers

Powershell

$ Start-Service fluentdwinsvc; Stop-Service fluentdwinsvc
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
警告: サービス 'Fluentd Windows Service (fluentdwinsvc)' の停止を待っています...
$ (ps -Name ruby).length
2
$ Get-Service fluentdwinsvc

Status   Name               DisplayName
------   ----               -----------
Stopped  fluentdwinsvc      Fluentd Windows Service

@daipom daipom marked this pull request as ready for review January 28, 2025 02:38
@daipom daipom requested a review from kenhys January 28, 2025 02:39
@daipom daipom added this to the v1.19.0 milestone Jan 28, 2025
@daipom daipom added the backport to LTS We will backport this fix to the LTS branch label Jan 28, 2025
ashie
ashie previously approved these changes Jan 28, 2025
Copy link
Member

@ashie ashie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@kenhys kenhys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix itself seems LGTM.
There is a nitpick about naming.

daipom and others added 2 commits January 28, 2025 13:15
Signed-off-by: Daijiro Fukuda <[email protected]>

Co-authored-by: Kentaro Hayashi <[email protected]>
Signed-off-by: Daijiro Fukuda <[email protected]>
Copy link
Contributor

@kenhys kenhys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@kenhys kenhys merged commit 569c358 into fluent:master Jan 28, 2025
10 checks passed
@daipom daipom deleted the winsvc-retry-stop-event branch January 28, 2025 05:54
@daipom
Copy link
Contributor Author

daipom commented Jan 28, 2025

Thanks for your review!

kenhys added a commit that referenced this pull request Jan 29, 2025
…artup could leave the processes (#4782)

**Which issue(s) this PR fixes**:
* Fixes #3937

**What this PR does / why we need it**:
Add retry for stop event for Windows Service to fix #3937.

If `Event.open()`
([OpenEvent](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-openeventw))
is called before the `Event.new()`
([CreateEvent](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-createeventw)),
`Event.open()` raises `Errno::ENOENT`.
This causes the service to be stopped while the supervisor and worker
process remains.
It causes #3937.
This PR fixes it by adding retry.

https://github.com/fluent/fluentd/blob/30c3ce00ff165b1b5d9f53fc0a027074bbcab0da/lib/fluent/winsvc.rb#L90

https://github.com/fluent/fluentd/blob/30c3ce00ff165b1b5d9f53fc0a027074bbcab0da/lib/fluent/supervisor.rb#L299

**Docs Changes**:
Not needed.

**Release Note**:
It would be good to have both of the following.

* Windows: Fixed an issue where stopping the service immediately after
startup could leave the processes.
* Windows: Fixed an issue where stopping service sometimes can not be
completed forever.

---------

Signed-off-by: Daijiro Fukuda <[email protected]>
Co-authored-by: Kentaro Hayashi <[email protected]>
Signed-off-by: Kentaro Hayashi <[email protected]>
daipom added a commit that referenced this pull request Jan 29, 2025
…diately after startup could leave the processes (#4782) (#4802)

**Which issue(s) this PR fixes**:

Backport #4782 

**What this PR does / why we need it**:

Add retry for stop event for Windows Service to fix #3937.

If `Event.open()`

([OpenEvent](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-openeventw))
is called before the `Event.new()`

([CreateEvent](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-createeventw)),
`Event.open()` raises `Errno::ENOENT`.
This causes the service to be stopped while the supervisor and worker
process remains.
It causes #3937.
This PR fixes it by adding retry.


https://github.com/fluent/fluentd/blob/30c3ce00ff165b1b5d9f53fc0a027074bbcab0da/lib/fluent/winsvc.rb#L90


https://github.com/fluent/fluentd/blob/30c3ce00ff165b1b5d9f53fc0a027074bbcab0da/lib/fluent/supervisor.rb#L299

**Docs Changes**:
Not needed.

**Release Note**:
It would be good to have both of the following.

* Windows: Fixed an issue where stopping the service immediately after
startup could leave the processes.
* Windows: Fixed an issue where stopping service sometimes can not be
completed forever.

Signed-off-by: Daijiro Fukuda <[email protected]>
Signed-off-by: Kentaro Hayashi <[email protected]>
Co-authored-by: Daijiro Fukuda <[email protected]>
@daipom daipom mentioned this pull request Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport to LTS We will backport this fix to the LTS branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Windows Service: Stopping service sometimes can not be completed forever
3 participants