-
Notifications
You must be signed in to change notification settings - Fork 375
Remove sentinel context where we log #18870
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Relevant comment introductions: - matrix-org/synapse#5609 - 067b00d
``` 2025-08-26 18:40:27,996 - my.synapse.linux.server - synapse.app.homeserver - 187 - WARNING - main - Starting daemon. 2025-08-26 18:40:27,996 - my.synapse.linux.server - synapse.app.homeserver - 181 - WARNING - atexit - Stopping daemon. ```
Running a normal server (`daemonize: false`): ``` poetry run synapse_homeserver --config-path homeserver.yaml ``` Bad logs being seen: ``` PreserveLoggingContext: Expected logging context sentinel but found main ``` ``` LoggingContext: Expected logging context main was lost ```
This reverts commit fce59b2.
Resulting in bad logs being seen: ``` PreserveLoggingContext: Expected logging context sentinel but found main ``` ``` LoggingContext: Expected logging context main was lost ```
# make sure that we run the reactor with the sentinel log context, | ||
# otherwise other PreserveLoggingContext instances will get confused | ||
# and complain when they see the logcontext arbitrarily swapping | ||
# between the sentinel and `run` logcontexts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was added in 067b00d#diff-6e21d6a61b2f6b6f2d4ce961991ba7f27e83605f927eaa4b19d2c46a975a96c1R460-R463 (matrix-org/synapse#2027)
I can't tell exactly what's it's referring to but the one spot I was seeing Expected logging context sentinel but found main
around run_as_background_process(...)
usage has been fixed (#18870 (comment)).
Reproduction:
poetry run synapse_homeserver --config-path homeserver.yaml
- Ctrl + C to stop the server
- Notice
LoggingContext: Expected logging context main was lost
in the logs
# We also need to drop the logcontext before forking if we're daemonizing, | ||
# otherwise the cputime metrics get confused about the per-thread resource usage | ||
# appearing to go backwards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was added in matrix-org/synapse#5609 and is accurately describing a real problem.
But we can be way more precise about what do to here. We only need to stop
the current log context right before forking the process and start
it again right after. This can be accomplished with the PreserveLoggingContext
being moved strictly around the os.fork()
call (search for os.fork()
below).
Previously, this caused the sentinel
context to be used for a whole bunch of logs (basically anything outside of a request). We now get the main
context used 💪
# Note that we return a Deferred here so that it can be used in a | ||
# looping_call and other places that expect a Deferred. | ||
return defer.ensureDeferred(run()) | ||
return make_deferred_yieldable(defer.ensureDeferred(run())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As self-described in the doc comment here, previously, this didn't follow the Synapse log context rules which caused a whole bunch of downstream havoc. Previously, this seemed to be masked because we had everything run in the sentinel
context.
By downstream havoc, I mean Expected logging context sentinel but found main
and Expected logging context main was lost
type warnings which means we're leaking and losing contexts which are being erroneously used by the next thing the reactor does.
Even though it was called out explicitly for not following the Synapse log context rules, I don't think that has any specific benefit and makes sense to update it in any case.
Reproduction:
poetry run synapse_homeserver --config-path homeserver.yaml
- Ctrl + C to stop the server
- Notice
LoggingContext: Expected logging context main was lost
in the logs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This distinction actually seems to be important because (TODO)
It makes things get stuck:
poetry run synapse_homeserver --config-path homeserver.yaml
- This works fine:
curl http://localhost:8008/_matrix/client/versions
- But this endpoint just never responds for example:
curl 'http://localhost:8008/_matrix/client/v3/rooms/!FlgJYGQKAIvAscfBhq:my.synapse.linux.server/state/m.room.create/' -H 'Authorization: Bearer syt_bG9jYWwtZ3V5_mfWAFyxPnCnTRQjLiAyx_11rr8F'
Reproduction instructions: 1. `poetry run synapse_homeserver --config-path homeserver.yaml` 1. `curl http://localhost:8008/_matrix/client/versions` 1. Stop Synapse (`Ctrl + c`) Notice the bad log: ``` synapse.logging.context - WARNING - sentinel - LoggingContext: Expected logging context main was lost ```
…text rules" This reverts commit 675d94a. Things get stuck with this, see #18870 (comment)
Reproduction: 1. `poetry run synapse_homeserver --config-path homeserver.yaml` 1. Ctrl + C to stop the server 1. Notice `LoggingContext: Expected logging context main was lost` in the logs
# We also need to drop the logcontext before forking if we're daemonizing, | ||
# otherwise the cputime metrics get confused about the per-thread resource usage | ||
# appearing to go backwards. | ||
with PreserveLoggingContext(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to a quick pairing session with @erikjohnston, the key insight was that removing PreserveLoggingContext
here was causing problems down the line 😇
I removed PreserveLoggingContext
here in order to maintain the main
logging context further into Synapse starting up and running. But this also meant leaking the main
logging context into the Twisted reactor event loop which causes it to be applied to the next task. This appeared like the desired result because the next task was start
but also meant it got messed up later down the line (I assume this is what was happening).
I was on the right line of thinking by removing PreserveLoggingContext
here since a) we do have some logs here that should have the correct context and b) has some bad interactions with daemonize_process(...)
as I was finding. And while we don't need PreserveLoggingContext
here specifically, instead, we only need it around the run_command()
which is reactor.run()
by default. See c1996b6
To get the main
context further into the process, we probably need to re-establish the logging context where we register_start
/start
. Still need to explore this thread ⏩
…ement-hq/synapse into madlittlemods/remove-sentinel-context
Remove
sentinel
context where we logPrerequisite for #18868
We shouldn't ever be logging something from the server with the
sentinel
logging context. We want to know which server generated the logs (c.f. #18868).Logging with the
sentinel
context was previously used as a crutch and get out of jail free card in some tricky scenarios. This PR aims to fix all of the cases where we see thesentinel
context being used for a log. Thesentinel
context is now only used to ensure we're not leaking contexts when yielding control back to the reactor which would erroneously be picked up by other requests and tasks going on in the reactor.Testing strategy
daemonize: true
:poetry run synapse_homeserver --config-path homeserver.yaml
Expected logging context sentinel but found main
Expected logging context main was lost
Expected previous context
utime went backwards!
/stime went backwards!
Called stop on logcontext POST-0 without recording a start rusage
sentinel
contextDev notes
Whenever we yield to the Twisted reactor (event loop), we need to set the
sentinel
log context so log contexts don't leak and apply to the next task.Synapse log context docs:
docs/log_contexts.md
This is part of tracing though:
Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.