-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not close FDs 0, 1, or 2 #186
base: main
Are you sure you want to change the base?
Conversation
29a0392
to
e123ed7
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #186 +/- ##
==========================================
+ Coverage 78.84% 78.96% +0.11%
==========================================
Files 55 56 +1
Lines 10146 10183 +37
==========================================
+ Hits 8000 8041 +41
+ Misses 2146 2142 -4 ☔ View full report in Codecov by Sentry. |
Codecov appears to not be testing what happens in the child process after fork() and the error path of “cannot open |
AFAIR unit tests do not cover the PAM handling part, as they are not running as a system service, test runners don't have necessary PAM configuration etc. |
OpenQA test summaryComplete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025021202-4.3&flavor=pull-requests Test run included the following:
New failures, excluding unstableCompared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025020404-4.3&flavor=update
Failed tests51 failures
Fixed failuresCompared to: https://openqa.qubes-os.org/tests/127852#dependencies 30 fixed
Unstable tests52 tests
|
This is the only qrexec PR in this test run, so the above failures seems to be regression caused by this change. |
e123ed7
to
c7a7826
Compare
libqrexec/libqrexec-utils.h
Outdated
@@ -162,6 +162,11 @@ void buffer_append(struct buffer *b, const char *data, int len); | |||
void buffer_remove(struct buffer *b, int len); | |||
int buffer_len(struct buffer *b); | |||
void *buffer_data(struct buffer *b); | |||
/* Open /dev/null and keep it from being closed before the exec func is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it simpler (and safer) to simply open /dev/null just before dup-ing it over 0,1,2 (in the child process already)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s definitely simpler, but I didn’t want the extra call to open()
. That said, giving each child process the same open file description is less than great, as it introduces shared state that should not be there. I don’t know if /dev/null
has any such state, but still it isn’t great, so I went ahead and switched to your approach.
But also, I question usefulness of this PR as a whole - the closing of standard FDs happens in a process that has a single purpose - wait for the child process and then exit, in the very same function as closing happens. There are few PAM cleanup calls, but it's very unlikely for them to be a problem (especially, it isn't a problem now, or for the last 10 or so years). |
Whether or not the last commit in the PR is merged, I definitely think the other commits should be merged. In particular, it turned out that the “close the FD” functionality had no unit tests because the unit tests took a codepath that was too different from the production code. This PR makes the production and test code follow the same path, with the result that the actual bug (
PAM cleanup calls into PAM modules, so it can do anything. I suspect Qubes OS only gets away with it because we have a fairly simple PAM stack by default. PAM cleanup is used for e.g. unmounting filesystems and closing encrypted volumes. The best approach would be for PAM to run with stdin pointed at |
Indeed I was talking about the last commit (which until the last force-push was the only commit in this PR).
Aren't PAM modules expected handle proper logging themselves? I don't think they are supposed to touch calling process's stdin/out/err in any case. And if they would do, that likely would interfere also with cases where they aren't closed (and then replaced with with unrelated thing) - for example it could interfere with an application log file on stderr that is expected in a specific format (different than PAM messages). |
As for the other commits, won't that have some non-trivial conflicts with #141 (which I hope is quite close to merge-able state)? |
I can include them in #141 or rebase this PR on top of it. I can also close this PR if you prefer, but I’d prefer that at least the bug fixes and testability changes go in. |
c7a7826
to
4b941e6
Compare
PipelineRetry |
One of the tests fails, I'll retry just in case, but it might be real bug |
Bug is real, but this code is just the messenger: #141 uses a shell script to run the command in the no-fork-server case but not otherwise. However, The proper solution is to write an extremely trivial C program ( |
24ef8b1
to
141160d
Compare
Previously 'qrexec-agent --fork-server-socket' (no argument) would segfault.
This is the convention used by the rest of qrexec. This commit should be backported to stable branches.
These should never happen, but call exit() if they do.
Saves an (admittedly cheap) system call.
No functional change intended.
This will be used by tests later. No functional change intended.
This will be used by tests later. No functional change intended.
This also fixes a bug: basename can mutate its argument, so a copy must be passed to it.
141160d
to
fa8b528
Compare
The reason for using shell there was to load relevant environment variables (wrong PATH is what we noticed, but surely it applies to many others too, including various XDG_, DBUS_, locale etc). So, just dropping the shell will re-introduce the issue. |
Ah, I see what you mean. Ok, I guess that should work. |
char **env, const char *cmd, pid_t agent_pid) | ||
{ | ||
/* 1 means init, which is completely wrong. */ | ||
if (agent_pid < 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it wrong? I can imagine legitimate case of using qrexec as init (maybe some version of a stubdomain, or some other very specialized domain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Qrexec forks a worker process for each connection, so the process ID to signal (to get stdout and stdin on the same stream) is never 1. If getppid()
returns 1 it means the qrexec agent died and the best thing to do is to exit.
agent/qrexec-agent.c
Outdated
_exit(QREXEC_EXIT_PROBLEM); | ||
} | ||
const char *buf[] = {"sh", "-lc", "exec \"$@\"", "sh", exec_program, pid_str, prog, cmd, service_path, NULL}; | ||
/* TODO: use the user's shell not /bin/sh */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it may sounds like a smart move, it's also problematic, as the script might need to be adjusted for such shell. AFAIR "user shell" doesn't even need to be POSIX compliant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack.
black complains |
This makes the unit test code more like the actual code used by end-users, and therefore makes the tests more accurate. This trips a bug in the code which will be fixed later, requiring a test to be changed to compensate.
If they are closed, another file descriptor could be created with these numbers, and so standard library functions that use them might write to an unwanted place. dup2() a file descriptor to /dev/null over them instead. Also statically initialize trigger_fd to -1, which is the conventional value for an invalid file descriptor. This requires care to avoid closing the file descriptor to /dev/null in fix_fds(), which took over an hour to debug.
This makes executing a program via the shell consistent with doing so _not_ via the shell: both handle the environment the same way, and both produce the same error codes if the program is bad. This also significantly simplifiex exec_qubes_rpc2(), which doesn't need to handle creating a shell command anymore. It is assumed that the user's startup files do not modify the positional parameters outside of a function. Doing so would be extremely buggy and break the R4.2 multiplexer. The new code does use spaces in shell script arguments, but if the user's startup files break when that happens, the user has other problems. This is an ABI break for libqrexec, but that's okay in an unstable release. In the future, libqrexec should really be a static library. Fixes: 8728002 (Stop using qubes-rpc-multiplexer for service calls)
fa8b528
to
61a0261
Compare
A lot of tests has failed, see openqa report from the bot |
I’ll work on fixing this. |
If they are closed, another file descriptor could be created with these numbers, and so standard library functions that use them might write to an unwanted place. dup2() a file descriptor to /dev/null over them instead.