Protect metadata shutdown endpoint#422
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
There was a problem hiding this comment.
Code Review
This pull request introduces a secure shutdown token mechanism for the SCION metadata server to protect the /_scion/shutdown endpoint. Feedback on these changes highlights several critical issues: a race condition where a stopping server can delete a newly started server's token file, a compilation failure on Windows due to syscall.O_NOFOLLOW, a potential timing attack in token verification, a file sharing violation on Windows when deleting an open file, and a local DoS vulnerability from storing the token in the world-writable os.TempDir().
| if s.shutdownTokenPath != "" { | ||
| if err := os.Remove(s.shutdownTokenPath); err != nil && !errors.Is(err, os.ErrNotExist) { | ||
| log.Debug("Failed to remove metadata shutdown token file %s: %v", s.shutdownTokenPath, err) | ||
| } | ||
| } |
There was a problem hiding this comment.
There is a race condition between a stopping server instance (srv1) and a starting server instance (srv2) reclaiming the port.
When srv1.Stop() is called, s.srv.Shutdown(shutdownCtx) immediately closes the listener. This allows srv2 to successfully bind to the port and write its new token file via ensureShutdownToken(). However, srv1's Stop() continues executing and deletes the token file at s.shutdownTokenPath after srv2 has already written its new token. This leaves srv2 running without a token file, preventing any future instances from shutting it down.
To fix this, delete the token file before calling s.srv.Shutdown(shutdownCtx).
| if err := os.Remove(path); err != nil && !errors.Is(err, os.ErrNotExist) { | ||
| return err | ||
| } | ||
| f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL|syscall.O_NOFOLLOW, 0600) |
There was a problem hiding this comment.
syscall.O_NOFOLLOW is not defined on Windows, which will break compilation of sciontool on Windows platforms.
Furthermore, O_NOFOLLOW is redundant here because os.O_EXCL combined with os.O_CREATE already guarantees that the file creation will fail with EEXIST if the path is a symbolic link (even a dangling one). Removing syscall.O_NOFOLLOW fixes cross-platform compilation while maintaining the same security guarantees.
| f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL|syscall.O_NOFOLLOW, 0600) | |
| f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0600) |
| if s.shutdownToken == "" || r.Header.Get("X-Scion-Shutdown-Token") != s.shutdownToken { | ||
| http.Error(w, "Forbidden", http.StatusForbidden) | ||
| return | ||
| } |
There was a problem hiding this comment.
The shutdown token is compared using standard string comparison (!=), which is not constant-time and can leak token information via timing attacks.
Additionally, we should trim any leading/trailing whitespace from the incoming header token to be robust against trailing newlines (e.g., if the token file was read with a trailing newline).
Please use crypto/subtle.ConstantTimeCompare to compare the tokens securely. Note that you will need to import crypto/subtle.
| if s.shutdownToken == "" || r.Header.Get("X-Scion-Shutdown-Token") != s.shutdownToken { | |
| http.Error(w, "Forbidden", http.StatusForbidden) | |
| return | |
| } | |
| inputToken := strings.TrimSpace(r.Header.Get("X-Scion-Shutdown-Token")) | |
| if s.shutdownToken == "" || len(inputToken) != len(s.shutdownToken) || subtle.ConstantTimeCompare([]byte(inputToken), []byte(s.shutdownToken)) != 1 { | |
| http.Error(w, "Forbidden", http.StatusForbidden) | |
| return | |
| } |
| if _, err := f.WriteString(token + "\n"); err != nil { | ||
| _ = os.Remove(path) | ||
| return err | ||
| } |
There was a problem hiding this comment.
On Windows, attempting to delete an open file using os.Remove will fail with a sharing violation. To ensure cross-platform compatibility and robustness, explicitly close the file descriptor f before attempting to remove the file on write failure.
| if _, err := f.WriteString(token + "\n"); err != nil { | |
| _ = os.Remove(path) | |
| return err | |
| } | |
| if _, err := f.WriteString(token + "\n"); err != nil { | |
| f.Close() | |
| _ = os.Remove(path) | |
| return err | |
| } |
| func shutdownTokenPath(port int) string { | ||
| return filepath.Join(os.TempDir(), fmt.Sprintf("scion-metadata-shutdown-%d.token", port)) | ||
| } |
There was a problem hiding this comment.
Using os.TempDir() directly can expose the application to local Denial of Service (DoS) attacks on multi-user systems. Since /tmp is world-writable, any local user can pre-create the predictable token file (e.g., /tmp/scion-metadata-shutdown-<port>.token) owned by themselves, which will cause os.Remove and the subsequent server startup to fail with Permission denied.
Using os.UserRuntimeDir() (which typically points to /run/user/<uid>) is much more secure as it is only accessible by the current user. We can fall back to os.TempDir() if it is not available.
func shutdownTokenPath(port int) string {
dir := os.TempDir()
if runtimeDir, err := os.UserRuntimeDir(); err == nil {
dir = runtimeDir
}
return filepath.Join(dir, fmt.Sprintf("scion-metadata-shutdown-%d.token", port))
}
Fixes #<issue_number_goes_here>