Skip to content

Conversation

@chansikpark
Copy link
Contributor

@chansikpark chansikpark commented Oct 30, 2025

It appears that #2046 introduced a bit of an uncommonly manifesting bug/quirk where overly long queries/requests would hang until either the server or client times out, at which point the 414 error code returns at the last moment. It looks like some unnecessary logic from older code was retained when the checks were reordered. I've modified two tests that were passing but taking 5000+ ms. They now must pass in < 100 ms.

Please advise.

Before

Before

After

image
test$make all

...

[----------] Global test environment tear-down
[==========] 457 tests from 103 test suites ran. (255058 ms total)
[  PASSED  ] 457 tests.

  YOU HAVE 2 DISABLED TESTS

@yhirose
Copy link
Owner

yhirose commented Nov 2, 2025

@chansikpark thanks for the pull request! Could you take a look at the following review from Copilot?

Practical recommendations (changes to make it safe)

Always close the connection: If the current implementation keeps the connection alive after returning a 414 response, the change is insufficient and can be dangerous. Make the response explicitly include Connection: close (or set the internal flag close_connection = true) so that the socket is closed after sending the response. Doing this prevents side effects without having to read and discard all headers.

Extend the tests: Add assertions to the tests that verify the connection is closed (via the Connection header or socket state) so the change does not accidentally leave connections open. It’s also useful to add a test where the client continues sending a large body and the server immediately returns 414 and closes the socket.

Conclusion (brief)

The current change is in the right direction to “stop the unnecessary read that delayed the response and return 414 faster.” However, to operate safely with keep-alive enabled, you should either explicitly close the connection after the response (Connection: close / close_connection) or add logic to properly discard unread data. If the implementation does not already guarantee the latter, I strongly recommend modifying it to explicitly close the connection.

@chansikpark
Copy link
Contributor Author

chansikpark commented Nov 2, 2025

Always close the connection

It's generally unclear to me specifically how a dangling connection can be dangerous, but this sounds pretty reasonable as far as I can tell. I couldn't find anything in the specs that mandate/recommend either way.

From 2.4 Conformance.ErrorHandling:

Unless noted otherwise, a recipient MAY attempt to recover a usable protocol element from an invalid construct. HTTP does not define specific error handling mechanisms except when they have a direct impact on security, since different applications of the protocol require different error handling strategies.

So I've made all client/server errors close the connection here:

diff --git a/httplib.h b/httplib.h
--- a/httplib.h
+++ b/httplib.h
@@ -7692,7 +7692,8 @@ inline bool Server::write_response_core(Stream &strm, bool close_connection,
   if (need_apply_ranges) { apply_ranges(req, res, content_type, boundary); }
 
   // Prepare additional headers
-  if (close_connection || req.get_header_value("Connection") == "close") {
+  if (close_connection || req.get_header_value("Connection") == "close" ||
+      400 <= res.status) { // Don't leave connections open after errors
     res.set_header("Connection", "close");
   } else {
     std::string s = "timeout=";

Anticipating changes in multiple codepaths, I started by modifying 4 different erroring tests to set Keep Alive on and check whether the right thing happens afterwards. Then I realized I can just change one place in write_response_core, in which case adding a single new test is kind of enough. If I want to be particularly thorough, I could modify all applicable tests to ensure each error code closes the connection, but I'm under the impression that this is always the case unless there's an explicit cli_.set_keep_alive(true) before the request. I defer to your judgement in this matter.

Notably, total test time goes down by 20% (This may be due to varying availability of external servers):

Before Change (with modified tests)

[----------] Global test environment tear-down
[==========] 458 tests from 103 test suites ran. (271157 ms total)
[  PASSED  ] 453 tests.
[  FAILED  ] 5 tests, listed below:
[  FAILED  ] ServerTest.TooLongRequest
[  FAILED  ] ServerTest.LongQueryValue
[  FAILED  ] ServerTest.HeaderCountExceedsLimit
[  FAILED  ] ServerTest.HeaderCountSecurityTest
[  FAILED  ] ServerTest.BadRequestLineCancelsKeepAlive

After Change

[----------] Global test environment tear-down
[==========] 458 tests from 103 test suites ran. (217041 ms total)
[  PASSED  ] 458 tests.

@chansikpark chansikpark marked this pull request as draft November 3, 2025 03:11
@yhirose yhirose marked this pull request as ready for review November 3, 2025 03:22
@yhirose yhirose merged commit 4b2b851 into yhirose:master Nov 3, 2025
10 checks passed
@yhirose
Copy link
Owner

yhirose commented Nov 3, 2025

@chansikpark this latest update looks very good to me and I have just merged as it is. Thanks for the fine improvement!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants