Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing a missing CRLF at the end of a chunk in a malformed chunked encoding response #10355

Open
1 task done
xdegaye opened this issue Jan 23, 2025 · 3 comments
Open
1 task done
Labels

Comments

@xdegaye
Copy link

xdegaye commented Jan 23, 2025

Describe the bug

A missing CRLF at the end of a chunk may happen for example when the parsed chunk size is wrong.

Parsing the end of a chunk is done by HttpPayloadParser.feed_data(), see this part of the code.
When CRLF is missing:

  • The HttpPayloadParser state remains ChunkState.PARSE_CHUNKED_CHUNK_EOF until there are no more chunks to parse
    since the next chunks processed by the following calls to feed_data() are prefixed with self._chunk_tail so that
    this code parses always the same start of sequence of bytes (which does not start with CRLF).
  • All the next chunks are appended to self._chunk_tail.
  • If the chunked body is large enough, such as an audio stream for example, the stream seems to hang.

To Reproduce

In a well formed chunked encoded HTTP response, replace the CRLF at the end of a chunk with two bytes that are not CRLF.

Expected behavior

A fix could be:

  • Change also the parser state to ChunkState.PARSE_CHUNKED_SIZE when the CRLF is missing.
  • Alternatively raise an exception.

Logs/tracebacks

-

Python Version

-

aiohttp Version

commit d3dc087b8e9aa665d47045550e5d9f2eddf8f512 (HEAD -> master, origin/master, origin/HEAD)
Date:   Wed Jan 22 05:53:25 2025 -0800

multidict Version

-

propcache Version

-

yarl Version

-

OS

Related component

Server

Additional context

This issue is the result of a code review.

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@xdegaye xdegaye added the bug label Jan 23, 2025
@xdegaye
Copy link
Author

xdegaye commented Jan 23, 2025

A fix could be the following patch:

diff --git a/aiohttp/http_parser.py b/aiohttp/http_parser.py
index d44f4cb1..8d7049ed 100644
--- a/aiohttp/http_parser.py
+++ b/aiohttp/http_parser.py
@@ -880,6 +880,12 @@ class HttpPayloadParser:
                         chunk = chunk[len(SEP) :]
                         self._chunk = ChunkState.PARSE_CHUNKED_SIZE
                     else:
+                        if chunk and chunk != SEP[:1]:
+                            exc = BadHttpMessage("Missing CRLF at chunk end")
+                            set_exception(self.payload, exc)
+                            raise exc
+                        # This is an empty chunk or the chunk is equal to CR.
+                        # Get the CRLF or the missing LF in the next chunk.
                         self._chunk_tail = chunk
                         return False, b""

@Dreamsorcerer
Copy link
Member

That fix may be OK. If you'd like to open a PR, please create a regression test first in test_http_parser.py (making sure to use the response: HttpResponseParser fixture).

@xdegaye
Copy link
Author

xdegaye commented Jan 24, 2025

I will give it a try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants