Client support for compressed transfer encoding #4435

Open
JustAnotherArchivist opened this issue Dec 11, 2019 · 5 comments

@JustAnotherArchivist
Contributor

JustAnotherArchivist commented Dec 11, 2019

Long story short

The aiohttp client does not support any transfer encoding other than chunked. When a server responds with e.g. Transfer-Encoding: gzip, the client returns the still-compressed payload. Worse yet, when a server applies both compression and chunking, the client returns the raw payload with the chunk framing still intact (but only when using the C parser!).

Expected behaviour

Decompressed payload.

Actual behaviour

Compressed or compressed + chunked payload.

Steps to reproduce

A simple server and client illustrating this issue can be found in this gist. I wrote a single test client and server for both this issue and #4436; the first line is irrelevant to this issue and not included in the output below.

Expected output:

> python3 client.py | tail -n+2
b'Test'
b'Test'

Actual output:

> python3 client.py | tail -n+2
b'\x1f\x8b\x08\x00\x82Y\xf0]\x02\xff\x0bI-.\x01\x002\xd1Mx\x04\x00\x00\x00'
b'5\r\n\x1f\x8b\x08\x00\x82\r\n13\r\nY\xf0]\x02\xff\x0bI-.\x01\x002\xd1Mx\x04\x00\x00\x00\r\n0\r\n\r\n'

However, when running it with the pure-Python parser, the chunked TE gets handled in the second case, and the output is:

> AIOHTTP_NO_EXTENSIONS=1 python3 client.py | tail -n+2
b'\x1f\x8b\x08\x00|\\\xf0]\x02\xff\x0bI-.\x01\x002\xd1Mx\x04\x00\x00\x00'
b'\x1f\x8b\x08\x00|\\\xf0]\x02\xff\x0bI-.\x01\x002\xd1Mx\x04\x00\x00\x00'
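To confirm that the C-parser output above really is the expected payload with the transfer codings left in place, the two byte strings can be decoded by hand. This is only a minimal sketch: `dechunk` is a hypothetical helper written for this check (not part of aiohttp), and it ignores chunk extensions and trailers.

```python
import gzip

# Payloads copied verbatim from the "Actual output" above (C parser).
COMPRESSED = b'\x1f\x8b\x08\x00\x82Y\xf0]\x02\xff\x0bI-.\x01\x002\xd1Mx\x04\x00\x00\x00'
COMPRESSED_CHUNKED = (
    b'5\r\n\x1f\x8b\x08\x00\x82\r\n13\r\nY\xf0]\x02\xff\x0bI-.\x01\x002'
    b'\xd1Mx\x04\x00\x00\x00\r\n0\r\n\r\n'
)


def dechunk(raw: bytes) -> bytes:
    """Strip HTTP/1.1 chunked framing (hypothetical helper; no extensions or trailers)."""
    body = bytearray()
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)      # end of the chunk-size line
        size = int(raw[pos:eol], 16)       # chunk size is hexadecimal
        if size == 0:                      # last-chunk marker
            break
        start = eol + 2
        body += raw[start:start + size]
        pos = start + size + 2             # skip the data and its trailing CRLF
    return bytes(body)


first = gzip.decompress(COMPRESSED)
second = gzip.decompress(dechunk(COMPRESSED_CHUNKED))
print(first)
print(second)
```

Both calls should recover the same payload that the expected output shows, which suggests the data arrives intact and only the decoding step is missing.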

Your environment

I tested this with aiohttp 2.3.10 and Python 3.6.9 on Debian, but based on the current aiohttp code, the behaviour should still be the same on the current versions.

@JustAnotherArchivist
Contributor Author

I should add that a number of other tools do not support transfer encodings other than chunked either. If this is a conscious decision for aiohttp, that is okay, but it should be documented. For Transfer-Encoding: <other>, chunked, the chunking should still be handled by aiohttp in my opinion, and the two parser implementations (C and Python) should agree with each other.

Sidenote: the docs talk about "The gzip and deflate transfer-encodings are automatically decoded for you." and similar, but this is about Content-Encoding, not Transfer-Encoding!
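To make the `<other>, chunked` case concrete, here is a rough sketch of how a server might build such a response. Transfer codings are listed in the order they were applied, with chunked last, so a client has to undo the chunking first and the compression second. The names `chunk` and `build_response` and the fixed chunk size are assumptions for illustration only; this is not the server from the gist.

```python
import gzip


def chunk(data: bytes, size: int) -> bytes:
    """Frame data with HTTP/1.1 chunked coding, using fixed-size chunks."""
    out = bytearray()
    for i in range(0, len(data), size):
        piece = data[i:i + size]
        out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk followed by an empty trailer section
    return bytes(out)


def build_response(payload: bytes) -> bytes:
    """Build a raw response with Transfer-Encoding: gzip, chunked (illustrative only)."""
    compressed = gzip.compress(payload)  # first coding applied: gzip
    body = chunk(compressed, 5)          # final coding applied: chunked
    head = (
        b"HTTP/1.1 200 OK\r\n"
        b"Transfer-Encoding: gzip, chunked\r\n"  # note: not Content-Encoding
        b"\r\n"
    )
    return head + body


response = build_response(b"Test")
print(response)
```

With Content-Encoding: gzip the compression would instead be a property of the representation itself, which is the case the quoted documentation sentence actually describes.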

@asvetlov
Member

Would you fix this?

@0xicl33n

0xicl33n commented Dec 15, 2019

My team and I are having a similar problem: we are unable to send bytes using aiohttp, but we can do it via requests. This issue might be the cause of our problem as well. For example (simplified), we have

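import asyncio
import aiohttp

head = {
    'Accept-Encoding': 'gzip'
}

async def main():
    # ClientSession is an async context manager, and the URL needs a scheme
    async with aiohttp.ClientSession() as session:
        await session.post('http://example.com', headers=head, data={'some_key': b'some bytes'})

asyncio.run(main())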

and the server rejects it as an invalid request:

{'error': 'invalid_request', 'error_description': 'The provided some_key is invalid'}

@asvetlov
Member

@0xicl33n Accept-Encoding: gzip tells the server that it may return a gzipped response.
The header is not related to request body encoding.
I guess your issue is not related; most likely there is a mismatch between the data you send and the data format the server expects.

@JustAnotherArchivist
Contributor Author

@asvetlov Well, the question is how this should actually be fixed. I think it is desirable to support compressed transfer encoding, even though it's not used very frequently in practice and not even supported by wget, requests, and h11 as far as I can see. (curl supports it.)

If we do add support, the next question is how to do this with the C parser. Transfer encoding is handled entirely inside http-parser, and I'm not sure it'd be a good idea to move this to the Python layer for performance reasons. So the support for compressed TE would also/first have to happen upstream.

As briefly noted in my comment above, there is also another issue here involving the Content-Encoding header, and I've now filed this separately as #4462.
