
aiohttp.ClientSession() does not send TCP FIN #4685

Open
vmisupov opened this issue Apr 11, 2020 · 7 comments

vmisupov commented Apr 11, 2020

🐞 Describe the bug
I tried to create an async web client and found that netstat on my OSX 10.14 machine shows many connections in the TIME_WAIT state. I started Wireshark and saw that the TCP sessions do not finish correctly: the client should send a TCP FIN and receive a TCP ACK from the server, but it never sends the FIN.

💡 To Reproduce
Run the Python code below and, when the script finishes, look at the network connections to your local web server (192.168.1.1 in my case) in the netstat output.

xeon$ netstat -na -p tcp | grep 192.168.1.1
tcp4 0 0 192.168.107.29.58842 192.168.1.1.80 TIME_WAIT
tcp4 0 0 192.168.107.29.58843 192.168.1.1.80 TIME_WAIT
tcp4 0 0 192.168.107.29.58844 192.168.1.1.80 TIME_WAIT
tcp4 0 0 192.168.107.29.58845 192.168.1.1.80 TIME_WAIT
tcp4 0 0 192.168.107.29.58846 192.168.1.1.80 TIME_WAIT


import asyncio
import random
import time
import aiohttp

STATIC_ADDR='http://192.168.1.1/'

# make one HTTP request and print the result to the console
async def request_webserv(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            print(resp.status)
            print(await resp.text())

# worker: get an item from the queue and call request_webserv for it
async def worker(name, queue):
    while True:
        # Get a "work item" out of the queue.
        url = await queue.get()
        await request_webserv(url)
        queue.task_done()

# create a queue, fill it with 20 identical links, then start 5 workers
# that perform the 20 HTTP requests
async def main():
    queue = asyncio.Queue(maxsize=100)
    total_time = 0

    for _ in range(20):
        queue.put_nowait(STATIC_ADDR)

    # Create worker tasks to process the queue concurrently.
    tasks = []
    for i in range(5):
        task = asyncio.create_task(worker(f'worker-{i}', queue))
        tasks.append(task)

    started_at = time.monotonic()
    await queue.join()
    total_spent_for = time.monotonic() - started_at

    # Cancel our worker tasks.
    for task in tasks:
        task.cancel()
    # Wait until all worker tasks are cancelled.
    await asyncio.gather(*tasks, return_exceptions=True)

asyncio.run(main())


💡 Expected behavior

I expect the client to send a TCP FIN when each session is closed, so that every connection terminates with a normal FIN/ACK exchange.

📋 Logs/tracebacks

📋 Your version of the Python

$ python --version
Python 3.7.6

📋 Your version of the aiohttp/yarl/multidict distributions

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.6.2
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: Nikolay Kim
Author-email: [email protected]
License: Apache 2
Location: /usr/local/lib/python3.7/site-packages
Requires: async-timeout, attrs, chardet, yarl, multidict
Required-by: 
$ python -m pip show multidict
Name: multidict
Version: 4.7.5
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: /usr/local/lib/python3.7/site-packages
Requires: 
Required-by: yarl, aiohttp
$ python -m pip show yarl
Name: yarl
Version: 1.4.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: /usr/local/lib/python3.7/site-packages
Requires: multidict, idna
Required-by: aiohttp

📋 Additional context

client library
@vmisupov vmisupov added the bug label Apr 11, 2020
@socketpair
Contributor

socketpair commented Apr 11, 2020

In general, FIN is sent only when the last reference on a socket file descriptor is voluntarily closed by close(). In other conditions, RST is sent. Try gc.collect() at the end of the program. If it helps, then .close() is missing somewhere. If still not, then there is something heavier, like socket object leakage.
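The leak pattern described here can be illustrated with plain sockets (a sketch, not aiohttp code): a socket whose last reference is dropped without close() is only finalized by the runtime, and a socket kept alive by a reference cycle is only closed when the garbage collector runs, which is what a gc.collect() at program exit would flush.

```python
import gc
import socket

# Sketch of the leak pattern: drop the last reference to a socket without
# calling close(). CPython's reference counting usually finalizes it right
# away (with a ResourceWarning), but an object kept alive by a cycle is only
# closed once the collector runs.
def leak_one_socket() -> int:
    s = socket.socket()
    fd = s.fileno()
    del s        # last reference dropped without close()
    return fd

fd = leak_one_socket()
assert fd >= 0
assert gc.collect() >= 0  # returns the number of unreachable objects found
```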

@vmisupov
Author

vmisupov commented Apr 14, 2020

gc.collect() did not help.
I additionally checked on Ubuntu 16.04 LTS; the behaviour is the same.

netstat -an -p tcp | grep .80
tcp4       0      0  192.168.1.132.56201    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56198    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56197    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56199    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56200    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56204    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56203    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56202    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56205    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56207    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56206    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56209    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56208    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56210    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56213    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56212    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56214    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56215    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56216    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56217    192.168.1.1.80         TIME_WAIT 

@vmisupov
Author

vmisupov commented Apr 14, 2020

In general, FIN is sent only when the last reference on a socket file descriptor is voluntarily closed by close(). In other conditions, RST is sent. Try gc.collect() at the end of the program. If it helps, then .close() is missing somewhere. If still not, then there is something heavier, like socket object leakage.

Checked on Ubuntu, and confirmed that gc.collect() does not help.
Could this session behaviour be by design? I looked at the ClientSession code: it refers to class TCPConnector, which has a parameter force_close=False (which, judging by the description, is what controls closing the connection after each request).
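If force_close is indeed the relevant knob, the experiment would look roughly like this (a sketch; the URL is a placeholder for the local server from the report):

```python
import asyncio

import aiohttp

# Sketch of the force_close experiment: with force_close=True the TCPConnector
# disables connection pooling and closes each connection right after the
# request, which changes when the socket teardown happens.
async def fetch(url: str) -> int:
    connector = aiohttp.TCPConnector(force_close=True)
    async with aiohttp.ClientSession(connector=connector) as session:
        async with session.get(url) as resp:
            await resp.read()
            return resp.status

# asyncio.run(fetch('http://192.168.1.1/'))  # placeholder address
```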

@tonyroberts

I was looking into what I think is the same issue and I noticed something. In asyncio.proactor_events._ProactorBasePipeTransport, socket.shutdown() is called in _call_connection_lost, but in asyncio.selector_events._SelectorTransport the socket is closed without shutdown() being called first.

Could this explain why the aiohttp client is not sending a FIN on Linux/macOS, even though the connection is closed? I think close() on its own just sends an RST packet, whereas shutdown() sends a FIN packet.

_ProactorBasePipeTransport

    def _call_connection_lost(self, exc):
        if self._called_connection_lost:
            return
        try:
            self._protocol.connection_lost(exc)
        finally:
            # XXX If there is a pending overlapped read on the other
            # end then it may fail with ERROR_NETNAME_DELETED if we
            # just close our end.  First calling shutdown() seems to
            # cure it, but maybe using DisconnectEx() would be better.
            if hasattr(self._sock, 'shutdown') and self._sock.fileno() != -1:
                self._sock.shutdown(socket.SHUT_RDWR)
            self._sock.close()
            self._sock = None
            server = self._server
            if server is not None:
                server._detach()
                self._server = None
            self._called_connection_lost = True

_SelectorTransport

    def _call_connection_lost(self, exc):
        try:
            if self._protocol_connected:
                self._protocol.connection_lost(exc)
        finally:
            self._sock.close()
            self._sock = None
            self._protocol = None
            self._loop = None
            server = self._server
            if server is not None:
                server._detach()
                self._server = None

I've tried to fix the problem by passing a custom connector to aiohttp.ClientSession that always calls shutdown() on the underlying socket before it's closed. That seems to have worked in my case, and I now have no connections left in the TIME_WAIT state. However, I'm not familiar enough with aiohttp's internals to know exactly where a similar fix should be applied, or whether it's something that needs to be fixed in asyncio, so I'm leaving this comment in case it helps.
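For reference, the shutdown-then-close pattern described above looks like this at the plain-socket level (a sketch using a local socketpair, so no real TCP FIN is captured, but the EOF semantics are the same):

```python
import socket

# shutdown(SHUT_RDWR) performs the orderly teardown (on TCP this is where the
# FIN goes out) before close() releases the file descriptor.
def graceful_close(sock: socket.socket) -> None:
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # peer may already be gone
    sock.close()

a, b = socket.socketpair()
a.sendall(b"ping")
graceful_close(a)
assert b.recv(4) == b"ping"   # buffered data is still readable
assert b.recv(4) == b""       # empty read: the peer shut down cleanly
b.close()
```

The empty read on the surviving end is exactly what a well-behaved HTTP peer sees after a FIN, as opposed to a ConnectionResetError after an RST.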

@x0day

x0day commented Aug 15, 2024

@bdraco could you check this issue?

I encountered some cleanup issues when using aiohttp and uvloop in some high-concurrency scenarios. I also do the close work after the response, like this:

session = aiohttp.ClientSession()
resp = await session.get(...)
try:
    ...
finally:
    resp.release()
    await resp.wait_for_close()
    await session.close()

[screenshot attached in the original comment]

When I started to troubleshoot the problem, I found that there were a lot of TIME_WAIT connections on the server, just like mentioned above. I am not sure if the problem I encountered is related to the above mentioned problem, and the above problem can still be reproduced now.

@bdraco
Member

bdraco commented Aug 15, 2024

@x0day

It shouldn't be possible for file descriptors to get reused at aiohttp's level unless there is a bug in CPython or uvloop. That assumes nothing in aiohttp is bypassing the normal socket APIs (which I have not found).

We will need a minimum reproducer that could be turned into a test for this to better understand the issue.

This seems like a different issue.

@Dreamsorcerer
Member

I can see the TIME_WAIT as described above, but adding sock.shutdown() into _call_connection_lost() doesn't seem to resolve it. I'm not really familiar enough with the socket code to know how it should work.


6 participants