Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory Leak in web request due to cyclic reference #10548

Closed
1 task done
ShamariYoti opened this issue Mar 14, 2025 · 21 comments · Fixed by #10569
Closed
1 task done

Memory Leak in web request due to cyclic reference #10548

ShamariYoti opened this issue Mar 14, 2025 · 21 comments · Fixed by #10569
Labels
bug reproducer: present This PR or issue contains code, which reproduce the problem described or clearly understandable STR

Comments

@ShamariYoti
Copy link

Describe the bug

We have a service that accepts images and does a bunch of things with them and I kept seeing the memory usage grow and grow, after adding in tracemalloc I could see that the memory used in web_request.py kept growing the most, at one point I saw it grow to around 21.2MiB. I then decided to create a small aiohttp server that accepts images to see if I could replicate what I'm seeing in my production service and I could. I then started to use objgraph to see if I could print out an object graph and it seems like there's a cyclic reference to Request.

Could this cyclic dependency be the reason why i'm seeing the memory grow and not get cleaned up ? Any help would be appreciated, thanks.

Code:

from aiohttp import web
import gc
import asyncio
import tracemalloc
from time import time

import objgraph

gc.set_debug(gc.DEBUG_LEAK)


def get_garbage():
    result = []
    gc.collect()
    for obj in gc.garbage:
        obj_name = type(obj).__name__
        result.append(f'{obj_name}')
        if obj_name in ('Request',):
            print('Request not collected!')
            objgraph.show_backrefs(
                obj,
                max_depth=30,
                too_many=50,
                filename=f"/tmp/{int(time() * 1000)}err_referrers.png",
            )

    return result


async def hanlder(request):
    print(f'read request')
    req = await request.json()
    return web.Response(text="Request has been receieved")

async def on_startup(app) -> None:
    # asyncio.create_task(show_memory())
    asyncio.create_task(show_objgraph())

async def show_objgraph():
    while True:
        await asyncio.sleep(10)
        gc.collect()
        print(f'Garbage objects: {get_garbage()}')

async def show_memory():
    print('start tracing memory')

    tracemalloc.start(25)
    start = tracemalloc.take_snapshot()

    snapshot_num = 1
    while True:
        await asyncio.sleep(20)
        current = tracemalloc.take_snapshot()
        # compare current snapshot to starting snapshot
        stats = current.compare_to(start, 'filename')

        print('top diffs since start')
        # print top diffs: current snapshot - start snapshot
        for i, stat in enumerate(stats[:15], 1):
            print(f'top diffs: {i}, {str(stat)}')

        traces = current.statistics('traceback')
        for stat in traces[:2]:
            for line in stat.traceback.format():
                print(line)

        snapshot_num = snapshot_num + 1


my_web_app = web.Application()
my_web_app.router.add_route('POST', '/image', hanlder)
my_web_app.on_startup.append(on_startup)
web.run_app(my_web_app)

Image

To Reproduce

Can run the test server above and request the service with any image can even un comment the call to asyncio.create_task(show_memory()) to view the stats.

Expected behavior

Request should be closed and garbage collected after the request has been served.

Logs/tracebacks

View the output for `Request` objects created by objgraph.

Python Version

3.10.12

aiohttp Version

3.10.11

multidict Version

6.1.0

propcache Version

0.2.1

yarl Version

1.18.3

OS

Ubuntu 22.04.5 LTS

Related component

Server

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@bdraco
Copy link
Member

bdraco commented Mar 14, 2025

Can you reproduce this on 3.11.x?

We aren't shipping new builds of 3.10 anymore.

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

Doesn't seem to be reproducible on master

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

Appears to be reproducible on 3.11

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

Reproducible on MatchInfoError

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

Image

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

confirmed reproducible on 3.11 but not on master

@bdraco bdraco added the reproducer: present This PR or issue contains code, which reproduce the problem described or clearly understandable STR label Mar 16, 2025
@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

I have a feeling #3462 likely fixed this

@bdraco bdraco changed the title Memory Leak in web request due to cyclic reference Memory Leak in web request due to cyclic reference (3.x, already fixed on master) Mar 16, 2025
@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

To reproduce you need to generate a 404

@bdraco bdraco changed the title Memory Leak in web request due to cyclic reference (3.x, already fixed on master) Memory Leak in web request due to cyclic reference Mar 16, 2025
@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

Issue exists on master if you access http://0.0.0.0:8080/image with GET and generate a Method Not Allowed

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

While I can confirm reproduction, I have had no luck figuring out how to fix this as it seems its referenced in more than one place

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

There is another leak on the client side I stumbled upon while testing

Can be reproduced with

from aiohttp import web
import gc
import asyncio
import tracemalloc
from time import time

import objgraph
import aiohttp

gc.set_debug(gc.DEBUG_LEAK)


def get_garbage():
    result = []
    gc.collect()
    for obj in gc.garbage:
        obj_name = type(obj).__name__
        result.append(f"{obj_name}")
        if obj_name in ("Request",):
            print("Request not collected!")
            objgraph.show_backrefs(
                obj,
                max_depth=30,
                too_many=50,
                filename=f"/tmp/{int(time() * 1000)}err_referrers.png",
            )

    return result


async def handler(request: web.Request) -> web.Response:
    print(f"read request")
    await request.json()
    return web.Response(text="Request has been received")


async def make_request():
    async with aiohttp.ClientSession() as session:
        async with session.get("http://localhost:8080/image") as resp:
            print(await resp.text())


async def on_startup(app) -> None:
    # asyncio.create_task(show_memory())
    asyncio.create_task(show_objgraph())
    asyncio.create_task(make_request())


async def show_objgraph():
    while True:
        await asyncio.sleep(10)
        gc.collect()
        print(f"Garbage objects: {get_garbage()}")


async def show_memory():
    print("start tracing memory")

    tracemalloc.start(25)
    start = tracemalloc.take_snapshot()

    snapshot_num = 1
    while True:
        await asyncio.sleep(20)
        current = tracemalloc.take_snapshot()
        # compare current snapshot to starting snapshot
        stats = current.compare_to(start, "filename")

        print("top diffs since start")
        # print top diffs: current snapshot - start snapshot
        for i, stat in enumerate(stats[:15], 1):
            print(f"top diffs: {i}, {str(stat)}")

        traces = current.statistics("traceback")
        for stat in traces[:2]:
            for line in stat.traceback.format():
                print(line)

        snapshot_num = snapshot_num + 1


my_web_app = web.Application()
my_web_app.router.add_route("GET", "/image", handler)
my_web_app.on_startup.append(on_startup)
web.run_app(my_web_app)

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

I thought it was on the client side but it is the server side. Became obvious when I added a name to the task

Image

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

Image

MethodNotAllowed is a bit worse

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

We don't clear request._task on exception so the traceback in start references it

bdraco added a commit that referenced this issue Mar 16, 2025
This is a partial fix for #10548

There is still another case for SystemRoutes that needs to be addressed
bdraco added a commit that referenced this issue Mar 16, 2025
This is a partial fix for #10548

There is still another case for SystemRoutes that needs to be addressed
@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

another reproducer option

from aiohttp import web
import gc
import asyncio
import tracemalloc
from time import time

import objgraph
import aiohttp
from aiohttp.test_utils import get_unused_port_socket

gc.set_debug(gc.DEBUG_LEAK)

sock = get_unused_port_socket("127.0.0.1")
port = sock.getsockname()[1]


def get_garbage():
    result = []
    gc.collect()
    for obj in gc.garbage:
        obj_name = type(obj).__name__
        result.append(f"{obj_name}")
        if obj_name in ("Request",):
            print("Request not collected!")
            objgraph.show_backrefs(
                obj,
                max_depth=30,
                too_many=50,
                filename=f"/tmp/{int(time() * 1000)}err_referrers.png",
            )

    return result


async def handler(request: web.Request) -> web.Response:
    print(f"read request")
    await request.json()
    return web.Response(text="Request has been received")


async def make_request():
    async with aiohttp.ClientSession() as session:
        async with session.get(f"http://localhost:{port}/image") as resp:
            print(await resp.text())


async def on_startup(app) -> None:
    # asyncio.create_task(show_memory())
    asyncio.create_task(show_objgraph())
    asyncio.create_task(make_request())


async def show_objgraph():
    while True:
        await asyncio.sleep(10)
        gc.collect()
        print(f"Garbage objects: {get_garbage()}")


async def show_memory():
    print("start tracing memory")

    tracemalloc.start(25)
    start = tracemalloc.take_snapshot()

    snapshot_num = 1
    while True:
        await asyncio.sleep(20)
        current = tracemalloc.take_snapshot()
        # compare current snapshot to starting snapshot
        stats = current.compare_to(start, "filename")

        print("top diffs since start")
        # print top diffs: current snapshot - start snapshot
        for i, stat in enumerate(stats[:15], 1):
            print(f"top diffs: {i}, {str(stat)}")

        traces = current.statistics("traceback")
        for stat in traces[:2]:
            for line in stat.traceback.format():
                print(line)

        snapshot_num = snapshot_num + 1


my_web_app = web.Application()
my_web_app.router.add_route("GET", "/image", handler)
my_web_app.on_startup.append(on_startup)
web.run_app(my_web_app, sock=sock)

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

smaller repro

from aiohttp import web
import gc
import asyncio
import tracemalloc
from time import time

import objgraph
import aiohttp
from aiohttp.test_utils import get_unused_port_socket

gc.set_debug(gc.DEBUG_LEAK)

sock = get_unused_port_socket("127.0.0.1")
port = sock.getsockname()[1]


def get_garbage():
    result = []
    gc.collect()
    for obj in gc.garbage:
        obj_name = type(obj).__name__
        result.append(f"{obj_name}")
        if obj_name in ("Request",):
            print("Request not collected!")
            objgraph.show_backrefs(
                obj,
                max_depth=30,
                too_many=50,
                filename=f"/tmp/{int(time() * 1000)}err_referrers.png",
            )

    return result


async def handler(request: web.Request) -> web.Response:
    print(f"read request")
    await request.json()
    return web.Response(text="Request has been received")


async def make_request():
    async with aiohttp.ClientSession() as session:
        async with session.get(f"http://localhost:{port}/image") as resp:
            print(await resp.text())


async def on_startup(app) -> None:
    # asyncio.create_task(show_memory())
    asyncio.create_task(show_objgraph())
    asyncio.create_task(make_request())


async def show_objgraph():
    while True:
        await asyncio.sleep(10)
        gc.collect()
        print(f"Garbage objects: {get_garbage()}")


my_web_app = web.Application()
my_web_app.router.add_route("GET", "/image", handler)
my_web_app.on_startup.append(on_startup)
web.run_app(my_web_app, sock=sock)

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

On the client side if we try to make a request to a port that isn't open and we get ConnectionRefusedError we have another leak

Image

reproducer

from aiohttp import web
import gc
import asyncio
import tracemalloc
from time import time
from contextlib import suppress
import objgraph
import aiohttp
from aiohttp.test_utils import get_unused_port_socket

gc.set_debug(gc.DEBUG_LEAK)

sock = get_unused_port_socket("127.0.0.1")
port = sock.getsockname()[1]


def get_garbage():
    result = []
    gc.collect()
    for obj in gc.garbage:
        obj_name = type(obj).__name__
        result.append(f"{obj_name}")
        if obj_name in ("ConnectionRefusedError",):
            print("ConnectionRefusedError not collected!")
            objgraph.show_backrefs(
                obj,
                max_depth=30,
                too_many=50,
                filename=f"/tmp/{int(time() * 1000)}err_referrers.png",
            )

    return result


async def handler(request: web.Request) -> web.Response:
    print(f"read request")
    await request.json()
    return web.Response(text="Request has been received")


async def make_request():
    for _ in range(5):
        async with aiohttp.ClientSession() as session:
            with suppress(aiohttp.ClientError):
                async with session.get(f"http://localhost:{port+1}/image") as resp:
                    print(["response:",await resp.text()])


async def on_startup(app) -> None:
    asyncio.create_task(make_request())
    asyncio.create_task(show_objgraph())


async def show_objgraph():
    while True:
        await asyncio.sleep(10)
        gc.collect()
        print(f"Garbage objects: {get_garbage()}")


my_web_app = web.Application()
my_web_app.router.add_route("GET", "/image", handler)
my_web_app.on_startup.append(on_startup)
web.run_app(my_web_app, sock=sock)

@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

simpler version that show the ClientRequest is leaking from the exception traceback

from aiohttp import web
import gc
import asyncio
import tracemalloc
from time import time
from contextlib import suppress
import objgraph
import aiohttp
from aiohttp.test_utils import get_unused_port_socket

gc.set_debug(gc.DEBUG_LEAK)

sock = get_unused_port_socket("127.0.0.1")
port = sock.getsockname()[1]


def get_garbage():
    result = []
    gc.collect()
    for obj in gc.garbage:
        obj_name = type(obj).__name__
        result.append(f"{obj_name}")
        if obj_name in ("ClientRequest",):
            print("ClientRequest not collected!")
            objgraph.show_backrefs(
                obj,
                max_depth=30,
                too_many=50,
                filename=f"/tmp/{int(time() * 1000)}err_referrers.png",
            )

    return result


async def handler(request: web.Request) -> web.Response:
    print(f"read request")
    await request.json()
    return web.Response(text="Request has been received")


async def make_request():
    for _ in range(5):
        async with aiohttp.ClientSession() as session:
            with suppress(aiohttp.ClientError):
                async with session.get(f"http://localhost:{port+1}/image") as resp:
                    print(["response:",await resp.text()])
    await asyncio.sleep(0.5)
    print(f"Garbage objects: {get_garbage()}")

async def on_startup(app) -> None:
    asyncio.create_task(make_request())


async def show_objgraph():
    while True:
        await asyncio.sleep(10)
        gc.collect()
        print(f"Garbage objects: {get_garbage()}")


my_web_app = web.Application()
my_web_app.router.add_route("GET", "/image", handler)
my_web_app.on_startup.append(on_startup)
web.run_app(my_web_app, sock=sock)

bdraco added a commit that referenced this issue Mar 16, 2025
…#10569)

<!-- Thank you for your contribution! -->

## What do these changes do?

This is a partial fix for #10548

- There is still another case for `SystemRoute`s that needs to be
addressed. No reproducer available yet.
- There is also another case on the client side on connection refused
that still needs to be addressed
#10548 (comment)

## Are there changes in behavior for the user?

fixes memory leak

## Is it a substantial burden for the maintainers to support this?
no
patchback bot pushed a commit that referenced this issue Mar 16, 2025
…#10569)

<!-- Thank you for your contribution! -->

## What do these changes do?

This is a partial fix for #10548

- There is still another case for `SystemRoute`s that needs to be
addressed. No reproducer available yet.
- There is also another case on the client side on connection refused
that still needs to be addressed
#10548 (comment)

## Are there changes in behavior for the user?

fixes memory leak

## Is it a substantial burden for the maintainers to support this?
no

(cherry picked from commit dfbf782)
patchback bot pushed a commit that referenced this issue Mar 16, 2025
…#10569)

<!-- Thank you for your contribution! -->

## What do these changes do?

This is a partial fix for #10548

- There is still another case for `SystemRoute`s that needs to be
addressed. No reproducer available yet.
- There is also another case on the client side on connection refused
that still needs to be addressed
#10548 (comment)

## Are there changes in behavior for the user?

fixes memory leak

## Is it a substantial burden for the maintainers to support this?
no

(cherry picked from commit dfbf782)
@bdraco
Copy link
Member

bdraco commented Mar 16, 2025

looks like the other one I found isn't actually a problem it does become unreachable get cleaned up.

closing this via #10571

@bdraco bdraco closed this as completed Mar 16, 2025
bdraco added a commit that referenced this issue Mar 16, 2025
…e is an exception handling a request (#10571)

**This is a backport of PR #10569 as merged into master
(dfbf782).**

<!-- Thank you for your contribution! -->

## What do these changes do?

This is a partial fix for #10548

- There is still another case for `SystemRoute`s that needs to be
addressed. No reproducer available yet.
- There is also another case on the client side on connection refused
that still needs to be addressed
#10548 (comment)

## Are there changes in behavior for the user?

fixes memory leak

## Is it a substantial burden for the maintainers to support this?
no

Co-authored-by: J. Nick Koston <[email protected]>
bdraco added a commit that referenced this issue Mar 16, 2025
…e is an exception handling a request (#10572)

**This is a backport of PR #10569 as merged into master
(dfbf782).**

<!-- Thank you for your contribution! -->

## What do these changes do?

fixes #10548

## Are there changes in behavior for the user?

fixes a potential memory leak

## Is it a substantial burden for the maintainers to support this?
no

Co-authored-by: J. Nick Koston <[email protected]>
@ShamariYoti
Copy link
Author

Thanks for taking a look at this, I will re test on my side. I can download this via the latest patch in 3.11 ? @bdraco

@bdraco
Copy link
Member

bdraco commented Mar 17, 2025

yes 3.11.14

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug reproducer: present This PR or issue contains code, which reproduce the problem described or clearly understandable STR
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants