Hey Cosimo,
Thanks a lot for your detailed post about deploying models with Gunicorn and FastAPI. I'm using a similar approach, and setting max_requests does indeed help a lot.
I was wondering whether you have changed your approach or learned anything relevant since you wrote it, and if so, whether you could share it :)
Best,
Dylan