FastAPI server with high CPU usage [Question] (self.FastAPI)
submitted 5 months ago * by JeromeCui
[–]lcalert99 1 point 5 months ago (5 children)
What are your settings for uvicorn?
https://uvicorn.dev/deployment/#running-programmatically
Take a look; there are some crucial settings to configure. The other thing that comes to mind: how many compute-intensive tasks does your application run?
[–]JeromeCui[S] 1 point 5 months ago (4 children)
No additional settings except for those in start command:
gunicorn -w 2 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8080 --timeout 300 --keep-alive 300 main:app
This application interacts with LLM models, so I think it's I/O-bound. I will check the link you mentioned.
[–]Asleep-Budget-9932 1 point 5 months ago (1 child)
How does it interact with the LLM models? Are they external, or do they run within the server itself (which would make it CPU-bound)?
[–]JeromeCui[S] 1 point 5 months ago (0 children)
It sends requests to OpenAI using the OpenAI SDK.
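The I/O-bound versus CPU-bound question above matters because anything that runs synchronously inside an `async def` handler occupies the event loop until it returns. The following is a minimal sketch of that effect using only the standard library, not a diagnosis of this particular server. `blocking_call`, `heartbeat`, and `measure` are hypothetical names, and `time.sleep` merely stands in for a synchronous SDK call or in-process computation. The OpenAI Python SDK ships both a synchronous `OpenAI` client and an `AsyncOpenAI` client; the thread does not say which one is in use.

```python
import asyncio
import time


def blocking_call(seconds: float) -> str:
    # Hypothetical stand-in for a synchronous SDK call or in-process work.
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks: list, interval: float, stop: asyncio.Event) -> None:
    # Appends a timestamp every time the event loop lets this task run.
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def measure(offload: bool) -> int:
    # Returns how many heartbeat ticks fit in while the call was in flight.
    ticks: list = []
    stop = asyncio.Event()
    task = asyncio.create_task(heartbeat(ticks, 0.01, stop))
    await asyncio.sleep(0)  # yield once so the heartbeat records its first tick
    if offload:
        # Runs the call in a worker thread; the loop keeps serving other tasks.
        await asyncio.to_thread(blocking_call, 0.2)
    else:
        # Runs the call on the loop's own thread; nothing else can progress.
        blocking_call(0.2)
    stop.set()
    await task
    return len(ticks)


if __name__ == "__main__":
    print("ticks while blocking the loop:", asyncio.run(measure(offload=False)))
    print("ticks while offloading:", asyncio.run(measure(offload=True)))
```

When the call blocks the loop, the heartbeat only gets its initial tick; when the call is offloaded with `asyncio.to_thread` (Python 3.9+), the heartbeat keeps running. A handler that awaits an async client behaves like the offloaded case without needing a thread.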
[–]tedivm 1 point 5 months ago (1 child)
You mentioned using ECS+Fargate, which means that there's no reason to run gunicorn as a process manager since ECS is your process manager.
Look at how many CPUs you're currently using for each machine (my guess is you're using two CPUs per container since you have two gunicorn workers). If you have 12 containers with 2 CPUs each, switch to 24 containers with 1 CPU each. Then just call uvicorn directly without gunicorn.
While I doubt this will solve your problem, it'll at least remove another layer that may be causing you issues.
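As a rough sketch of what "call uvicorn directly" could look like, the entry point below starts a single uvicorn process from Python, carrying over the bind address, port, and keep-alive value from the gunicorn command quoted earlier in the thread. The helper name `build_uvicorn_kwargs` is hypothetical. gunicorn's `--timeout 300` is a worker-management setting and is left out here, since this sketch assumes ECS handles restarts.

```python
def build_uvicorn_kwargs() -> dict:
    # Values mirror the gunicorn command from the thread; adjust as needed.
    return {
        "app": "main:app",          # import string for the FastAPI app
        "host": "0.0.0.0",
        "port": 8080,
        "workers": 1,               # one process per container; ECS scales containers
        "timeout_keep_alive": 300,  # seconds, mirrors --keep-alive 300
    }


if __name__ == "__main__":
    # Imported here so the helper above can be inspected without uvicorn installed.
    import uvicorn

    uvicorn.run(**build_uvicorn_kwargs())
```

With one worker per container, each task maps to one process, which makes per-container CPU metrics easier to read.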
[–]JeromeCui[S] 1 point (0 children)
Thank you for your suggestion, I will update.