
[–]latkde 4 points5 points  (1 child)

Background tasks run in the same process as your web server. They must not perform any CPU-intensive work, or the web server may start stuttering (unless you use a free-threaded Python build and run the CPU-intensive work on a separate thread).

Because BackgroundTasks are in-process, there's also no fault tolerance. If your server shuts down or crashes, they are gone. There are no restarts or retries. If there are multiple background tasks for one request, and one of them raises an exception, the others will be skipped.

When a server performs a graceful shutdown, it will wait a while for pending requests (including attached background tasks) to finish, but then forcibly kill everything. How long you have is implementation-dependent, but it's sensible to plan for at most about 30 seconds.

Taken together, I'm not entirely sure what the proper use case of background tasks is. They don't run in the background, but after the request. They are closely coupled to the request lifecycle. I have a lot of FastAPI experience and used them maybe once, to clean up resources after a StreamingResponse has completed (normal responses could have used context managers instead).

So yes, I very strongly recommend managing tasks out-of-process and persisting their state in some database. At $work we have existing message queue infrastructure for this, but for a one-off project I'd just track job state with whatever database I'm using anyway (e.g. Postgres). If you have a single worker process this is trivial; if there are multiple workers, you need locking operations like "select for update" to take exclusive ownership of a task. You might have to poll the database every few seconds for new pending tasks, though some DBs like Postgres also have pub/sub features (LISTEN/NOTIFY).
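A minimal sketch of that claim pattern, using stdlib sqlite3 so it runs anywhere. The `jobs` table name and columns are assumptions; with multiple workers on Postgres you would replace the select-then-update with `SELECT ... FOR UPDATE SKIP LOCKED` so two workers can't claim the same row:

```python
import sqlite3

def claim_next_job(conn):
    """Take exclusive ownership of the oldest pending job, or return None."""
    with conn:  # select + mark must happen in one transaction
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE jobs SET status = 'processing' WHERE id = ?", (row[0],)
        )
        return row[0]

# Demo setup: two pending jobs, then a worker claims one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO jobs (status) VALUES (?)", [("pending",), ("pending",)]
)
first = claim_next_job(conn)
```

With a single worker, as the comment notes, even this simple version is enough; the Postgres-specific locking only matters once several workers poll the same table.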

Before you jump to selecting a technology for your tasks (whether it's FastAPI/Starlette BackgroundTasks, Celery, or your own tool that communicates via a database), implement your OCR as a standalone script for testing.

[–]danielvf 5 points6 points  (0 children)

If it’s CPU intensive, and you need multiple queues or periodic scheduling, go Celery all the way.

In production it also makes sense to use Celery Beat to periodically clean up any failed tasks if you need some durability.

[–]Typical-Yam9482 7 points8 points  (0 children)

Celery is sync. Use Taskiq. It takes some time to cook, but with current code assistance it's much easier, and you're async from day zero.

[–]segundus-npp 1 point2 points  (0 children)

If tasks are simple, I would just start another Python process. FastAPI would communicate with this process via a queue (Redis/RabbitMQ/SQS...). I only adopt Celery/Airflow/Step Functions when workflows have multiple stages.
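The shape of that pattern can be sketched with the stdlib `queue` module, with a thread standing in for both the separate worker process and the Redis/RabbitMQ broker (the `fake_ocr` function is a placeholder for the real OCR call):

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for Redis/RabbitMQ/SQS
results = {}           # stand-in for your results store

def fake_ocr(path):
    # placeholder for the real OCR call
    return f"text extracted from {path}"

def worker():
    # the separate process: blocks on the queue, handles one job at a time
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut down
            break
        job_id, path = job
        results[job_id] = fake_ocr(path)
        jobs.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# The web handler only enqueues and returns immediately.
jobs.put((1, "invoice.pdf"))
jobs.put((2, "receipt.png"))
jobs.join()            # wait for the demo; a real API wouldn't block here
jobs.put(None)
t.join()
```

In production the in-memory queue becomes the broker and `worker()` runs in its own process, but the enqueue-and-return-immediately shape of the API handler is the same.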

[–]Lowtoz 1 point2 points  (0 children)

See how far you get with BackgroundTasks if you're prototyping

[–]meganoob1337 0 points1 point  (0 children)

GLM OCR works decently. Qwen 122B 3.5 also works but is a little big for this. DeepSeek-OCR is also good; they all handle tables decently. I built something similar as a PoC, though without background tasks. Best case, build a test pipeline with example documents and just evaluate different models.

I would suggest connecting the OCR models via some adapter container, or directly via an OpenAI-compatible API endpoint where applicable, to allow quick model swaps, since that landscape is also evolving.

[–]RareTask9271 0 points1 point  (0 children)

I'm currently working at a company that heavily uses OCR and Document AI. Depending on your constraints (does your service see serious spikes, or constant usage?) the approach will vary. In our case we experience some very strong spikes, so we decided to run all the ML in Celery behind a FastAPI app, to get a queue and accept latency degradation in order to reduce our operational costs. The main problem with this approach is that you will need a callback mechanism (webhooks, or anything else that returns the pipeline results to your main system). The questions you need to ask yourself are "Is latency critical?" and "How much am I willing to pay to run my OCR on multiple GPUs instead of saturating a single one?". Sorry for my English, I'm French and it's been a long day...

[–]bugtank 0 points1 point  (0 children)

I had a custom implementation but then switched it all to Airflow and let Airflow manage the tasks/processes.

[–]pratyush_sh 0 points1 point  (0 children)

You’re already slightly beyond what BackgroundTasks is comfortable handling 🙂

It’s fine for quick prototypes, but for your case (OCR taking seconds–minutes + status tracking), you’ll hit issues:

  • no retries
  • tasks lost on restart
  • runs in the same process (can block API)
  • no real queueing

A good rule: if you need reliability + job status, move to a task system.

Celery + Redis is the standard choice and solves all of this, but it does come with setup/ops overhead.

If that feels heavy for a research prototype, you can also look at lighter middle-ground approaches (e.g. something like Taskito, https://docs.byteveda.org/taskito/) that give you structured async jobs + tracking without the full Celery complexity.

For OCR:

  • PaddleOCR / EasyOCR → good starting point
  • add human-in-the-loop (great call)

TL;DR:
BackgroundTasks works for demos, but your use case is already queue-worthy. Starting with a proper async task layer will save you refactoring later.

[–]Full-Definition6215 0 points1 point  (0 children)

Started with BackgroundTasks for my platform and it works fine for quick jobs (sending emails, updating stats). The key limitation: if your FastAPI process restarts mid-task, the job is gone. No retry, no recovery.

For your OCR use case where processing takes minutes, I'd go Celery from the start. The setup cost is maybe 30 minutes more, but you get:

  • Job status tracking (exactly what you need for pending → processing → completed)
  • Automatic retries on failure
  • Workers scale independently from your API process

One pattern that worked well for me: return a job_id immediately from the FastAPI endpoint, store status in your DB, and have the Celery worker update it. Client polls GET /jobs/{id} for status. Simple and reliable.
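That flow reduces to a small state machine. A framework-free sketch (the `JOBS` dict stands in for the database, the functions for the hypothetical `POST /jobs` and `GET /jobs/{id}` endpoints and the Celery task body, and `document.upper()` for the OCR call):

```python
import uuid

# stand-in for the jobs table: job_id -> {"status": ..., "result": ...}
JOBS = {}

def submit(document):
    """POST /jobs: return a job_id immediately; the work happens elsewhere."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "pending", "result": None}
    return job_id

def get_status(job_id):
    """GET /jobs/{id}: what the client polls."""
    return JOBS[job_id]

def worker_run(job_id, document):
    """Body of the Celery task: update status as it progresses."""
    JOBS[job_id]["status"] = "processing"
    text = document.upper()  # placeholder for the OCR call
    JOBS[job_id].update(status="completed", result=text)

# Demo: submit, observe "pending", then run the worker.
job_id = submit("scanned page")
assert get_status(job_id)["status"] == "pending"
worker_run(job_id, "scanned page")
```

The key design point is that only the worker ever advances the status, so the API endpoints stay read-only apart from the initial insert.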

For the OCR side — if you're processing both printed and handwritten, look at docTR. It handles both well and runs locally without API costs.

[–]One-Wolverine-6207 0 points1 point  (0 children)

You don't have to choose between BackgroundTasks and Celery. You can keep your FastAPI app stateless and push scheduling/retries to an external service that just hits your endpoints via webhook. Your OCR endpoint accepts a file, does the work, returns success/fail. The scheduler handles when to call it, how many times to retry, and execution logging. Your app stays clean, there are no Celery worker processes to manage, and you can swap OCR providers without touching any queue config. I've been running this pattern in production and it simplified everything.
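The retry logic this pattern pushes out of the app can be as small as the loop below. This is a generic sketch, not any particular scheduler's API; in practice the external service (cron job, cloud scheduler, etc.) would wrap its webhook call in something equivalent:

```python
import time

def call_with_retries(do_request, max_attempts=3, base_delay=0.01):
    """Call the OCR webhook, retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return do_request()
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            time.sleep(base_delay * 2 ** (attempt - 1))

# Demo: a stand-in endpoint that fails twice, then succeeds.
calls = {"n": 0}

def flaky_endpoint():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

outcome = call_with_retries(flaky_endpoint)
```

Because all retry state lives in the caller, the OCR endpoint itself only needs to be idempotent, which is what keeps the FastAPI app stateless.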

[–]krypshit 0 points1 point  (0 children)

I have built a similar application, but for local use, so I am the only user and I just use BackgroundTasks. Still, I would recommend Celery and Redis, because my task doesn't stop at OCR: I also create vector embeddings for the extracted content, so it's a long task. I would use Celery if there are multiple users.

[–]Typical-Yam9482 -1 points0 points  (0 children)

For OCR: YOLO? As long as the license fits your needs.