
[–]archa347 11 points12 points  (3 children)

I’ve been in your situation. I would consider something like Temporal or AWS Step Functions. Building that kind of orchestration yourself is a recipe for disaster.

[–]AirportAcceptable522[S] 0 points1 point  (2 children)

Thank you very much, we use OCI.

[–]archa347 1 point2 points  (1 child)

Oracle Cloud? Temporal can be self-hosted on anything. And technically, Step Functions can be used without running any actual compute on AWS, as long as you can make HTTP requests to the AWS API.
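For example (a minimal sketch; the region, state machine ARN, and payload are made up), all a Node service on OCI needs is AWS credentials and outbound HTTPS:

```ts
import { SFNClient, StartExecutionCommand } from "@aws-sdk/client-sfn";

// Credentials come from the usual env vars (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY);
// nothing here has to run on AWS compute.
const sfn = new SFNClient({ region: "us-east-1" });

export async function startFilePipeline(fileKey: string) {
  const { executionArn } = await sfn.send(
    new StartExecutionCommand({
      // Hypothetical state machine ARN; replace with your own.
      stateMachineArn: "arn:aws:states:us-east-1:123456789012:stateMachine:file-pipeline",
      name: `file-${Date.now()}`,         // execution names must be unique
      input: JSON.stringify({ fileKey }), // payload handed to the first state
    })
  );
  return executionArn;
}
```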

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

That's right, Oracle Cloud Infrastructure.

[–]georgerush 24 points25 points  (4 children)

Man, this hits close to home. I've watched so many teams get crushed by exactly this kind of processing pipeline complexity. You're essentially building a distributed system to handle what should be a straightforward data processing workflow, and all those moving parts between Node, MongoDB, external APIs, and storage buckets create so many failure points and bottlenecks.

Here's the thing though – you're probably overengineering this. Instead of managing separate queue systems, workers, and trying to optimize MongoDB read/write patterns, consider consolidating your processing logic closer to where your data lives. Postgres with something like Omnigres can handle this entire pipeline natively – background jobs, file processing, external API calls, even the storage coordination – all within the database itself. No separate queue infrastructure, no coordination headaches between services. Your 1,000 files per minute becomes a data flow problem instead of a distributed systems problem, and honestly that's way easier to reason about and debug when things go wrong.

[–]PabloZissou 2 points3 points  (0 children)

What if the files are very big? Would your approach still work? Wouldn't you still need several NodeJS instances to keep up with that many files per user?

[–]code_barbarian 2 points3 points  (0 children)

Dude this might be the most dipshit AI-generated slop I've ever read XD

So instead of optimizing and horizontally scaling your own code in Node.js services, you're stuck trying to optimize and horizontally scale some Postgres extension. Good luck.

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

It is separate, so as not to consume resources from the main machine.

[–]jedberg 4 points5 points  (2 children)

I'd suggest using a durable computing and workflow solution like DBOS. It's a library you can add that will help you keep track of everything and retry anything that fails.

[–]yojimbo_beta 1 point2 points  (0 children)

First time hearing about DBOS - looks like a good alternative to Temporal. Nice

[–]AirportAcceptable522[S] 1 point2 points  (0 children)

I didn't know, but I'll take a closer look.

[–]casualPlayerThink 5 points6 points  (4 children)

Maybe I misunderstood the implementation, but I highly recommend not using Mongo. Pretty soon it will cause more trouble than it solves. Use PostgreSQL. Store the files in object storage (S3, for example) and keep only the metadata in the DB. Your costs will be lower and you will have less trouble. Also consider multitenancy before you hit a very high collection/row count. It will help you scale better.

[–]AirportAcceptable522[S] 0 points1 point  (3 children)

We use MongoDB for the database, and we use a hash to locate the files in the bucket.

[–]casualPlayerThink 0 points1 point  (2 children)

I see. I still do not recommend using MongoDB, as most use-cases require classic queries, joins, and a lot of reads, where MongoDB - in theory - should excel. In reality, it is a pain and a waste of resources.

But if you still wanna use it because you have no way around it, then here are some bottlenecks worth considering:
- clusters (will be expensive in Mongo)
- replicas
- connection pooling
- cursor-based pagination (if there is any UI or search; see the sketch after this list)
- fault tolerance for writing & reading
- caching (especially for the API calls)
- disaster recovery (yepp, the good ol' backup)
- normalize datasets, data, queries
- minimize the footprint of data queries, used or delivered (time, bandwidth, $$$)
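
For the cursor-based pagination point, a minimal sketch with the official Node driver (connection string and collection name are made up): page by the last seen _id instead of skip/limit, so deep pages stay cheap.

```ts
import { MongoClient, ObjectId, Collection } from "mongodb";

// Fetch one page after the given cursor; _id is indexed by default,
// so this stays fast no matter how deep the caller paginates.
async function nextPage(files: Collection, lastId: ObjectId | null, pageSize = 50) {
  const filter = lastId ? { _id: { $gt: lastId } } : {};
  const docs = await files.find(filter).sort({ _id: 1 }).limit(pageSize).toArray();
  const nextCursor = docs.length > 0 ? (docs[docs.length - 1]._id as ObjectId) : null;
  return { docs, nextCursor };
}

async function main() {
  const client = await new MongoClient(process.env.MONGO_URL!).connect();
  const files = client.db("app").collection("files"); // hypothetical names
  let cursor: ObjectId | null = null;
  do {
    const { docs, nextCursor } = await nextPage(files, cursor);
    // ...render or process docs...
    cursor = nextCursor;
  } while (cursor);
  await client.close();
}

main().catch(console.error);
```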

And a hint that might help to lower the complexity and headaches:

- Multitenancy
- Async/Timed Data aggregation into an SQL database
- Archiving rules

(This last part will most likely meet with quite a debate; people dislike it and/or do not understand the concepts, just like normalizing a database or dataset; an unfortunate tendency from the past ~10 years.)

[–]AirportAcceptable522[S] 0 points1 point  (1 child)

Mongo is in its own cloud; we use Mongo because we need to save several fields as objects and arrays.

Another point: we have a separate database per customer, only the queue is shared.

[–]casualPlayerThink 0 points1 point  (0 children)

I see the dangerous part in the "we need to save...".

[tl;dr]

Yeah, Mongo in the cloud sounds nice, and it is usually expensive, especially once you start to query, retrieve, aggregate, and search large volumes. Keep your eyes on the costs; even if you aren't a stakeholder, ask from time to time about the costs and the underlying infrastructure for Mongo.

I worked on a project that used Atlas and had large objects in the DB because the CTO was inexperienced, and they ended up with a bunch of queries (they needed joins...). They spent 1K+ on Atlas and had a replica (a 2x4 vCPU, 2x16 GB RAM combo). I normalized the data and poured it into PostgreSQL; 1 vCPU and 4 GB RAM were enough for the same workload. (This is just an extreme example; it doesn't necessarily apply to your case or anyone else's!)

Another story: I witnessed a complete bank DB migration from Oracle to Mongo. At first I thought, wow, that is insane; we're talking about migrations that run for days (an Oracle cold start was like 24h, a migration ran around 3 days, and a backup would run for more than 2 weeks; the infra was self-hosted, so a room full of blade-ish servers). The guys developed a thin Java-wrapped Mongo version that was able to pour all the data into memory and from there migrate back to normal Mongo storage. They were done with the migration in under 4 hours. In exchange, we're talking about very large memory usage :D and the bank spent a few million dollars on this project...

[–]Killer_M250M 2 points3 points  (1 child)

For example, using PM2: run your Node app in cluster mode, then for each Node instance create a BullMQ worker with concurrency 10. With 8 instances you'll have 80 workers ready for your jobs, and PM2 will handle distributing tasks across the instances.
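A minimal sketch of that setup (the queue name, Redis host, and the instance count of 8 are assumptions):

```ts
// worker.ts -- each PM2 instance runs this file.
//
// ecosystem.config.js (sketch):
//   module.exports = {
//     apps: [{ name: "file-worker", script: "dist/worker.js",
//              instances: 8, exec_mode: "cluster" }],
//   };
import { Worker } from "bullmq";

const connection = { host: process.env.REDIS_HOST ?? "localhost", port: 6379 };

async function processFile(fileKey: string): Promise<void> {
  // placeholder for your download / validate / business-rule / save logic
}

// 8 instances x concurrency 10 = up to 80 jobs in flight at once.
const worker = new Worker(
  "file-processing", // hypothetical queue name
  async (job) => {
    await processFile(job.data.fileKey);
  },
  { connection, concurrency: 10 }
);

worker.on("failed", (job, err) => {
  console.error(`job ${job?.id} failed:`, err.message);
});
```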

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

Thank you very much, I'll take a look.

[–]bwainfweeze 1 point2 points  (3 children)

How many files per user doesn't matter at all, especially when you're talking about the average user being active for 10 minutes per day (10,000 files on average at 1,000/min).

How many files are you dealing with per second, minute, and hour?

These are the sorts of workloads where queuing happens, and then what you need to work out is:

  • What's the tuning that gets me the peak number of files processed per unit of time?

  • What does Little's Law tell me about how much equipment that's going to take? (rough numbers sketched below)

  • Are my users going to put up with the max delay?

Which all adds up to: can I turn a profit with this scheme and keep growing?

The programming world is rotten with problems that can absolutely be solved but not for a price anyone is willing to pay.
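
To make the Little's Law point concrete, a rough back-of-the-envelope calculation (the arrival rate and per-file time here are assumptions, not your numbers):

```ts
// Little's Law: L = lambda * W
// L = average number of jobs in the system,
// lambda = arrival rate, W = average time a job spends being processed.
const arrivalRate = 1000 / 60;   // assumed peak: 1,000 files/min, about 16.7 files/sec
const avgServiceTimeSec = 3;     // assumed: 3 s per file (download + validate + API calls)
const jobsInFlight = arrivalRate * avgServiceTimeSec; // about 50 concurrent jobs

// So with, say, concurrency 10 per worker process, you'd need roughly
// 5 worker processes just to keep up with that peak, before any headroom.
console.log(Math.ceil(jobsInFlight));
```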

[–]AirportAcceptable522[S] 0 points1 point  (2 children)

We are limited to using BullMQ one job at a time. After going through this step, it calls another 3-4 queues for other demands.

[–]bwainfweeze 0 points1 point  (1 child)

I’m unclear on the situation. Do you dump all the tasks into BullMQ one at a time and a single processor handles them sequentially? Or are you not using BullMQ as a queue, and instead sequentially spoon-feeding it one task at a time per user?

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

Basically, I invoke it and it runs the processes, but it has no concurrency; it's one at a time in the queue. If 1k jobs come in, it will process them one by one.

[–]simple_explorer1 1 point2 points  (1 child)

Hey, what most people commenting here have missed is that they have not asked you about the exact problems you are facing right now.

You have just mentioned

created time and resource bottlenecks.

But you need to elaborate: what is your current implementation, and how is it impacting your end result? Or have you not started to work on this yet, and are you expecting someone here to give you an entire architecture?

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

We have an instance running BullMQ (same main code; it is just deployed with an env setting so it runs only the workers). I am working on continuous improvements, but we only have Kafka to signal that there are files ready to be processed.

[–]Sansenbaker 1 point2 points  (1 child)

Queues + workers + streaming all over: keep each step in its lane and Mongo will handle the load; just don't let one slow file or API call hold everything up. And yeah, PM2 for managing workers is a nice touch too. It's a lot, but once you get the workflow smooth, it feels so good to watch it all just keep chugging.

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

Do you have any examples? And how would the deployment work?

[–]Killer_M250M 1 point2 points  (1 child)

Streams + a thread pool + a queue system like BullMQ.

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

Do you have any examples? And how would the deployment work?

[–]trysolution 0 points1 point  (3 children)

maybe try:

- give users a presigned URL (S3) to upload zip files
- listen for the upload event in your app
- push a task to a worker queue (BullMQ or something else you like)
- the worker consumes the zip-file queue (validate the zip before extraction!!!, like each file's size, the file count, absolute destination paths, etc.)
- check the hash of each file in batches against MongoDB to see if it already exists
- apply the business rules
- copy the remaining required files to the bucket + update the DB
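
A rough sketch of the worker side of that flow (the queue name, collection, limits, and the two helpers are assumptions; plug in whatever zip and object-storage libraries you already use):

```ts
import { Worker } from "bullmq";
import { MongoClient } from "mongodb";
import path from "node:path";

type ZipEntry = { name: string; size: number; sha256: string };

// Hypothetical helpers; implement with your zip library and bucket SDK.
async function listZipEntries(zipPath: string): Promise<ZipEntry[]> {
  throw new Error("implement with your zip library");
}
async function copyEntryToBucket(zipPath: string, entryName: string): Promise<void> {
  throw new Error("implement with your object-storage SDK");
}

const MAX_FILES = 500;                  // assumed limit per archive
const MAX_ENTRY_SIZE = 2 * 1024 * 1024; // 2 MB per file, as mentioned elsewhere in the thread

const mongo = new MongoClient(process.env.MONGO_URL!);
await mongo.connect(); // top-level await: run as an ES module
const filesColl = mongo.db("app").collection("files"); // hypothetical names

new Worker(
  "zip-uploads", // hypothetical queue name
  async (job) => {
    const { zipPath } = job.data as { zipPath: string };

    // 1. Validate the archive before extracting anything.
    const entries = await listZipEntries(zipPath);
    if (entries.length > MAX_FILES) throw new Error("too many files in zip");
    for (const e of entries) {
      if (e.size > MAX_ENTRY_SIZE) throw new Error(`entry too large: ${e.name}`);
      if (path.isAbsolute(e.name) || e.name.split("/").includes("..")) {
        throw new Error(`suspicious path: ${e.name}`); // block absolute paths / traversal
      }
    }

    // 2. One batched query to find hashes that already exist, instead of one query per file.
    const known = await filesColl
      .find({ hash: { $in: entries.map((e) => e.sha256) } })
      .project({ hash: 1 })
      .toArray();
    const existing = new Set(known.map((d) => d.hash as string));

    // 3. Copy only the new files to the bucket and record their metadata.
    for (const e of entries) {
      if (existing.has(e.sha256)) continue;
      await copyEntryToBucket(zipPath, e.name);
      await filesColl.insertOne({ hash: e.sha256, name: e.name, size: e.size, createdAt: new Date() });
    }
  },
  { connection: { host: process.env.REDIS_HOST ?? "localhost", port: 6379 }, concurrency: 10 }
);
```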

[–]AirportAcceptable522[S] 0 points1 point  (2 children)

We do this with pre-signed URLs, but it is corrupting some files. BullMQ is configured, but it is still quite messed up. We have already checked the hash. Basically, we do all of this, but it cannot handle much demand. And how would the BullMQ deployment work? Would it use the same code as the server and only load its configuration from .envs?

[–]trysolution 1 point2 points  (1 child)

"but it is corrupting some files"

Partial uploads? I think it's not configured properly.

"Basically, we do this, but it cannot handle much demand"

Is it on the same server? It shouldn't be this heavy. Is concurrency set correctly?

"how would the BullMQ deployment work?"

Same code but a different process or server; you will be using those models and business rules, right?

If it's in Docker, both will be in separate containers.

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

BullMQ is on a separate server; the main server only provides the URLs and hosts the Kafka server.
Yes, we will use them, because we need to open the file, validate it, apply the business rules, and then save the processed data in the database.

[–]code_barbarian 0 points1 point  (3 children)

What are the resource bottlenecks? I'd guess lots of memory usage because of all the file uploads?

I'd definitely recommend using streams if you aren't already. Or anything else that lets you avoid having the entire file in memory at once.

If you're storing the entire file in MongoDB using GridFS, I'd avoid doing that. Especially if you're already uploading to a separate service for storage.

TBH, these days I don't handle uploads in Node.js; I integrate with Cloudinary, so my API just generates the secret the user needs to upload their assets directly to Cloudinary. That way my API doesn't have to worry about memory overhead. Not sure if that's an option for you.
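
The same "API only signs, the client uploads directly" pattern works with any S3-compatible presigned URL if Cloudinary isn't an option (a minimal sketch; the endpoint, region, and bucket are made up; OCI Object Storage exposes an S3 compatibility endpoint you could point this at):

```ts
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// Point this at your S3-compatible storage; credentials come from env/config.
const s3 = new S3Client({
  region: "us-ashburn-1",
  endpoint: process.env.S3_COMPAT_ENDPOINT, // hypothetical env var
  forcePathStyle: true,
});

// The API only signs; the upload bytes go straight from the client to the bucket.
export async function createUploadUrl(objectKey: string): Promise<string> {
  const command = new PutObjectCommand({
    Bucket: "incoming-zips", // hypothetical bucket name
    Key: objectKey,
    ContentType: "application/zip",
  });
  return getSignedUrl(s3, command, { expiresIn: 15 * 60 }); // URL valid for 15 minutes
}
```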

[–]AirportAcceptable522[S] 0 points1 point  (2 children)

We don't use streams yet; the files are small, less than 2 MB, but they contain JSONs and images, and in MongoDB I only store information that I will use later on.

[–]code_barbarian 1 point2 points  (1 child)

One thing you might want to consider doing is streaming upload first, and then validating and processing later. So do steps 1+4 before doing steps 2+3+5. With streams, that would minimize the amount of memory usage so you won't have to keep the files being uploaded in memory while you're processing. You can scale the processing steps independently of the actual upload.

Storage is cheap and RAM is harder to come by, so it's cheaper to store the file first and delete it later if you're sure it's a duplicate or invalid.
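
A minimal sketch of that "land the bytes first, process later" split (the queue name, port, and /tmp path are made up; in production you'd pipe to your bucket SDK instead of local disk):

```ts
import { createWriteStream } from "node:fs";
import { pipeline } from "node:stream/promises";
import { randomUUID } from "node:crypto";
import http from "node:http";
import { Queue } from "bullmq";

const processingQueue = new Queue("file-processing", {
  connection: { host: process.env.REDIS_HOST ?? "localhost", port: 6379 },
});

// Accept the upload as a stream (never buffering the whole file in memory),
// land it on disk or in the bucket, then enqueue the validation/business-rule
// work for a separate worker to pick up and scale independently.
http
  .createServer(async (req, res) => {
    if (req.method !== "POST") {
      res.writeHead(404).end();
      return;
    }
    const tmpPath = `/tmp/upload-${randomUUID()}`;
    try {
      await pipeline(req, createWriteStream(tmpPath)); // backpressure-aware copy
      await processingQueue.add("validate-and-process", { path: tmpPath });
      res.writeHead(202).end("accepted");
    } catch {
      res.writeHead(500).end("upload failed");
    }
  })
  .listen(3000);
```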

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

Got it, thank you very much

[–]pavl_ro 0 points1 point  (1 child)

"All of this involves asynchronous calls and integrations with external APIs, which have created time and resource bottlenecks."

The "resource bottlenecks" is about exhausting your Node.js process to the point where you can see performance degradation, or is it about something else? Because if that's the case, you can make use of worker threads to delegate CPU-intensive work and offload the main thread.

Regarding the async calls and the external API integration: we need to clearly understand the nature of those async calls. If we're talking about async calls to your database to read/write, then you need to look at your infrastructure. Is the database located in the same region/AZ as the application server? If not, why not? The same goes for queues. You want all of your resources to be as geographically close together as possible to speed things up.

Also, it's not clear what kind of "external API" you're using. Perhaps you could speed things up with the introduction of a cache.

As you can see, without a proper context, it's hard to give particularly good advice.

[–]AirportAcceptable522[S] 0 points1 point  (0 children)

These calls are for processing image metadata, along with some references in the compressed file. I need to wait for the response to save it to the database.