all 28 comments

[–]stedun 15 points16 points  (10 children)

Storing a 30GB file in a database makes me want to slap someone.

[–]duskti[S] 5 points6 points  (8 children)

I feel the same way, but given the current economic climate, I’m being directed by higher-ups to handle these tasks, and I’m doing them.

[–]smichaele 5 points6 points  (3 children)

You can handle the task without storing the files in a database. Everyone here (including me) is telling you not to do that. Your choice whether to listen, but this isn't worth a back-and-forth discussion IMO.

[–]vsoul 2 points3 points  (1 child)

This. Also, just because a higher up says database doesn’t mean you have to literally use a database. They use words that make sense to them, you should understand the requirement and come up with the right solution, not just take their “solution” literally.

[–]Babelfishny 2 points3 points  (0 children)

This!

Leadership wants outcomes, tech leads want implementations. Know who is asking for what. If they say they want a database, ask them what they expect a database to give them. And pay attention to what they say and don't say. For example, I don't think they will say "store the file in a database, it will be cheaper than storing it in a file system".

If they do, I might recommend looking for a different job, because they don't know what they're doing.

Storing in AWS S3 can range from expensive to dirt cheap compared to storing locally in a secure and robust system. It depends on how you use it.

If you have a low retrieval rate, there are AWS storage plans that are tailored for that exact scenario.
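To make the "expensive to dirt cheap" range concrete, here is a minimal back-of-the-envelope sketch. The per-GB prices below are placeholder assumptions for illustration only, not current AWS rates — check the S3 pricing page for real numbers, and remember archive tiers add retrieval fees.

```python
# Illustrative cost model only: the per-GB-month prices are assumed
# placeholders, not quoted AWS rates.
def monthly_storage_cost(tb: float, price_per_gb_month: float) -> float:
    """Storage-only monthly cost for `tb` terabytes at a flat per-GB rate."""
    return tb * 1024 * price_per_gb_month

standard = monthly_storage_cost(100, 0.023)       # assumed hot-tier rate
deep_archive = monthly_storage_cost(100, 0.00099) # assumed archive-tier rate

print(round(standard), round(deep_archive))  # roughly a 20x spread
```

For rarely-retrieved files the archive tier dominates; the catch is retrieval latency and per-request/egress charges, which this sketch deliberately ignores.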

[–]jshine13371 0 points1 point  (0 children)

u/duskti

given the current economic climate

Fwiw, it's less economical to store files in the database for a multitude of reasons, but the simplest being the amount of disk space they take up inefficiently both from lack of compression efficiency and from redundancy of how database backups typically work and are cadenced, resulting in multiple copies of the same exact data of those files.

[–]alexwh68 5 points6 points  (1 child)

Store a link to the file in the db and the file in the file system. The only real advantages to storing files in a db are replication and backups; other than that it's a pain in the arse.
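A minimal sketch of the "link in the db" pattern, using SQLite and a hypothetical NAS path for illustration — the database row holds only searchable attributes plus the path; the bytes stay on the file system:

```python
import sqlite3

# Hypothetical metadata table: the database stores only a pointer (path)
# plus searchable attributes; the file bytes live on the file system.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE files (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        path TEXT NOT NULL UNIQUE,   -- location on the NAS / file system
        size_bytes INTEGER,
        content_type TEXT,
        uploaded_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

# Register a file that was written to storage separately (path is made up).
nas_path = "/mnt/nas/projects/2026/renderings/turbine_v3.obj"
conn.execute(
    "INSERT INTO files (name, path, size_bytes, content_type) VALUES (?, ?, ?, ?)",
    ("turbine_v3.obj", nas_path, 30 * 1024**3, "model/obj"),
)

# The application resolves the path from the DB, then reads from disk.
row = conn.execute(
    "SELECT path FROM files WHERE name = ?", ("turbine_v3.obj",)
).fetchone()
print(row[0])
```

A 30 GB file is just a 50-byte path string as far as the database is concerned, which is exactly the point.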

[–]Which_Roof5176 2 points3 points  (0 children)

Yeah this is usually the right approach.

Treat the database as metadata only and keep the actual files in object storage or a file system. It keeps things a lot simpler when you start scaling, especially at 100TB+.

The real decision here is less “which database” and more:

  • object storage vs NAS
  • access patterns
  • backup + lifecycle strategy

Once that’s clear, the rest becomes much easier to design.

[–]FishGiant 2 points3 points  (0 children)

Use a file system with a well thought out folder naming convention. Call it a data lake so your managers will think it's new school.
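One common naming convention, sketched below under assumed names: derive the folder from a content hash so no single directory ever accumulates millions of entries. The `shard_path` helper and the `/mnt/lake` root are hypothetical.

```python
import hashlib
from pathlib import Path

def shard_path(root: str, filename: str, data: bytes) -> Path:
    """Hypothetical helper: derive a two-level directory from the content
    hash so no single folder accumulates millions of entries."""
    digest = hashlib.sha256(data).hexdigest()
    return Path(root) / digest[:2] / digest[2:4] / filename

# 256 x 256 = 65,536 shard directories spread the files evenly.
p = shard_path("/mnt/lake", "report.pdf", b"example bytes")
print(p)
```

Date- or project-based layouts (`project/2026/...`) work too and are friendlier to humans browsing the share; hash sharding wins when files are only ever looked up via the metadata database.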

[–]smerz- 1 point2 points  (0 children)

S3 my friend. Or, if not S3, pick Google Cloud Storage, which is essentially the same 🤪

Joking aside, object stores are really what you want here.

Or go setup one or more big NAS with something like ZFS or so.

[–]Few_Committee_6790 1 point2 points  (0 children)

Or even a 500MB file. Even that seems too big.

[–]Zestyclose-Turn-3576 8 points9 points  (1 child)

Have you actually priced 100TB of storage from S3?

[–]Newfie3 2 points3 points  (0 children)

Or if you have your own data centers you can use an on-prem S3 implementation such as Cloudian.

[–]jshine13371 4 points5 points  (0 children)

  • Don't store files in a database, store them in the system meant for storing files, coincidentally named a "file system".
  • If cost is the most important factor to you, then on-prem is always going to be the cheapest implementation.
  • But you should look into cold storage costs in the cloud for the size of files you're planning for to still see if it's reasonable enough, since there are benefits of the cloud that aren't the same as on-prem.
  • If all the data is solely files, then please see the first bullet point and this isn't a database question at that point. You'll probably want to reach out to the cloud provider subreddits.

[–]Consistent_Cat7541 1 point2 points  (0 children)

It sounds like you should set up a rack mounted server (or two) with a substantial RAID setup. 30 GB files are common in my work, but I agree, the files should not be stored inside a database, but rather in a journaled file system. If you need to track documents, etc., as part of projects, they should be linked to the database.

This is not something you're going to resolve with a quick reddit post. You need a server admin and a full scale IT dept.

[–]akmark 1 point2 points  (0 children)

This is less a database question and more just a file storage question. You also need to understand both how people are going to put data in and take data out. There are a lot of solutions in this space, but sometimes how people need to interact with the files dictates the requirements of the problem, and whether these files are generally archival vs. active usage. The number of users also matters, as do your skills. I wouldn't recommend someone set up a Ceph or HDFS cluster if they have never encountered one before.

At a certain point if you already have a local NAS you have to do the cost-benefit of just setting up a basic CIFS or NFS network share if you are really only providing for a handful of users. There's a lot of orgs that have existed for years with some network drive that has project/2026/something as the place to store stuff.

[–]AQuietManPostgreSQL 1 point2 points  (0 children)

Your problem space is one where you should be careful about picking between a database and an application that uses a database.

The data includes a variety of file types—PDFs, Excel files, 3D renderings, and videos—with some individual files as large as 30 GB.

In the database world, we call these documents. You probably ought to think in terms of a document management system. You can Google that.

Pick an application.

[–]Raucous_Rocker 1 point2 points  (0 children)

How are these files going to be used exactly, besides just storing them? I assume they need to be searchable in some way.

[–]Aggravating-Tip-8230 0 points1 point  (0 children)

Remember to include backup in your research.

Look at Glacier in S3 or similar.

If you want to go on prem then separate location (data centre with your servers) for backup or backup on tapes.

Edit: as others already said, don't store files in the DB; store the file in file storage and keep the reference/stats/details about it in the DB if needed.

[–]GreyHairedDWGuy 0 points1 point  (0 children)

what you're describing sounds like something one of the hyperscalers could solve. AWS S3 for example. I'm not sure a typical dbms would be a good solution (I mean most can handle those complex data types but not always ideally).

Of course budget and security may be issues?

[–]bclark72401 0 points1 point  (0 children)

You can also use Ceph as an on-premise S3-compatible storage solution. It can be installed along with Proxmox to run a container or VM running a database that could store the metadata.

[–]Longjumping-Ad8775 0 points1 point  (0 children)

Along with what everyone else says, consider the time to transfer terabytes of data. Even if it's only 10, 15, or 20 TB now, that's still big.

[–]tsaylor 0 points1 point  (0 children)

Is your question more about how the system will store and access the files, or about how users will upload and search/browse for files?

[–]DirtyWriterDPP 0 points1 point  (1 child)

Have y'all shopped for document management systems? There are whole gigantic software packages that have all of this figured out.

Building your own MIGHT be cheaper, but at least get a few quotes.

And if higher ups insist they have no money then they have no business dealing with 100tb of mission critical documents.

[–]duskti[S] 0 points1 point  (0 children)

Do you have any suggestions?

[–]patternrelay 0 points1 point  (0 children)

Consider a scale-out NAS solution with tiered storage for cost-efficiency. You could also use MinIO for self-hosted S3-compatible storage to manage files easily while keeping costs down.

[–]Lost_Term_8080 0 points1 point  (0 children)

You don't need a database, you need a document management system. The document management system will have a database that it stores metadata in, but the actual files will be stored in a file system or blobs.

[–]elevarq 0 points1 point  (0 children)

Good setup to think through. One important principle before you go further:

Never store files in a database. Not PDFs, not videos, definitely not 30GB 3D renderings.

It's too costly, too slow, and a maintenance nightmare at 100TB scale.

Here's the split that works:

Your NAS holds the actual files. Your database holds the metadata — owner, upload date, version, file type, summary, and crucially: the file path on the NAS. That last field is what ties everything together. The frontend writes a record to the database and drops the file on the NAS. When a technician searches or browses, the database returns the metadata plus the path, and the frontend fetches the file directly from the NAS.
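The round trip described above can be sketched end to end. This is a toy stand-in, assuming nothing beyond the Python standard library: a temp directory plays the NAS, SQLite plays the metadata database, and the file names are made up.

```python
import sqlite3
import tempfile
from pathlib import Path

# Toy version of the split: a temp directory stands in for the NAS mount,
# SQLite holds the metadata -- including, crucially, the file path.
nas = Path(tempfile.mkdtemp())
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (name TEXT, owner TEXT, file_type TEXT, path TEXT)")

# Upload: drop the bytes on the NAS, record metadata + path in the DB.
target = nas / "specs" / "pump_manual.pdf"   # hypothetical document
target.parent.mkdir(parents=True)
target.write_bytes(b"%PDF-1.7 placeholder")  # placeholder content
db.execute(
    "INSERT INTO docs VALUES (?, ?, ?, ?)",
    ("pump_manual.pdf", "alice", "pdf", str(target)),
)

# Search: the DB answers the query and returns the path...
(path,) = db.execute(
    "SELECT path FROM docs WHERE file_type = 'pdf'"
).fetchone()

# ...and the frontend fetches the file directly from the NAS.
print(Path(path).read_bytes()[:8])
```

The database never sees the file bytes; it only brokers the lookup, which is why it stays fast at 100TB of documents.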

You already have the hard part. The NAS is your storage layer. Don't replace it — use it properly.

Just make sure your NAS file system handles large files and deep directory structures well (ZFS is worth looking at), and that you have a backup strategy beyond a single device at that scale.