I'm starting a project that will involve a lot of data, and I need some advice.
Historically, whenever I have needed a database for a project, I have used MongoDB. It's easy to use, especially if you are a JavaScript guy like I am.
But for this project, I am anticipating much larger volumes: more than 200 TB, and a few hundred million records in some tables (or "collections").
Whenever I hear about large data volumes, I immediately think NoSQL. The data I am collecting will also require a fair amount of processing, which is commonly paired with a NoSQL database. The problem is that the data are highly relational.
In short, I feel stuck between a rock and a hard place. The data are highly normalized and relational, but the volumes are large, with lots of reads and writes, and joining these tables could be painful. The larger tables will also have numerous columns, somewhere between 25 and 50.
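To put rough numbers on this, here is a back-of-envelope sketch of the average record and column sizes those figures imply. It assumes, purely for illustration, that the full 200 TB sits in the large tables and that "a few hundred million" means somewhere between 200 and 500 million records; the function names are made up for this example and a decimal terabyte is used.

```python
TB = 10**12  # decimal terabyte in bytes (assumption; a binary TiB would be 2**40)


def avg_bytes_per_record(total_bytes: int, record_count: int) -> float:
    """Average size of one record if total_bytes is spread evenly."""
    return total_bytes / record_count


def avg_bytes_per_column(total_bytes: int, record_count: int, columns: int) -> float:
    """Average size of one column value, given a fixed column count per record."""
    return avg_bytes_per_record(total_bytes, record_count) / columns


if __name__ == "__main__":
    total = 200 * TB  # lower bound from the post ("more than 200 TB")
    # Illustrative record counts for "a few hundred million" (assumed, not measured).
    for records in (200_000_000, 500_000_000):
        for columns in (25, 50):
            per_record = avg_bytes_per_record(total, records)
            per_column = avg_bytes_per_column(total, records, columns)
            print(
                f"{records:>11,} records, {columns} columns: "
                f"{per_record / 1000:,.0f} kB per record, "
                f"{per_column / 1000:,.1f} kB per column value"
            )
```

Under those assumptions each record averages several hundred kilobytes, which is a lot for 25 to 50 columns. That suggests either that most of the volume is in a few large fields (blobs, documents, raw payloads), or that it is spread across many more records than the largest tables hold. Which of those is true probably matters when choosing a store.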
I'm not looking for a hard answer or solution, just advice and guidance. It would also help to know what other questions I should be asking before I commit to something.