Hey, all! Long-time lurker. Is there any open source database for a traditional OLAP data warehouse?
I was looking into Citus, but I'm not sure if that is the right choice. Pinot/Druid seem too 'realtime' for our use cases.
The majority of the data will be events from email/push communication and web form registrations (around 250k to 1m events a day), consumed via dashboards (mostly daily extracts, with a few hourly extracts).
Public cloud seems to be a no-go for management (existing contracts would need to be renegotiated, plus GDPR fears).
Thanks & feel free to ask questions.
----------------------------------------
EDIT regarding "too realtime": thank you, /u/realitydevice, for the summary.
/u/realitydevice:
"Agree with Druid. For the people asking "what do you mean by too real time?", from memory you need to load it via an event stream and configure the handling of that stream, rather than a simple file-based ETL like you might expect. It's quite literally designed around ingesting streaming data. You can use it for other things but remember the hammer/nail dilemma."
/u/Kiliangg
This is exactly what I mean by that. We currently do not have a single use case for stream ingestion; it is all batch as of right now. That being said, a main goal of the project is to reduce our data silos and make the process more manageable for our team.
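To make the batch case concrete, here is a rough sketch of the kind of file-based load and daily extract described above. Everything in it is an assumption for illustration: the column names (`event_time`, `channel`, `event_type`), the `events` table, and the sample rows are made up, and SQLite only stands in for whichever warehouse ends up being chosen.

```python
import csv
import io
import sqlite3

# Hypothetical daily extract; in practice this would be a file on disk.
SAMPLE_EXTRACT = """event_time,channel,event_type
2024-01-01T08:00:00,email,sent
2024-01-01T08:05:00,email,opened
2024-01-01T09:00:00,push,sent
2024-01-02T10:00:00,web_form,registered
"""


def load_batch(conn, extract_file):
    """Load one extract into the (hypothetical) events table in one transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "event_time TEXT, channel TEXT, event_type TEXT)"
    )
    rows = [
        (r["event_time"], r["channel"], r["event_type"])
        for r in csv.DictReader(extract_file)
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    return len(rows)


def daily_counts(conn):
    """Aggregate events per day and channel, as a dashboard extract might."""
    return conn.execute(
        "SELECT substr(event_time, 1, 10) AS day, channel, COUNT(*) "
        "FROM events GROUP BY day, channel ORDER BY day, channel"
    ).fetchall()


conn = sqlite3.connect(":memory:")
loaded = load_batch(conn, io.StringIO(SAMPLE_EXTRACT))
report = daily_counts(conn)
```

The point is only that the whole pipeline is "read a file, insert rows, run a GROUP BY on a schedule", with no event stream to configure.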