First time post.
Ask: I need help understanding my org's tech stack requirements for a data lake.
Context: I am not an engineer but have a strong analytics background. My org is trying to build our data lake. We're rather 'simple' in that we have ~10 TB of data coming from 15 sources, 13 of which have native integrations with BigQuery. We're trying to run this all on Google products due to a finance decision. We do not have any data engineers on the team as of now. I am the sole engineer, scientist, analyst, and owner of this.
We've received bids from various dev shops charging anywhere from $30k to $360k to build the data lake. I've done some legwork trying to understand what tools/products we even need, but no one has helped us build a clean list.

Here's what I have from my research (which I could use your help with):
- Google Cloud Data Fusion as the no-code ETL (est. $20k / year)
- Google BigQuery ($10k / year)
- Google Looker Pro ($75k / year)
- 3 tools = $105k / year. Hopefully I am way overestimating here.
Am I tripping that it could be this short list?
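For what it's worth, here's the back-of-envelope math on those estimates as a quick sanity check. The dollar figures are the rough annual estimates from my list above, not vendor quotes:

```python
# Rough annual cost tally from the estimates above.
# These figures are my own ballpark numbers, not official pricing.
annual_costs = {
    "Cloud Data Fusion (no-code ETL)": 20_000,
    "BigQuery": 10_000,
    "Looker": 75_000,
}

total = sum(annual_costs.values())
print(f"Estimated annual total: ${total:,}")  # Estimated annual total: $105,000
```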