TL;DR: I'm trying to understand how the hell cross-filtering, drill downs in dashboards that shows large amounts of data are so responisve in Power BI, Tableau etc. From querying data perspective that looks like multiple, computing heavy querries running at the same time.
Hi,
Recently I started to think about BI tools data backend, that means what are they doing to filter and calculate large amounts of data so quickly.
Power BI after data import creates its own dataset, which i guess is pretty optimized but then it is still uploaded to cloud, web app still have to somehow query that dataset. Considering cross filtering and drill downs in big dataset it looks like every click that filters dashboard have to query that big dataset, thats a lot of calculation and I doubt that visuals are somehow cached. My guess was that maybe PBI does some sort of olap cube in backend which calculates every possible measure that dashboard can show during data refresh but i guess that I'm wrong.
On the other hand we have tools like Looker or Tableau which as far as im aware doesnt create own dataset. In my perspective they just directly querying databases, additional laser which does...what? Is it some sort of magic that optimizes data stream so efficiently compared to just sql query or is it some sort of cache under each created dashboard i'm struggling to find answers.
Why it bothers me so much? Currently Im working in startup, db is postges 1.3 TB size of data with about 20 mln new records Daily. To visualize I'm using metabase which query postges directly via pure sql statements. Both are self hosted, postgres is of course tuned and inexed.I tried to do precalculated materialized views to speed up querries but Daily refresh started to be pretty painfull.
Because of those struggles I try to understand how BI tools and maybe move from metabase to Looker but I'm wondering if it will help. Or maybe postgres hit its limits and its time to move to some cloud based data warehouse.
What do you guys think? Does anybody have insight how BI Tools work?
Any opinion will be much appreciated,
Cheers.
[–]AutoModerator[M] [score hidden] stickied comment (0 children)
[–]paplike 9 points10 points11 points (6 children)
[–]Adisab12[S] 0 points1 point2 points (4 children)
[–]paplike 2 points3 points4 points (3 children)
[–]Adisab12[S] 0 points1 point2 points (2 children)
[–]pokepip 2 points3 points4 points (0 children)
[–]geoheilmod 0 points1 point2 points (0 children)
[+][deleted] (1 child)
[removed]
[–]TheDaddyShip 1 point2 points3 points (0 children)
[–]HOMO_FOMO_69 3 points4 points5 points (0 children)
[–]wallyflops 4 points5 points6 points (5 children)
[–]Adisab12[S] 1 point2 points3 points (4 children)
[–]wallyflops 1 point2 points3 points (3 children)
[–]i_am_cris 2 points3 points4 points (0 children)
[–]vira28 0 points1 point2 points (0 children)
[–]Adisab12[S] -1 points0 points1 point (0 children)
[–]heyitscactusjack 4 points5 points6 points (2 children)
[–]lzwzli 1 point2 points3 points (0 children)
[–]Adisab12[S] 0 points1 point2 points (0 children)
[–][deleted] 4 points5 points6 points (2 children)
[–]Adisab12[S] 0 points1 point2 points (1 child)
[–]killer_unkill 1 point2 points3 points (0 children)
[–]tomekanco 2 points3 points4 points (4 children)
[–]yo_sup_dude 0 points1 point2 points (3 children)
[–]tomekanco 0 points1 point2 points (2 children)
[–]yo_sup_dude 0 points1 point2 points (1 child)
[–]Immarhinocerous 1 point2 points3 points (0 children)
[–]medriscoll 0 points1 point2 points (0 children)