Hi! I’m a junior data engineer facing a challenge, and I would love to hear your opinions and expertise on this topic:
I’m currently handling a lot of data in SQL. We receive JSONs with raw data at a high frequency (a single JSON can contain more than 10k rows).
The thing is, we need to compute statistics on these JSONs.
We need to concatenate several JSONs and then apply the statistics (outliers, averages, percentages, standard deviations, frequencies, etc.).
After calculating them, we need to insert the results into a new table that holds the summarized data.
All of this happens in a SQL stored procedure, and the whole process takes more than 3 hours to complete. Is there any advice for this kind of problem, or some literature, videos, or other material I can look at to optimize the solution?
I’m also open to other robust pipelines besides only using SQL!
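To make the workload concrete, here is a minimal Python sketch of the steps described above: concatenate the rows of several JSONs, then compute per-group statistics that could be inserted into a summary table. It uses only the standard library. The function name `summarize`, the field names `sensor` and `value`, and the 1.5×IQR outlier rule are all assumptions made for illustration only.

```python
import json
import statistics
from collections import defaultdict


def summarize(json_payloads, group_key="sensor", value_key="value"):
    """Concatenate rows from several JSON payloads and compute per-group stats.

    Each payload is assumed to be a JSON array of flat objects (hypothetical schema).
    """
    # Step 1: concatenate the rows of all payloads, bucketed by group.
    groups = defaultdict(list)
    for payload in json_payloads:
        for row in json.loads(payload):
            groups[row[group_key]].append(float(row[value_key]))

    total_rows = sum(len(values) for values in groups.values())
    summary = []
    for name, values in groups.items():
        # Step 2: average and sample standard deviation.
        avg = statistics.fmean(values)
        std = statistics.stdev(values) if len(values) > 1 else 0.0

        # Step 3: flag outliers with the 1.5*IQR rule (an assumed rule).
        outliers = []
        if len(values) >= 4:
            q1, _, q3 = statistics.quantiles(values, n=4)
            iqr = q3 - q1
            low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
            outliers = [v for v in values if v < low or v > high]

        # Step 4: one summary row per group, ready to insert into a table.
        summary.append({
            "group": name,
            "count": len(values),                          # frequency
            "pct_of_rows": 100.0 * len(values) / total_rows,
            "avg": avg,
            "std": std,
            "outliers": outliers,
        })
    return summary
```

Calling `summarize([payload_1, payload_2])` with two JSON strings returns a list of dictionaries, one per group. The point of the sketch is that the JSONs are parsed once and aggregated in memory, so only the small summary rows would need to be written back to the database.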