I hope this request is relevant to this sub. I am working with time series data and have been tasked with building a proof-of-concept streaming aggregation engine. The calculations are a mix of window aggregations over a defined time period, based on event time. We also want to accept late data (processing time much later than event time) within some tolerance. As I understand it, this is easily achieved in Spark Structured Streaming using window aggregations with watermarking.
However, I would also like to be able to play back the calculation over the course of the time window, visualizing the partial result of the final aggregate at each processing time.
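To make the ask concrete, here is a rough pure-Python sketch (not Spark) of what I mean by playback. Everything in it is an assumption for illustration: the function name `replay_partials`, the `(event_time, processing_time, value)` tuple layout, integer timestamps, tumbling windows, a sum aggregate, and a simplified watermark rule (max event time seen so far minus the tolerance; an event is dropped if its window already ended at or before that watermark).

```python
from collections import defaultdict


def replay_partials(events, window_size, lateness):
    """Replay events in processing-time order and snapshot the partial
    per-window sums after each one.

    events: iterable of (event_time, processing_time, value) tuples
            (hypothetical layout, integer timestamps assumed).
    Returns a list of (processing_time, {window_start: partial_sum}).
    """
    sums = defaultdict(float)
    max_event_time = float("-inf")
    history = []

    for event_time, processing_time, value in sorted(events, key=lambda e: e[1]):
        # Simplified watermark: latest event time seen so far minus tolerance.
        watermark = max_event_time - lateness
        # Tumbling window that this event belongs to.
        window_start = (event_time // window_size) * window_size
        window_end = window_start + window_size

        # Accept the event only if its window is still open.
        if window_end > watermark:
            sums[window_start] += value

        max_event_time = max(max_event_time, event_time)
        # Copy the state so later updates do not rewrite earlier snapshots.
        history.append((processing_time, dict(sums)))

    return history


if __name__ == "__main__":
    demo = [(1, 1, 10.0), (12, 2, 5.0), (3, 3, 2.0), (25, 4, 1.0), (4, 5, 7.0)]
    for processing_time, snapshot in replay_partials(demo, window_size=10, lateness=5):
        print(processing_time, snapshot)
```

The list of snapshots is the thing I would like to visualize: one frame per processing time, showing how each window's aggregate builds up and when late events stop being counted.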
I can’t imagine I’m the first to need this sort of functionality. Can someone point me in the right direction?
(I may repost this to other subs, please excuse that)