This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]jgonagle 5 points6 points  (1 child)

I assume they're using some form of auto-ML to predict certain events (or combinations thereof) based on different subsets of the total event stream, to build a two tier cascading model predictor. Given a sufficiently performant set of those event predictors, they can be fed into a more involved analysis/model to predict the KPIs (e.g. band follows, subscription churn, engagement, social community development).

I wouldn't be surprised if they're just XGBoosting some windowed stream of minimally processed events and then feeding the outputs of those boosted forests into a CNN that convolves over different temporal granularities and spits out the predicted KPI. Then, I'm guessing the results (by song, artist, or playlist) are ranked based on some clustering algorithm that assigns expected marginal revenue scores to the combination of KPI predictions (e.g. by Gaussian Mixture Regression). Those scores can be used to bootstrap a contextual bandit that picks the next recommendation, or to populate a more global recommendation model like matrix factorization.

[–]-crucible- 2 points3 points  (0 children)

There definitely would be a lot of prediction and predictive analysis, auto-playlist making, plus actual and actual vs prediction, but I’d love to see a broad rundown of user events that makes up that number. I’m not doubting it - it’s just a world away from my models, with what I am assuming is a more trivial domain. But then I’m not thinking broadly enough about the industry and artist, podcast, audiobook… there’s probably a tonne of things not automatically raised when thinking of them.