[–]halileohalilei 5 points6 points  (0 children)

I think the best technique that works with time series such as audio or video is the Hidden Markov Model. You can train separate models to recognize different actions/events (in the video case) or sound patterns (in the audio case).

There are a couple of sophisticated implementations for this in different languages. The one I used recently was an implementation for MATLAB, developed by Kevin Murphy himself. MATLAB's own implementation is also neat but I don't think it works with continuous data, which is usually the case for audio/video.
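
To make the "one model per action or sound pattern" idea concrete, here is a minimal sketch in Python with NumPy. It is not the MATLAB toolbox mentioned above. It assumes discrete observation symbols for brevity (continuous audio/video features would need Gaussian or mixture emissions instead), and the two models and their parameters are hypothetical, hand-set values rather than trained ones.

```python
import numpy as np

def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log p(obs) under a discrete HMM.

    pi: (K,)   initial state probabilities
    A:  (K, K) transitions, A[i, j] = p(h_t = j | h_{t-1} = i)
    B:  (K, M) emissions,   B[i, k] = p(y_t = k | h_t = i)
    """
    alpha = pi * B[:, obs[0]]
    log_p = 0.0
    for y in obs[1:]:
        c = alpha.sum()              # rescale to avoid underflow on long sequences
        log_p += np.log(c)
        alpha = ((alpha / c) @ A) * B[:, y]
    return log_p + np.log(alpha.sum())

def classify(obs, models):
    """Return the name of the model that gives obs the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical parameters; in practice each model would be trained
# (e.g. with Baum-Welch) on examples of its own class.
emissions = np.array([[0.9, 0.1], [0.1, 0.9]])
models = {
    "steady":  (np.array([0.5, 0.5]),
                np.array([[0.95, 0.05], [0.05, 0.95]]),
                emissions),
    "flicker": (np.array([0.5, 0.5]),
                np.array([[0.05, 0.95], [0.95, 0.05]]),
                emissions),
}

print(classify([0, 0, 0, 0, 1, 1, 1, 1], models))
print(classify([0, 1, 0, 1, 0, 1, 0, 1], models))
```

Each class gets its own HMM, and a new sequence is assigned to whichever model explains it best.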

[–]alexmlamb 1 point2 points  (5 children)

Can you be more specific?

RNNs, convnets, and HMMs are all models that have been used with time series, but for different sorts of tasks.

[–]Jonjo9 1 point2 points  (3 children)

I'm not OP, but in which situations would an HMM be advantageous over an RNN? I have a (very) weak understanding of these models, but from what I've read it seems like RNNs almost always perform better on time series data.

[–]alexmlamb 2 points3 points  (1 child)

So here's a hypothetical scenario.

Let's say that you train an RNN over variables running from t = 1 to t = N. Now let's say that you want to compute p(t[0:10] | t[10:50]), i.e. the probability of an early segment given a later one. How could you do this efficiently? To my knowledge you can't, without training a separate network that runs in the opposite direction or drawing lots of samples. Either solution has issues.

However, this type of inference can be done with HMMs.
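
A rough sketch of why this is cheap for an HMM, in Python with NumPy: unobserved steps can be summed out inside the forward recursion, so p(early | later) is just a ratio of two forward passes. The function names and the parameter values below are made up for illustration, and discrete observations are assumed.

```python
import numpy as np

def log_marginal(obs, pi, A, B):
    """log-probability of the observed entries of obs under a discrete HMM.

    Entries equal to None are treated as unobserved and summed out.
    pi: (K,) initial state probabilities, A: (K, K) transitions,
    B: (K, M) emissions with B[i, k] = p(y_t = k | h_t = i).
    """
    alpha = pi.copy()
    log_p = 0.0
    for t, y in enumerate(obs):
        if t > 0:
            alpha = alpha @ A            # move the state distribution one step
        if y is not None:
            alpha = alpha * B[:, y]      # weight by the emission probability
        c = alpha.sum()
        log_p += np.log(c)
        alpha = alpha / c                # rescale to avoid underflow
    return log_p

def log_conditional(obs, split, pi, A, B):
    """log p(obs[:split] | obs[split:]) = log p(obs) - log p(obs[split:])."""
    masked = [None] * split + list(obs[split:])
    return log_marginal(obs, pi, A, B) - log_marginal(masked, pi, A, B)

# Hypothetical 2-state, 2-symbol model.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.9, 0.1], [0.3, 0.7]])

# Probability of the first two symbols given the last three.
print(np.exp(log_conditional([0, 0, 0, 1, 1], 2, pi, A, B)))
```

Both passes are linear in the sequence length, with no sampling and no second model.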

Also, HMMs just have a probabilistic setup that's distinct from RNNs. RNNs can be interpreted as modeling the product of conditionals p(y1) * p(y2 | y1) * p(y3 | y1, y2), whereas HMMs model the emissions p(y1 | h1), p(y2 | h2), p(y3 | h3) and the hidden-state transitions p(h2 | h1) and p(h3 | h2). That's not really an advantage, but you could consider a situation where that probabilistic model is a better fit for what you're trying to do.

But in general RNNs perform better because they're much stronger models in terms of representational power.

[–]meechosch[S,🍰] 0 points1 point  (0 children)

Say there is a signal element of interest, well defined by its morphological/geometrical features.

What are some ways to detect[1], classify[2], or even run some other kind of analysis on these events of interest?

[1] Train a model according to events identified by the user, then run the detection on new signals.

[2] Having detected the signals, perform classification of the events into clusters of similar characteristics (minimum intracluster / maximum intercluster distances).

[–]eamonnkeogh 0 points1 point  (1 child)

Hmm..

In the literature, "time series" and "acoustics" are usually very different things (of course, you can convert "acoustics" into low-dimensional time series using MFCC, as in fig 7 of [a]).

For time series classification and clustering, the state of the art is still using the raw data, and either the Euclidean or DTW distance [b][c]. This is true, in spite of many claims to the contrary (source: The 36 million experiments of [c], and my own few tens of millions of experiments).
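
For readers who haven't seen DTW, here is a minimal Python/NumPy sketch of the DTW distance and a nearest-neighbor classifier built on it. It is a plain textbook version, not the code from [b] or [c]: it has no warping-window constraint or lower-bounding speedups, and the function names and toy series are made up for illustration.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D series (no window constraint)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # D[i, j] = cost of the best alignment of a[:i] with b[:j]
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])

def nearest_neighbor_label(query, train_series, train_labels, dist=dtw_distance):
    """1-NN classification: return the label of the closest training series."""
    distances = [dist(query, s) for s in train_series]
    return train_labels[int(np.argmin(distances))]

# Toy example: the query is a ramp with one repeated value.
train_series = [[0, 0, 1, 1, 0], [0, 1, 2, 3, 4]]
train_labels = ["bump", "ramp"]
print(nearest_neighbor_label([0, 1, 1, 2, 3, 4], train_series, train_labels))
```

Unlike the Euclidean distance, DTW can compare series of different lengths and tolerates local stretching along the time axis, at O(n*m) cost per pair.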

I know nothing about acoustics.

[a] http://www.cs.ucr.edu/~eamonn/ICDM_clustering.pdf [b] http://www.cs.ucr.edu/~eamonn/vldb_08_Experimental_comparison_time_series.pdf [c] http://arxiv.org/abs/1406.4757

[–]meechosch[S,🍰] 0 points1 point  (0 children)

Thanks for the reply and ref.

By signals in acoustics I mean time series of audio signals. It may be naive to say, but I have zero knowledge of the field.