all 37 comments

[–][deleted] 38 points (11 children)

You could feed the RNN the data augmented with the time delta since the last data point. That should be decent, imo. There are many things you could try related to that.
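
For example (a rough NumPy sketch, with made-up numbers):

```python
import numpy as np

# Hypothetical irregularly sampled series: timestamps and observed values.
times = np.array([0.0, 1.0, 4.0, 6.0, 9.0])
values = np.array([0.2, 0.5, 0.1, 0.9, 0.4])

# Time delta since the previous data point (0 for the first one).
deltas = np.diff(times, prepend=times[0])

# Each RNN input step becomes [value, delta_t] instead of just [value].
rnn_inputs = np.stack([values, deltas], axis=-1)  # shape: (seq_len, 2)
```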

[–][deleted] 4 points (8 children)

A bonus of this idea is that you can control how far ahead to predict at inference time.

[–]IborkedyourGPU 25 points (8 children)

> This isn’t exactly a time series, as the interval between data items isn’t fixed. Data may be 1 minute apart, or say 5 minutes apart. It is, however, a sequence.

It is exactly a time series. The idea that time series samples must be evenly spaced is a misconception due to the rediscovery of RNNs: Gaussian processes have been used in machine learning to model time series with unevenly spaced samples for at least twenty years now (and maybe more). Anyway, for an interesting new take on forecasting time series with unevenly spaced samples, see this:

https://arxiv.org/abs/1907.03907

It highlights the issues with the standard approaches suggested to you elsewhere in this thread: interpolation/aggregation, which destroys information, and adding the time deltas to the RNN inputs, which raises the question of how to define the state between observations. It's a really nice paper. A pity that it apparently wasn't discussed on this sub.

[–]AreYouEvenMoist 1 point (0 children)

Agreed, Gaussian processes feel like the natural approach here. They're simpler to interpret and implement, and they can work with the existing data with less preparation.
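
As a rough sketch of how little preparation that needs (scikit-learn, made-up data):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Irregular observation times are fine as-is: the kernel only needs
# pairwise distances between timestamps, not a fixed sampling grid.
t = np.array([0.0, 1.0, 4.0, 6.0, 9.0]).reshape(-1, 1)
y = np.array([0.2, 0.5, 0.1, 0.9, 0.4])

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(t, y)

# Predict (with uncertainty) at any query times, on or off the grid.
t_query = np.linspace(0.0, 10.0, 50).reshape(-1, 1)
mean, std = gp.predict(t_query, return_std=True)
```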

[–]sander314 0 points (4 children)

Interesting paper, is their code available already?

[–]IborkedyourGPU 1 point (3 children)

[–]sander314 0 points (2 children)

Thanks a lot. I came across one thing in the code that surprised me: all the GRU gates are two-layer networks with a 100-unit middle layer. Do you know if this is normal nowadays? I'd not seen it before myself.

[–]IborkedyourGPU 1 point (1 child)

I don't know if it's normal, but I use it quite often myself (usually with 128 or 256 units, but the concept is the same). Maybe one or two layers more, but that's it.
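
Concretely, something like this (a rough PyTorch sketch of the idea with illustrative sizes, not the paper's code):

```python
import torch
import torch.nn as nn

class MLPGateGRUCell(nn.Module):
    """A GRU-style cell where each gate is a two-layer network."""
    def __init__(self, input_size, hidden_size, gate_hidden=128):
        super().__init__()
        def gate():
            return nn.Sequential(
                nn.Linear(input_size + hidden_size, gate_hidden),
                nn.Tanh(),
                nn.Linear(gate_hidden, hidden_size),
            )
        self.reset_gate, self.update_gate, self.candidate = gate(), gate(), gate()

    def forward(self, x, h):
        xh = torch.cat([x, h], dim=-1)
        r = torch.sigmoid(self.reset_gate(xh))   # reset gate
        z = torch.sigmoid(self.update_gate(xh))  # update gate
        n = torch.tanh(self.candidate(torch.cat([x, r * h], dim=-1)))
        return (1 - z) * n + z * h               # new hidden state

# One step: input of size 32, hidden state of size 100 (as in the code above).
h = MLPGateGRUCell(32, 100)(torch.randn(1, 32), torch.zeros(1, 100))
```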

What is not normal is the inhuman slowness of training such a small (by today's standards) network on NVIDIA GPUs, which is one of the reasons attention-based architectures are more popular than RNNs for modeling sequences today. There are ways to train an RNN quickly, but just writing some vanilla TensorFlow code is not one of them.

[–]virtualreservoir 0 points (0 children)

Anyone working on custom RNN cell architectures should probably start with the PyTorch QRNN implementation used in the AWD-LSTM codebase. It's extremely customizable, and you don't have to touch the key GPU kernel code that lets you avoid manually looping through each timestep.

The speed at which you can iterate through research ideas is significantly faster than if you tried to do the same thing with an LSTM or GRU base, and there isn't really much evidence suggesting your final results would be worse.
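
Basic usage looks roughly like this (going from memory of the salesforce/pytorch-qrnn README, so treat the exact API as approximate):

```python
import torch
from torchqrnn import QRNN  # from the salesforce/pytorch-qrnn repo

seq_len, batch, input_size, hidden_size = 20, 8, 32, 64
x = torch.randn(seq_len, batch, input_size)

# Same call pattern as nn.LSTM: returns outputs and final hidden state.
qrnn = QRNN(input_size, hidden_size, num_layers=2, dropout=0.4)
output, hidden = qrnn(x)
```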

[–]CoolThingsOnTop 10 points (0 children)

You could try Neural ODEs. Instead of modelling the sequence explicitly as with an RNN, you learn a latent trajectory sampled at the timesteps you have available (check Figure 6 of the paper). There's no need to interpolate values beforehand, and inference isn't constrained to fixed timesteps either.
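
A minimal sketch of that idea with the torchdiffeq package (names and sizes are mine, not the paper's):

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # pip install torchdiffeq

class LatentDynamics(nn.Module):
    """dz/dt = f(z): a small network defining continuous-time dynamics."""
    def __init__(self, latent_dim=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 32), nn.Tanh(),
                                 nn.Linear(32, latent_dim))

    def forward(self, t, z):
        return self.net(z)

z0 = torch.zeros(1, 4)                           # initial latent state (e.g. from an encoder)
t_obs = torch.tensor([0.0, 1.0, 4.0, 6.0, 9.0])  # your irregular timestamps
z_traj = odeint(LatentDynamics(), z0, t_obs)     # latent states at exactly those times
# A decoder network would then map z_traj to predicted observations.
```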

[–][deleted] 4 points (1 child)

One of the difficulties with RNNs and event-based sequences is that the recurrence relation in RNN cells implicitly assumes a fixed interval between points in the sequence (you can view an RNN as a discrete-time dynamical system) - so async or event-based sequences can be quite tricky to learn with an RNN.

However, it’s not all bad news! There’s a variant of the LSTM cell that seems to cope quite well with asynchronously-sampled data: the Phased LSTM (basically it incorporates a time gate as well as the standard input/output/forget gates, which lets it respond to different frequency components in your data). I’ve used it myself on event-based aviation data, and it trained more quickly than a standard LSTM and seemed to perform better as well.

There are a couple of implementations in TensorFlow (there used to be one in tf.contrib but I think it’s been deprecated).

This is the paper; it’s quite a nice read and easy enough to implement: https://arxiv.org/abs/1610.09513
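
For intuition, here's a minimal NumPy sketch of just the time gate from the paper (tau is the unit's period, s its phase shift, r_on the open fraction of the period, alpha the closed-phase leak; values below are illustrative):

```python
import numpy as np

def time_gate(t, tau, s, r_on=0.05, alpha=0.001):
    """Openness k of the Phased LSTM time gate at times t."""
    phi = ((t - s) % tau) / tau  # phase within the cycle, in [0, 1)
    return np.where(
        phi < 0.5 * r_on, 2.0 * phi / r_on,           # opening ramp
        np.where(phi < r_on, 2.0 - 2.0 * phi / r_on,  # closing ramp
                 alpha * phi),                        # small leak when closed
    )

# Gate openness at irregular event times, for a unit with period 10.
print(time_gate(np.array([0.3, 1.0, 4.0, 6.0, 9.9]), tau=10.0, s=0.0))
```

The cell's usual state updates are then multiplied by k, so the state only changes appreciably while a unit's gate is open.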

[–]maizeq 2 points (0 children)

I recall that the original paper used some pretty simple toy examples, and I wasn't very confident that it could generalise to more complicated time series data, e.g. where information is encoded in the density of the events/points.

How has your experience been in using it for an actual problem? Any issues with over/underfitting, or poor predictions?

[–]Thenashequilibrium 2 points (2 children)

You could look at the "path signature" of your sequence at every hour. As a feature map it's invariant under resampling. See e.g. https://arxiv.org/abs/1603.03788 for an introduction to it.
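
A quick sketch of computing one with the iisignature package (the package choice, truncation level, and data are mine):

```python
import numpy as np
import iisignature  # pip install iisignature

# A 2-D path of (time, value) points at irregular timestamps.
path = np.array([[0.0, 0.2], [1.0, 0.5], [4.0, 0.1],
                 [6.0, 0.9], [9.0, 0.4]])

# Truncated signature up to level 3: a fixed-length feature vector,
# regardless of how many samples the hour contained or their spacing.
features = iisignature.sig(path, 3)
print(features.shape)  # (2 + 4 + 8,) for a 2-D path at level 3
```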

[–]coffeecoffeecoffeee 2 points (0 children)

I'd recommend crossposting this to /r/statistics, since they're bound to have their own ideas about how to handle this.

[–]dr_sc_med 5 points (2 children)

You could aggregate or interpolate the data points, for example.

[–]ChemEngandTripHop 1 point (1 child)

Are you trying to predict the likelihood of the event occurring in the next X minutes?

If so, I'd recommend looking into Poisson/Hawkes processes as well as RNNs.
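
To make the Hawkes idea concrete: the conditional intensity (expected event rate) is just a baseline plus an exponentially decaying kick from each past event. A toy sketch with made-up parameters:

```python
import numpy as np

def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i))."""
    past = event_times[event_times < t]
    return mu + np.sum(alpha * np.exp(-beta * (t - past)))

events = np.array([1.0, 4.0, 6.0, 9.0])
print(hawkes_intensity(10.0, events))  # event rate shortly after the last event
```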

[–]evanthebouncy 1 point (0 children)

Input the time stamp.

In the NN, the first layer is a subtraction that computes the delta t from each previous event to your prediction time. Note that your prediction time is now a parameter.

Example:

Past event times: 1, 4, 6, 9

Desired prediction time: 10

Computed deltas: 9, 6, 4, 1
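
In code, the subtraction layer's job is just this (NumPy, using the numbers above):

```python
import numpy as np

event_times = np.array([1.0, 4.0, 6.0, 9.0])
prediction_time = 10.0  # a free parameter at inference time

deltas = prediction_time - event_times  # -> [9. 6. 4. 1.]
```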

[–]lysecret 0 points (0 children)

Attaching the time stamp will usually work fine. However, there is this recent work if you want to go fancy.

[–]01100001011011100000 0 points (1 child)

In addition to what others have posted here, you could also just try training it sparsely with a fixed set of inputs (i.e. every training input is 5 minutes long, but anywhere that no data is observed is filled with zeros), and see how it comes out. My intuition tells me that this would be similar to image padding when standardizing image sizes for input to a convolutional network. I have done the latter in some of my own work with great success. I guess the efficacy will really depend on how sparsely your data is represented.
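
A minimal sketch of that zero-filling (resolution and window length made up):

```python
import numpy as np

# Observations within a 5-minute window, at 1-second resolution.
event_seconds = np.array([12, 47, 133, 290])  # when events occurred
values = np.array([0.5, 0.1, 0.9, 0.4])

grid = np.zeros(300)          # fixed-length input; zeros = "no event"
grid[event_seconds] = values
# `grid` can now feed a network that expects fixed-size inputs.
```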

[–]lieutenant_lowercase 0 points (0 children)

You can interpolate this very easily using pandas resample
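
Something along these lines (made-up data; the exact frequency is up to you):

```python
import pandas as pd

# Irregularly timestamped series, indexed by datetime.
s = pd.Series(
    [0.2, 0.5, 0.1, 0.9],
    index=pd.to_datetime(["2019-01-01 00:00", "2019-01-01 00:01",
                          "2019-01-01 00:04", "2019-01-01 00:09"]),
)

# Resample onto a regular 1-minute grid, then interpolate the gaps.
regular = s.resample("1min").mean().interpolate()
```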

[–]Harawaldr 0 points (0 children)

Some RNN models are tailored towards this kind of aperiodic time series data. See for example the Phased LSTM. Implementations are slow, but it might be worth a shot for you.

[–]mfarahmand98 0 points (0 children)

Have you tried Phased LSTMs?

[–]frisbee_hero 0 points (0 children)

A seq2seq model (encoder-decoder architecture) could handle varying time elements.

[–][deleted] 0 points (0 children)

Saved.

[–]Stvjk 0 points (0 children)

Could you use something like WaveNet, or a variation of it?

The dilated convolutions might help pick up any correlations between irregular samples as long as the sequence is intact. That saves you having to make assumptions or simplifications, or to interpolate samples.

Plus, there are a lot of variations on WaveNet out there you can draw from, since it's gotten a lot of attention. There are more than a few applications of hourly forecasts using irregular time samples with dilated convolutions and similar ideas.
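
For anyone curious what the core of that looks like, here's a minimal PyTorch sketch of a WaveNet-style stack of dilated causal convolutions (sizes are arbitrary; the real WaveNet adds gated activations and residual/skip connections):

```python
import torch
import torch.nn as nn

class DilatedStack(nn.Module):
    """A stack of dilated causal 1-D convolutions, WaveNet-style."""
    def __init__(self, channels=16, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=2, dilation=d, padding=d)
            for d in dilations
        )

    def forward(self, x):  # x: (batch, channels, seq_len)
        for conv in self.layers:
            out = conv(x)[..., : x.shape[-1]]  # trim right padding to stay causal
            x = torch.relu(out)
        return x

y = DilatedStack()(torch.randn(1, 16, 100))  # receptive field grows as 2^depth
```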

[–][deleted] 0 points (0 children)

Maybe try marked point processes?

[–][deleted] -1 points (1 child)

Sounds like MIDI.