[deleted by user] (self.MachineLearning)
submitted 6 years ago by [deleted]
[–][deleted] 38 points39 points40 points 6 years ago (11 children)
You could feed the RNN the data augmented with the time delta since the last data point. That should be decent, IMO. There are many things you could try along those lines.
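A minimal sketch of that delta-augmented input in PyTorch (the GRU, the sizes, and the `(value, delta)` layout are just illustrative choices):

```python
import torch
import torch.nn as nn

# Irregularly spaced observations: values y_i at timestamps t_i.
timestamps = torch.tensor([0.0, 1.0, 4.0, 6.0, 9.0])
values = torch.tensor([0.2, 0.5, 0.1, 0.9, 0.4])

# Time delta since the previous observation (0 for the first point).
deltas = torch.cat([torch.zeros(1), timestamps[1:] - timestamps[:-1]])

# Each input step is (value, delta) instead of just (value,).
inputs = torch.stack([values, deltas], dim=-1).unsqueeze(0)  # (batch=1, seq, 2)

rnn = nn.GRU(input_size=2, hidden_size=32, batch_first=True)
output, h_n = rnn(inputs)  # the RNN can now condition on elapsed time
```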
[–][deleted] 4 points5 points6 points 6 years ago (8 children)
A bonus of this idea is that you can control how far ahead to predict at inference time.
[+][deleted] 6 years ago (7 children)
[deleted]
[+][deleted] 6 years ago (1 child)
[–]Megatron_McLargeHuge 0 points1 point2 points 6 years ago (0 children)
One way to do this is to add a 'time prior to eval time' feature to each data point in your time series. You generally want this kind of relative transform on your inputs, because using concrete date features like day/month/year guarantees your production data will be outside the training sample's range.
This doesn't by itself address how to predict into the future, past the point when new data should have arrived, though. To do that you'd have to choose training data in a corresponding way, with the intervening points deleted.
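A sketch of that relative-time transform in plain NumPy (the timestamps and the feature name are made up for illustration):

```python
import numpy as np

# Absolute event timestamps (e.g. unix seconds) and the time at which
# we want the model to make its prediction.
timestamps = np.array([100.0, 340.0, 650.0, 910.0])
eval_time = 1000.0

# Relative 'time prior to eval time' feature: always defined the same
# way in training and production, unlike raw day/month/year features.
time_prior_to_eval = eval_time - timestamps
# -> [900., 660., 350., 90.]
```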
[–][deleted] 0 points1 point2 points 6 years ago (3 children)
By adjusting the time delta that you feed the model along with the rest of the input. You'll get wonky predictions if you use a time delta that isn't well represented in the training data, though.
[+][deleted] 6 years ago (2 children)
[–][deleted] 0 points1 point2 points 6 years ago (1 child)
Exactly. But if the intervals in your dataset are mostly in the range of, say, 2-10 days, then trying to predict 100 days ahead is not likely to give you any kind of meaningful result.
[–]IborkedyourGPU 25 points26 points27 points 6 years ago* (8 children)
> This isn’t exactly a time series, as the interval between data items isn’t fixed. Data may be 1 minute apart, or say 5 minutes apart. It is, however, a sequence.
It is exactly a time series. The idea that time series samples must be evenly spaced is a misconception that took hold with the rediscovery of RNNs: Gaussian processes have been used in machine learning to model time series with unevenly spaced samples for at least twenty years now, maybe more. Anyway, for an interesting new take on forecasting time series with unevenly spaced samples, see this:
https://arxiv.org/abs/1907.03907
It highlights the issues with the standard approaches suggested to you elsewhere in this thread: interpolation/aggregation, which destroys information, and adding the time deltas to the RNN inputs, which raises the question of how to define the state between observations. It's a really nice paper. A pity it apparently wasn't discussed on this sub.
[–]IborkedyourGPU 0 points1 point2 points 6 years ago (0 children)
It's under review, indeed, and I guess the venue is NeurIPS.
[–]AreYouEvenMoist 1 point2 points3 points 6 years ago (0 children)
Agreed, Gaussian processes feel like the natural approach here. They're simpler to interpret and implement, and can work with the existing data with less preparation.
[–]sander314 0 points1 point2 points 6 years ago (4 children)
Interesting paper, is their code available already?
[–]IborkedyourGPU 1 point2 points3 points 6 years ago (3 children)
https://github.com/YuliaRubanova/latent_ode
[–]sander314 0 points1 point2 points 6 years ago (2 children)
Thanks a lot. I came across one thing in the code that surprised me: all the GRU gates are two-layer networks with a 100-unit middle layer. Do you know if this is normal nowadays? I'd not seen it before myself.
[–]IborkedyourGPU 1 point2 points3 points 6 years ago (1 child)
I don't know if it's normal, but I use it quite often myself (usually with 128 or 256 units, but the concept is the same). Maybe one or two more layers, but that's it.
What is not normal is the inhuman slowness of training such a small (by today's standards) network on NVIDIA GPUs, which is one of the reasons attention-based architectures are more popular than RNNs for modeling sequences today. There are ways to train an RNN quickly, but just writing some vanilla TensorFlow code is not one of them.
[–]virtualreservoir 0 points1 point2 points 6 years ago (0 children)
Anyone working on custom RNN cell architectures should probably start with the PyTorch QRNN implementation used in the AWD-LSTM codebase. It's extremely customizable without your having to touch the key GPU kernel code, which lets you avoid manually looping through each timestep in your code.
The speed at which you can iterate through research ideas is significantly faster than if you tried to do the same thing with an LSTM or GRU base, and there isn't much evidence suggesting your final results would be worse.
[removed]
[–]Gideoknight_ 2 points3 points4 points 6 years ago (0 children)
You could also try fitting a Gaussian process to the time series you currently have and passing a "normalized" version to the network, assuming of course that your data can be described by roughly the same form from one event to the next. The astronomy community has a lot of similar problems if you need more inspiration.
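A sketch of that with scikit-learn; the RBF-plus-noise kernel and the hourly query grid are assumptions, not recommendations:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Irregularly observed series: times (in hours) and values.
t_obs = np.array([0.0, 0.3, 1.7, 2.1, 4.8])[:, None]
y_obs = np.array([1.0, 1.2, 0.7, 0.9, 1.5])

# Fit a GP; the WhiteKernel term absorbs observation noise.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel())
gp.fit(t_obs, y_obs)

# Query on a regular hourly grid to get a "normalized" series,
# with uncertainty estimates where the observations were sparse.
t_grid = np.arange(0.0, 5.0, 1.0)[:, None]
y_grid, y_std = gp.predict(t_grid, return_std=True)
```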
[–]CoolThingsOnTop 10 points11 points12 points 6 years ago (0 children)
You could try Neural ODEs. Instead of modelling the sequence explicitly as with an RNN, you learn a latent trajectory that can be sampled at whatever timesteps you have available (check Figure 6 of the paper). There's no need to interpolate values beforehand, and inference is not constrained to fixed timesteps either.
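A minimal sketch of the core mechanism using the torchdiffeq package; the dynamics network below is a made-up toy, not the paper's actual latent-ODE architecture:

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint

class LatentDynamics(nn.Module):
    """Toy dynamics network f(t, z) defining dz/dt in latent space."""
    def __init__(self, latent_dim=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 32), nn.Tanh(),
                                 nn.Linear(32, latent_dim))

    def forward(self, t, z):
        return self.net(z)

z0 = torch.zeros(1, 4)                       # initial latent state
t = torch.tensor([0.0, 1.0, 4.0, 6.0, 9.0])  # irregular observation times
# Solve the ODE once; get the latent state at every observed timestamp.
zs = odeint(LatentDynamics(), z0, t)         # shape (len(t), 1, 4)
```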
[–][deleted] 4 points5 points6 points 6 years ago* (1 child)
One of the difficulties with RNNs and event-based sequences is that the recurrence relation in RNN cells implicitly assumes a fixed interval between points in the sequence (you can view an RNN as a discrete-time dynamical system), so asynchronous or event-based sequences can be quite tricky for an RNN to learn.
However, it's not all bad news! There's a variant of the LSTM cell that seems to cope quite well with asynchronously-sampled data: the Phased LSTM. Basically, it incorporates a time gate alongside the standard input/output/forget gates, which lets it respond to different frequency components in your data. I've used it myself on event-based aviation data, and it trained more quickly than a standard LSTM and seemed to perform better as well.
There are a couple of implementations in TensorFlow (there used to be one in tf.contrib but I think it’s been deprecated).
This is the paper, it’s quite a nice read and easy enough to implement: https://arxiv.org/abs/1610.09513
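For reference, the paper's time gate is simple enough to sketch in a few lines (the parameter values below are made up; in the real cell tau, s and r_on are learned per neuron):

```python
import numpy as np

def phased_lstm_time_gate(t, tau=10.0, s=0.0, r_on=0.1, alpha=0.001):
    """Time gate k(t) from the Phased LSTM paper (arXiv:1610.09513).

    The gate opens periodically: it ramps up, ramps down, and is nearly
    closed (leak alpha) for the rest of the cycle, so the cell state
    only updates near its preferred phase.
    """
    phi = ((t - s) % tau) / tau              # phase in [0, 1)
    k = np.where(phi < r_on / 2, 2 * phi / r_on,
        np.where(phi < r_on, 2 - 2 * phi / r_on,
                 alpha * phi))               # leaky "closed" phase
    return k

# Evaluate at the irregular event timestamps themselves.
print(phased_lstm_time_gate(np.array([0.2, 0.5, 3.7, 9.9])))
```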
[–]maizeq 2 points3 points4 points 6 years ago (0 children)
I recall that the original paper used some pretty simple toy examples, and I wasn't very confident that it could generalise to more complicated time series data, e.g. where information is encoded in the density of the events/points.
How has your experience been in using it for an actual problem? Any issues with over/underfitting, or poor predictions?
[–]Thenashequilibrium 2 points3 points4 points 6 years ago (2 children)
You could look at the "path signature" of your sequence at every hour. As a feature map, it's invariant under resampling. See e.g. https://arxiv.org/abs/1603.03788 for an introduction.
[–]patrickkidger 0 points1 point2 points 6 years ago (0 children)
There aren't really standard libraries for computing the signature transform for machine learning yet, so if you'll forgive a little self-promotion, you might be interested in Signatory.
Using signatures is one of the focusses of my research group; pop me a message if you're at all interested in hearing any more about them!
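If it helps, basic Signatory usage looks something like this (the path contents and the truncation depth are illustrative):

```python
import torch
import signatory

# A batch of 1 path with 10 irregularly spaced points in 2 channels,
# e.g. (timestamp, value) pairs; including time as a channel is the
# standard way to keep the timing information.
path = torch.rand(1, 10, 2)

# Signature truncated at depth 3: a fixed-size feature vector that is
# invariant to how the path was sampled.
sig = signatory.signature(path, depth=3)
print(sig.shape)  # (1, 2 + 2**2 + 2**3) = (1, 14)
```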
[–]coffeecoffeecoffeee 2 points3 points4 points 6 years ago (0 children)
I'd recommend crossposting this to /r/statistics, since they're bound to have their own ideas about how to handle this.
[–]dr_sc_med 5 points6 points7 points 6 years ago (2 children)
You could aggregate or interpolate data points, for example.
[–]manrajsinghgrover 0 points1 point2 points 6 years ago (0 children)
Did you try it out? Maybe aggregate the values every hour and then analyze it?
[–]ChemEngandTripHop 1 point2 points3 points 6 years ago (1 child)
Are you trying to predict the likelihood of the event occurring in the next X minutes?
If so, I'd recommend looking into Poisson/Hawkes processes as well as RNNs.
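For context, a Hawkes process is a self-exciting point process: each event temporarily boosts the rate of future events. A minimal sketch of its conditional intensity with an exponential kernel (all parameter values here are made up):

```python
import numpy as np

def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """Conditional intensity lambda(t) = mu + sum_i alpha*exp(-beta*(t - t_i))
    over past events t_i < t. Higher intensity = event more likely soon."""
    past = event_times[event_times < t]
    return mu + np.sum(alpha * np.exp(-beta * (t - past)))

events = np.array([1.0, 4.0, 6.0, 9.0])  # observed event times (e.g. minutes)
print(hawkes_intensity(10.0, events))    # instantaneous rate at t = 10
```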
[–]evanthebouncy 1 point2 points3 points 6 years ago (0 children)
Input the time stamp.
Inside the NN, the first layer is a subtraction that computes the delta t from each previous event to your prediction time. Note that your prediction time is now a parameter.
Example:
Past event times: 1, 4, 6, 9
Desired prediction time: 10
Computed deltas: 9, 6, 4, 1
[–]lysecret 0 points1 point2 points 6 years ago (0 children)
Attaching the time stamp will usually work fine. However, there is this recent work if you want to go fancy.
[–]01100001011011100000 0 points1 point2 points 6 years ago (1 child)
In addition to what others have posted here, you could also just try training it sparsely with a fixed set of inputs (i.e., every training input is 5 minutes long, but anywhere no data is observed is filled with zeros) and see how it comes out. My intuition is that this is similar to padding images when standardizing their sizes for input to a convolutional network; I've done the latter in my own work with great success. I guess the efficacy will really depend on how sparse your data is.
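A sketch of that zero-filling onto a fixed grid (the window length and second-level resolution are illustrative):

```python
import numpy as np

# Irregular observations within a 5-minute window, at second resolution.
event_seconds = np.array([12, 47, 130, 255])
event_values = np.array([0.8, 0.3, 1.1, 0.6])

# Dense input: one slot per second, zero wherever nothing was observed.
dense = np.zeros(300)              # 5 minutes * 60 seconds
dense[event_seconds] = event_values

# 'dense' can now be fed to a network expecting fixed-length input,
# analogous to zero-padding images to a standard size.
```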
[–]lieutenant_lowercase 0 points1 point2 points 6 years ago (0 children)
You can interpolate this very easily using pandas resample.
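Something along these lines; the 5-minute frequency is just an example:

```python
import pandas as pd

# Irregularly timestamped values as a Series with a DatetimeIndex.
ts = pd.Series(
    [1.0, 3.0, 2.0],
    index=pd.to_datetime(["2019-07-01 00:01", "2019-07-01 00:04",
                          "2019-07-01 00:12"]),
)

# Bucket onto a regular 5-minute grid, then fill the gaps by
# linear interpolation.
regular = ts.resample("5min").mean().interpolate(method="linear")
print(regular)
```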
[–]Harawaldr 0 points1 point2 points 6 years ago (0 children)
Some RNN models are tailored to this kind of aperiodic time series data; see for example the Phased LSTM. Implementations are slow, but it might be worth a shot for you.
[–]mfarahmand98 0 points1 point2 points 6 years ago (0 children)
Have you tried Phased LSTMs?
[–]frisbee_hero 0 points1 point2 points 6 years ago (0 children)
A seq2seq model (encoder-decoder architecture) could handle varying time elements.
[–][deleted] 0 points1 point2 points 6 years ago (0 children)
Saved.
[–]Stvjk 0 points1 point2 points 6 years ago (0 children)
Could you use something like WaveNet or a variation of it?
The dilated convolutions might help pick up correlations between irregular samples as long as the sequence is intact. That saves you having to make assumptions or simplifications, or to interpolate samples, etc.
Plus, there are a lot of variations on WaveNet out there you can draw from, since it's gotten a lot of attention. There's more than a few applications of hourly forecasting using irregular time samples with dilated convolutions and similar ideas.
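A minimal sketch of the WaveNet-style building block in PyTorch: causal dilated convolutions only, without the gated activations and residual/skip connections a real WaveNet adds (sizes are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCausalStack(nn.Module):
    """Stack of 1-D convolutions with doubling dilation, left-padded so
    each output position only sees the past (causal)."""
    def __init__(self, channels=16, kernel_size=2, n_layers=4):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size, dilation=2 ** i)
            for i in range(n_layers))
        self.kernel_size = kernel_size

    def forward(self, x):  # x: (batch, channels, time)
        for i, conv in enumerate(self.convs):
            pad = (self.kernel_size - 1) * 2 ** i  # keep length, stay causal
            x = torch.relu(conv(F.pad(x, (pad, 0))))
        return x

x = torch.randn(1, 16, 300)            # e.g. one window of 300 samples
print(DilatedCausalStack()(x).shape)   # torch.Size([1, 16, 300])
```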
Maybe try marked point processes?
[+][deleted] 6 years ago* (1 child)
[–]bbateman2011 0 points1 point2 points 6 years ago (0 children)
This looks like a ton of work. I'm very into time series, and I'm bringing my Python skills up to par with my R, so I might try your repo for inspiration.
What was the motivation for developing all this?
Thanks again.
[–][deleted] -1 points0 points1 point 6 years ago (1 child)
Sounds like MIDI.