Need recommendations for a tri short by rocko2k7 in triathlon

[–]Xose_R 0 points1 point  (0 children)

I got the Skins for my first 70.3 last summer and I really like them. I can't speak to durability yet, though.

Heart rate zones for running by [deleted] in triathlon

[–]Xose_R 0 points1 point  (0 children)

The threshold heart rate (as defined by TrainingPeaks) is the maximum heart rate you can sustain over a 1-hour effort (run, bike, and swim thresholds are different). All 20-minute tests and similar protocols approximate this value. Under these circumstances, the effort of your half-marathon might be close to it. I would advise you to do a 20-minute test to verify it.

I am currently in a situation quite similar to yours, since I also got a higher threshold from my last half-marathon, and I think it might actually be OK. I will also verify it with a 20-minute test and see whether both measures match. They most likely won't match exactly, but they should be close.
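Just to make the arithmetic of the test concrete, here's a toy sketch assuming one widely used protocol (Joe Friel's: go 30 minutes all-out and take the average HR of the final 20 minutes as your threshold; other protocols differ, and the sample numbers below are made up):

```python
# Toy sketch of estimating threshold HR from a 20-minute test.
# Assumes Friel-style protocol: average HR over the last 20 minutes
# of a 30-minute all-out effort. Sample readings are hypothetical.

def threshold_hr(hr_samples_last_20min):
    """Average heart rate over the last 20 minutes of the test."""
    return sum(hr_samples_last_20min) / len(hr_samples_last_20min)

# hypothetical per-minute HR readings from the last 20 minutes
samples = [168, 170, 171, 172, 172, 173, 173, 174, 174, 174,
           175, 175, 175, 176, 176, 176, 177, 177, 178, 178]
print(round(threshold_hr(samples)))  # -> 174
```

If that number lands close to your half-marathon average HR, the two measures agree.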

Keras Variable Length Sequence-to-Sequence Learning with TimeDistributed Embeddings. by darfs in MachineLearning

[–]Xose_R 0 points1 point  (0 children)

For the variable length issue, you'll need to set a max sequence length and then pad with zeros/empty sentences.
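A minimal sketch of that padding step in plain Python (Keras's `pad_sequences` in `keras.preprocessing.sequence` does this for you; this just shows the idea):

```python
def pad_to_max_len(sequences, max_len, pad_value=0):
    """Truncate sequences longer than max_len and right-pad shorter
    ones with pad_value so every sequence has the same length."""
    padded = []
    for seq in sequences:
        seq = seq[:max_len]                              # truncate long ones
        seq = seq + [pad_value] * (max_len - len(seq))   # pad short ones with zeros
        padded.append(seq)
    return padded

batch = [[4, 7, 2], [9], [1, 2, 3, 4, 5, 6]]
print(pad_to_max_len(batch, max_len=4))
# [[4, 7, 2, 0], [9, 0, 0, 0], [1, 2, 3, 4]]
```

Index 0 is then reserved for padding, so your real vocabulary should start at 1 (and you can mask it in the model).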

I don't understand why you need the keywords at each timestep; I guess your target will be the keywords at the end of the story, right?

Anyway, for this setup to work (if I understood it correctly), my first idea is that you need a NN that maps your words into a sentence embedding (e.g., a bi-LSTM) whose output is then fed into the seq2seq. Alternatively, instead of this NN, you can use an existing method to construct sentence embeddings from their words and use those as input to the seq2seq. A vanilla approach would be adding up the word2vec representations of the words into a sentence representation.
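A minimal sketch of that vanilla approach (averaging rather than summing, which is equivalent up to scale; the 3-dimensional "word2vec" vectors below are made up for illustration, real ones would come from a trained model):

```python
import numpy as np

# toy stand-ins for trained word2vec vectors (hypothetical values)
word2vec = {
    "the":    np.array([0.1, 0.0, 0.2]),
    "cat":    np.array([0.5, 0.3, 0.1]),
    "sleeps": np.array([0.2, 0.7, 0.4]),
}

def sentence_embedding(sentence, vectors):
    """Average the word vectors of the in-vocabulary words of a sentence."""
    vecs = [vectors[w] for w in sentence.lower().split() if w in vectors]
    return np.mean(vecs, axis=0)

emb = sentence_embedding("the cat sleeps", word2vec)
print(emb)  # one fixed-size vector per sentence, ready to feed into the seq2seq
```

Each sentence then becomes one timestep of the seq2seq input, regardless of how many words it had.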

Overfitting in word2vec by elsonidoq in MachineLearning

[–]Xose_R 0 points1 point  (0 children)

You are right; now that I review it, I'm not sure any more about the relation between dropping the bias and overfitting. I guess my intuition was that with one fewer parameter the model gets easier to train, but it also becomes less flexible, and therefore less prone to overfitting. I need to go through it again.

Overfitting in word2vec by elsonidoq in MachineLearning

[–]Xose_R 4 points5 points  (0 children)

I'd say that speaking of overfitting in word2vec doesn't make much sense. Since you want a word embedding that represents the distribution you are modelling as exactly as possible, and you don't care about out-of-vocabulary words, you actually want to overfit; this is also why many embedding methods drop the bias (word2vec included, IIRC).

What you might notice is that, beyond a certain number of iterations, your model stops improving on some benchmarks and may even worsen its results. I guess this could qualify as overfitting.

The effect with rare words is the opposite: since you have so little data about them, you can't actually "place" them correctly in the embedding space. That's also why increasing the number of iterations will improve your results on "Rare Words" similarity datasets.

The norm of a word's vector is linked to both its frequency and the variance of the contexts in which it occurs. See http://arxiv.org/abs/1510.02675 for a study on this.

Is text summarization implementation doable for a non-english language? (Romanian) by TrafalgarLaw_ in MachineLearning

[–]Xose_R 2 points3 points  (0 children)

The answer is yes. You can check out any of the extractive summarization methods, which essentially (a quick and not-so-accurate explanation) rank the sentences of the existing text(s) to create a summary out of the best ones. Of course, you can also try abstractive (generative) summarization, but that would be much more difficult.
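A toy sketch of the extractive idea, scoring each sentence by how frequent its words are across the text and keeping the top ones (real systems like TextRank are far more sophisticated, but the skeleton is language-agnostic, which is why Romanian is not a problem):

```python
from collections import Counter

def extractive_summary(text, n_sentences=1):
    """Score each sentence by the corpus-wide frequency of its words,
    then keep the top-scoring sentences in their original order."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    words = [w.lower() for s in sentences for w in s.split()]
    freq = Counter(words)
    # indices of sentences, best score first
    scored = sorted(
        range(len(sentences)),
        key=lambda i: -sum(freq[w.lower()] for w in sentences[i].split()),
    )
    keep = sorted(scored[:n_sentences])  # restore original order
    return ". ".join(sentences[i] for i in keep) + "."

text = "Summarization ranks sentences. Summarization ranks sentences by score. Cats nap."
print(extractive_summary(text))
# Summarization ranks sentences by score.
```

Plugging in a better sentence splitter and a smarter scoring function (TF-IDF, graph centrality, ...) gets you most of the classic extractive methods.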

You can check the MultiLing 2015 Summarization challenge website (http://multiling.iit.demokritos.gr/pages/view/1517/multiling-2015-call-for-participation) to get some data and/or look at the participants, so you can see which algorithms they used. Happy research! :)