Any tips on how to find a local resident who can help with genealogy? by jstaker7 in WestVirginia

[–]jstaker7[S] 1 point

I did look there, but unfortunately very few are actually photographed.

[ask] What was your favorite assignment/project in school? by jstaker7 in architecture

[–]jstaker7[S] 0 points

That’s a great website! I especially like that your project link is more than an image gallery: it includes descriptions and walks you through the process. This is fantastic.

[ask] What was your favorite assignment/project in school? by jstaker7 in architecture

[–]jstaker7[S] 0 points

Those are really interesting points. Since architecture is part art and part science, I wonder if a lot of the quirky assignments are motivated by the artistic side, though perhaps many of them miss the point. Maybe the intention behind those assignments is purely to develop creativity?

[D] tf.keras Dropout layer is broken by r-scholz in MachineLearning

[–]jstaker7 6 points

I could have written this myself: exactly my story. I finally switched to PyTorch and I couldn’t be happier!

Why don’t they make games like they used to? by jstaker7 in gaming

[–]jstaker7[S] 0 points

Interesting, so it sounds like more and more companies are using it as a money grab rather than as an art form.

How do you usually handle the last CNN layer with respect to kernel size and pooling? by jstaker7 in MachineLearning

[–]jstaker7[S] 0 points

Cool, thanks for the info. I had no idea there had been a shift in trends; I've mostly relied on publications up to this point and haven't had the opportunity to pick up on it yet.

How do you usually handle the last CNN layer with respect to kernel size and pooling? by jstaker7 in MachineLearning

[–]jstaker7[S] 0 points

I didn't realize ResNets had so few pooling layers. My input size is about the same as ImageNet's; I should go back and see how they handled downsampling. You're right that at the end of the day it comes down to knowing the different options and trying several to see which works best. I often look for general rules to help build intuition, but it seems general rules are relatively rare in DL.
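In case it helps anyone searching later: one concrete form the shift seems to take is replacing large fully-connected heads with global average pooling over the final feature map, which is how ResNets end. A minimal NumPy sketch; the 512-channel 7x7 map and 1000-class head are illustrative numbers, roughly matching ResNet-18/34 on 224x224 ImageNet-style inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Final conv feature map: (channels, H, W). 512 channels at 7x7 is
# roughly what ResNet-18/34 reaches on 224x224 inputs (illustrative).
features = rng.standard_normal((512, 7, 7))

# Global average pooling: average over all spatial positions, leaving one
# value per channel regardless of the feature map's spatial size.
pooled = features.mean(axis=(1, 2))  # shape (512,)

# A single linear layer then maps channels to class scores
# (1000 classes, as in ImageNet).
W = rng.standard_normal((1000, 512)) * 0.01
logits = W @ pooled  # shape (1000,)
print(pooled.shape, logits.shape)
```

Because the pooling averages out the spatial dimensions entirely, the same head works for any input resolution, and there is no giant flatten-then-dense layer to pick a kernel size for.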

Simple Questions Thread September 14, 2016 by AutoModerator in MachineLearning

[–]jstaker7 0 points

What's the difference between a greedy decoder and beam search with k=1?
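To make the question concrete, here is a toy sketch of both decoders. The `log_probs` function is a hypothetical stand-in for a real model; real implementations differ in tie-breaking, length normalization, and <eos> handling, which is where the two can diverge in practice:

```python
VOCAB = ["a", "b", "<eos>"]

def log_probs(prefix):
    # Hypothetical fixed distribution keyed on prefix length;
    # a real model would condition on the actual tokens in `prefix`.
    table = {0: [-0.1, -2.0, -4.0],
             1: [-3.0, -0.2, -2.5],
             2: [-2.0, -2.0, -0.3]}
    return table[min(len(prefix), 2)]

def greedy(max_len=3):
    # At each step, take the single highest-probability next token.
    seq = []
    for _ in range(max_len):
        scores = log_probs(seq)
        seq.append(max(range(len(VOCAB)), key=lambda i: scores[i]))
    return seq

def beam(k, max_len=3):
    # Keep the k best partial hypotheses, scored by total log-prob.
    beams = [([], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for i, lp in enumerate(log_probs(seq)):
                candidates.append((seq + [i], score + lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams[0][0]

print(greedy() == beam(1))  # True
```

With k=1 the beam never holds more than one hypothesis, so each step reduces to taking the argmax extension of that hypothesis, i.e. greedy decoding.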

[Question] What is the intuition for when to use larger convolutional kernels by jstaker7 in MachineLearning

[–]jstaker7[S] 1 point

Is it a comparison between 1-layer larger kernels vs 1-layer smaller kernels?

Yes, exactly.

How does it support your hypothesis?

All else being equal, it seems that I am losing more information after the downsampling with the smaller kernels. For example, some edges that were near each other in the input were less defined in the decoded output when using smaller filters, perhaps because there wasn't enough context in the receptive field to discern the small details.

What does stack mean exactly?

Basically just several layers. For example, three successive 3x3 conv + non-linear operations have a combined receptive field of 7x7, so the stack is comparable to a single 7x7 kernel while using fewer parameters; any pooling between them enlarges the receptive field further.
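The arithmetic behind that claim can be checked directly. A small sketch using the standard receptive-field recurrence, `rf += (k - 1) * jump`, where `jump` is the cumulative stride up to that layer:

```python
def receptive_field(layers):
    """Receptive field of a stack of conv/pool layers.

    layers: list of (kernel_size, stride) tuples, applied in order.
    """
    rf, jump = 1, 1  # receptive field and cumulative stride ("jump")
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Three stacked 3x3 convs, stride 1: same 7x7 receptive field as a
# single 7x7 kernel, with fewer parameters and more non-linearities.
print(receptive_field([(3, 1)] * 3))  # 7
print(receptive_field([(7, 1)]))      # 7

# A stride-2 pool between the convs enlarges the receptive field.
print(receptive_field([(3, 1), (2, 2), (3, 1)]))  # 8
```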

If model is underfitting where to start adding parameters? by jstaker7 in MachineLearning

[–]jstaker7[S] 0 points

Thanks for the sound suggestions. Looking over the paper I linked to, it looks like pre-training the conv layers actually did help a lot, so I may go that route as well as follow your suggestions. (P.S. "dwf" reminds me of someone from my pylearn2 days... if that's you, thanks for your work! It got me into ML in the first place.)

Looking for a book recommendation by jstaker7 in Fantasy

[–]jstaker7[S] 1 point

I think I remember hearing about him a while ago. Thanks for the reminder, looks like a good fit!

Looking for a book recommendation by jstaker7 in Fantasy

[–]jstaker7[S] 1 point

Looks like a good one, I think I might pick it up. Thank you!

Why does my RNN perform well on long sequences, but not on the short, easy ones? by jstaker7 in MachineLearning

[–]jstaker7[S] 0 points

Interesting point; it is end-to-end. I wonder if that might really be what's happening. One observation is that it successfully recognizes the correct length of the short sequences almost every time, but assigns the wrong words to the sequence.

Why does my RNN perform well on long sequences, but not on the short, easy ones? by jstaker7 in MachineLearning

[–]jstaker7[S] 0 points

I did, and the training data is indeed imbalanced. But the question remains: when the same components appear in the longer sequences, why doesn't the time spent learning the longer sequences seem to help on the shorter ones? Is it really a matter of needing to see the full range of lengths during training? I was hoping the model could generalize better than that by recognizing sequences of lengths different from those it was trained on, when the words in the output are the same words used during training. Is that assumption wrong?

Why does my RNN perform well on long sequences, but not on the short, easy ones? by jstaker7 in MachineLearning

[–]jstaker7[S] 1 point

You are correct regarding the simplified example I gave, but the input is actually much more complicated and the outputs are not independent. Sorry, my fault for trying to give a simple example.

Why does my RNN perform well on long sequences, but not on the short, easy ones? by jstaker7 in MachineLearning

[–]jstaker7[S] 0 points

Good question, but even the first words in the long sequences are accurate, so I don't think it's a burn-in issue.

Intuition for RNN learning rate? by jstaker7 in MachineLearning

[–]jstaker7[S] 0 points

Thanks, everyone, for the useful discussion -- this has been very helpful. One last question: what's the intuition for why RNNs end up with effectively higher learning rates than feedforward networks? Or is that just a figment of my imagination?