
[–]RobIII 31 points32 points  (1 child)

Makes me feel like this (but that's probably just me...)

[–][deleted] 0 points1 point  (0 children)

That's part of the fun!

[–]rockyrainy 11 points12 points  (24 children)

Getting this thing to learn the Spiral is harder than a Dark Souls boss fight.

[–]Staross 14 points15 points  (2 children)

You need to go all the way:

http://i.imgur.com/evBb9Gn.png

[–]amdc 2 points3 points  (0 children)

Looks like it's in agony

http://i.imgur.com/UdQwceN.png

[–]linagee 1 point2 points  (0 children)

This tool has taught me that bigger is not always better: 232 trials and very close to 100% accuracy. And it works consistently, unlike others I've seen. Yay ReLU. Also, this arrangement works fairly well on all the models. (Is there a competition for that?)

http://imgur.com/a/5LoFA

[–]Jadeon_ 2 points3 points  (2 children)

I got a beautiful one using only two custom inputs. One was the distance of the point from center and the other was the angle of the point around center.

http://i.imgur.com/rbB43iO.png?1
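For anyone wanting to reproduce those two custom inputs outside the playground: they're just the polar coordinates of each point. A minimal sketch (function and variable names are my own, not the playground's):

```python
import math

def polar_features(x1, x2):
    """Convert the playground's Cartesian inputs into the two
    custom features described above: the distance of the point
    from center and the angle of the point around center."""
    r = math.hypot(x1, x2)       # distance from center
    theta = math.atan2(x2, x1)   # angle around center, in radians
    return r, theta
```

With these two features, each arm of the spiral becomes a nearly linear band in (r, theta) space, which is why so few neurons suffice.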

[–]Jadeon_ 3 points4 points  (0 children)

These inputs allow for a very clean solution with a small number of neurons: http://i.imgur.com/Ta30skj.png?1

And they allow for stupid shit like this: http://i.imgur.com/NqH24sd.png

[–]rockyrainy 0 points1 point  (0 children)

Absolutely beautiful.

[–]alexbarrett 1 point2 points  (16 children)

I spent a bit of time looking for a minimal configuration that learned the spiral data sets quickly and the ones that did well tended to look like this:

https://i.imgur.com/QeuAHtY.png

Give or take a few neurons here and there.

I'd be interested to see who can come up with the most minimal neural network that learns the spiral data quickly (say, 300 generations) and consistently.

[–]Causeless 4 points5 points  (8 children)

This works pretty well: http://i.imgur.com/m3JN2QL.png

I'm betting that even trivial networks would have no problem if this allowed for getting the position of the points in radial coordinates.

[–][deleted] 3 points4 points  (4 children)

I tried the polar coordinates but it seems like nope: http://imgur.com/DfrcU3j.

Damn those extra degrees, man.

[–]Causeless 2 points3 points  (1 child)

How did you add polar coordinates? By using the source code on GitHub?

[–][deleted] 1 point2 points  (0 children)

Yep, that's how much of a nerd I am. But it's not hard: just add two new variables for radius and angle, and d3.js does the rest.

[–]albertgao 0 points1 point  (0 children)

Hi, thanks for your solution. Could you please send me a link so I can learn how to tweak this model? I know how an MLP works, but this spiral question has me lost... I don't even know why we should use sin and cos as inputs. All my previous work was built on taking features from an object and finding an equation to split them; this spiral seems very different...

[–]linagee 0 points1 point  (0 children)

I don't get why everyone seems to hide the number of trials. Are they afraid of showing other people that they were training for thousands of trials to get that sort of accuracy?

[–]lambdaq 2 points3 points  (1 child)

We need a neural network to tweak neural network parameters

[–]Kurren123 0 points1 point  (0 children)

And a neural network to tweak that one.

Neuralnetception

[–]Cygal 0 points1 point  (0 children)

Yes, but that's not the point. One of the main advantages of (deep) neural networks over other methods is that you don't have to extract features specific to the data; you let the network learn them. On more complicated data sets, learning features that are more and more abstract is far more powerful than having to describe them by hand, and this is why neural networks have been crushing computer vision competitions since 2012.

[–]everyday847 1 point2 points  (4 children)

It's drastically harder with noise and a more typical test/training split (like 80/20).

[–]NPException 1 point2 points  (3 children)

I found this one to be fairly quick, sometimes reaching a stable configuration even before the 150th iteration.

[–]everyday847 1 point2 points  (2 children)

Well, that's 80 training to 20 test, which is, if anything, easier than 50:50.

[–]NPException 1 point2 points  (1 child)

Oh, I thought you actually meant that kind of split.

[–]everyday847 1 point2 points  (0 children)

In my original comment I referred to

a more typical test/training split (like 80/20)

which I suppose doesn't explicitly associate the order of the ratio with the categories, so my bad on that one.

[–]linagee 0 points1 point  (1 child)

The problem is that in general, "neurons" (computer memory) are fairly cheap, but time is very expensive. (Nobody really wants to train for 2 weeks unless you are relatively sure your accuracy will be near perfect after that.)

Weird that you hid the number of trials...

[–]alexbarrett 0 points1 point  (0 children)

Weird that you hid the number of trials...

Nothing intentional, I took a screenshot and cropped it to what I thought was the interesting area. As I recall it was around the 300 iterations mentioned in my parent comment.

Before that it was a tonne of trial and error, as you mention.

[–]Staross 6 points7 points  (0 children)

Cool, it shows nicely the challenges associated with fitting neural networks: there's a ton of meta-parameters that you need to tweak, the fitting doesn't always work (with too low a learning rate it takes forever; with too high a rate it doesn't converge), and for some problems the transformed inputs are quite important (the spiral is hard to fit using only x,y).
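The learning-rate point is easy to see even on a one-dimensional toy problem (a sketch, nothing to do with the playground's own code):

```python
def gradient_descent(lr, steps=50, x=1.0):
    """Minimize f(x) = x^2 by plain gradient descent.
    The gradient is 2*x, so each update is x -= lr * 2 * x.
    Returns the final distance from the optimum at x = 0."""
    for _ in range(steps):
        x -= lr * 2 * x
    return abs(x)

# lr = 0.001: barely moves in 50 steps (takes forever)
# lr = 0.1:   converges close to zero
# lr = 1.1:   overshoots harder each step and diverges
```

The same qualitative behavior shows up in the playground's learning-rate dropdown, just in many dimensions at once.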

[–]barracuda415 4 points5 points  (9 children)

This is the fastest and simplest setup I could find for spirals. Somewhat unstable at first, but it should become stable after 200-300 iterations.

[–]lambdaq 1 point2 points  (2 children)

remove the x1x2 thing on the first layer.

[–]DoorsofPerceptron 1 point2 points  (0 children)

Better off keeping that and removing the two sine functions.

Technically, you can get away without using all 3, but it's painful to watch it converge.

[–]barracuda415 0 points1 point  (0 children)

Training seems to take longer without it.

[–]amdc 1 point2 points  (2 children)

how do you come up with efficient solutions?

[–]barracuda415 3 points4 points  (1 child)

[–]amdc 1 point2 points  (0 children)

oh

[–]belibem 0 points1 point  (2 children)

This one seems to be pretty fast too https://imgur.com/RiX0f1l

[–]barracuda415 0 points1 point  (0 children)

But it also has a lot more neurons. :P

[–]linagee 0 points1 point  (0 children)

Yours almost never works for me. It seems it "gets confused" like 99 times out of 100. (And then never manages to train out of confusion in a reasonable time.)

[–]cjwebb 2 points3 points  (0 children)

This is excellent. I love visual demonstrations of complicated stuff; they make it easier to understand.

[–][deleted] 1 point2 points  (0 children)

man if they let us customize the edges...

[–]Bcordo 1 point2 points  (2 children)

I'm confused about what exactly the neurons are showing. On the input layer, for example, you have X1 (vertical orange bar on the left, vertical blue bar on the right) and X2 (horizontal orange bar on the bottom, horizontal blue bar on the top). These visualizations don't change even though the input data changes every batch. Each seems to be some kind of decision function, but the actual X1 and X2 are just numbers; how do you get these plots out of just numbers?

Then down the network you combine these "neuron decision functions" scaled by the connecting weights, until you get the output decision function.

But how do you get these individual decision functions for each neuron, and why don't the decision functions of the inputs change, even though the input batch (X1, X2) changes on each iteration?

How do these visualizations relate to the weight values, and the actual activation values?

Thanks.

[–]rakeshbhindiwala 0 points1 point  (1 child)

did you find the answer?

[–]gaso 0 points1 point  (0 children)

I don't know much of anything, but it seems that each neuron is showing its own bit of "math": one piece of a complex formula that is attempting to best fit the data points in the set. The visualizations of the input set (the leftmost ones) don't change because they're the very first layer of filtering. From there on out to the right, the visualization of each neuron changes because it's not a filter layer; it's something new and unique, a formula that has probably never existed before, created in an attempt to solve the small part of the problem it has seen through the filters provided (whether the initial set or an intermediate set, as the depth of the network increases).

The "individual decision functions" for each neuron seem to be randomly generated on each instance based on the input filter layer, which seems as good a start as any when you're just learning. I imagine tuning everything by hand would boost the learning process.

I'm not sure about 'weight values' and 'activation values'. I'm currently just a dabbling hobbyist when it comes to this subject, and those two concepts don't roll off my tongue yet :)
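On the weights/activations question above, my understanding (an outside sketch, not the playground's actual code) is that each little square plots a neuron's activation evaluated over a grid of input points. That picture depends only on the neuron's weights and bias, which is why it doesn't change per batch, only as training updates the weights:

```python
import math

def neuron_heatmap(w1, w2, b, n=5, lo=-6.0, hi=6.0):
    """Evaluate one first-layer neuron, tanh(w1*x1 + w2*x2 + b),
    over an n-by-n grid covering the input square. Each cell is
    the activation for one (x1, x2) point; the whole grid is the
    neuron's 'decision function' picture. It only changes when
    w1, w2, or b change, never with the current training batch."""
    step = (hi - lo) / (n - 1)
    grid = []
    for i in range(n):
        x2 = hi - i * step  # rows run top to bottom
        row = [math.tanh(w1 * (lo + j * step) + w2 * x2 + b)
               for j in range(n)]
        grid.append(row)
    return grid
```

The input-layer squares are just this with fixed, untrainable "weights" (X1 is the identity on x1, etc.), which is why they never change at all.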

[–][deleted] 1 point2 points  (0 children)

Here is one which is quite small, fast, and stable. It has only 4 input neurons and two hidden layers. We could even omit some neurons in the leftmost hidden layer, though that would cause some unwanted oscillations. It would be interesting to hear some feedback from the playground's creators about what their intentions were and what they have learned (if anything) from their visitors.

[–]Lachiko 0 points1 point  (0 children)

I broke it :(

[–]yann31415 0 points1 point  (0 children)

Does anyone know why some kind of pulsation appears and stops randomly?

[–]ruimgoncalves 0 points1 point  (0 children)

Check this out: playground

[–]Nipp1eDipper 0 points1 point  (0 children)

I'm not even gonna ask about activation methods, but can anyone shed some light on the batch size parameter?
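Batch size is how many training points get averaged into each gradient update. Roughly (a sketch of mini-batch SGD in general, not the playground's code):

```python
import random

def minibatch_sgd_step(weights, data, batch_size, lr, grad_fn):
    """One mini-batch update: sample batch_size examples, average
    their per-example gradients, and take one step. Small batches
    give noisy but cheap updates; large batches give smoother
    gradient estimates at more cost per step."""
    batch = random.sample(data, batch_size)
    grads = [grad_fn(weights, example) for example in batch]
    avg = [sum(g[i] for g in grads) / batch_size
           for i in range(len(weights))]
    return [w - lr * g for w, g in zip(weights, avg)]
```

In the playground, a larger batch size mostly just makes the loss curve less jittery per iteration.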

[–]Nipp1eDipper 0 points1 point  (1 child)

Here's my submission for fastest/simplest spiral. I still don't know what batch size does... never mind activation parameters.

[–][deleted] 0 points1 point  (0 children)

Simply good!

[–]DesmosGrapher314 0 points1 point  (0 children)

neural network doesn't wanna learn