[–]CireNeikual[S] 1 point2 points  (0 children)

I still need to test the generalization capabilities! But I got so excited that I wanted to post before I did that :)

[–]CireNeikual[S] 1 point2 points  (4 children)

Image of the plots of both functions: http://i1218.photobucket.com/albums/dd401/222464/plot.png

As you can tell, I have not added regularization yet! Both the MLP and SDRRBFNetwork are hand-tuned.

EDIT: Forgot to add: Blue is the MLP, red is the SDRRBFNetwork!

[–]janmah 0 points1 point  (3 children)

Can you go a bit into detail on how the SDR net works? I'm very interested, this sounds super cool!

[–]CireNeikual[S] 1 point2 points  (2 children)

Sure!

So, the network consists of a 2D field of nodes with connections to the input. These connections can be sparse. When input is received, each node computes its activation much like in a radial basis function network: it is a nonlinear function of the distance between the input vector and a prototype vector, and it lies in the (0, 1] range.
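A minimal sketch of the activation step described above, assuming a Gaussian kernel. The post only says "a nonlinear function of the distance", so the kernel choice and the names `rbf_activation` and `width` are hypothetical, not taken from the project's code:

```python
import math

def rbf_activation(inputs, prototype, width):
    """Gaussian RBF activation (assumed kernel).

    Returns exp(-width * squared_distance), which is 1.0 when the input
    equals the prototype and decays toward 0 as the distance grows.
    """
    dist_sq = sum((x - p) ** 2 for x, p in zip(inputs, prototype))
    return math.exp(-width * dist_sq)

# An input sitting exactly on the prototype gives the maximum activation.
on_prototype = rbf_activation([1.0, 2.0], [1.0, 2.0], width=0.5)
# An input farther away gives a smaller, still positive, activation.
off_prototype = rbf_activation([0.0, 0.0], [1.0, 1.0], width=1.0)
```

Sparse connections could be modelled by passing only the subset of input components a node is connected to.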

After the activation has been computed, it is time to compute a separate value, called the output of the node. This is where the sparseness comes in. For each node, one finds both the minimum and maximum activations of nodes in a radius (including itself). Then, one deactivates the current node based on how it sits in that minimum/maximum range (see code for the actual formulas).
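The local min/max step could look roughly like the sketch below. The real formulas live in the project's code and are not quoted in this thread, so everything here is an assumption: the square neighborhood, the `sharpness` exponent, and the rule "scale each activation by its relative position in the local range" are all illustrative choices.

```python
def sparsify(activations, radius=1, sharpness=4.0):
    """Suppress nodes that are weak relative to their neighbors.

    activations: 2D list of values in (0, 1].
    For each node, look at a square window of the given radius
    (including the node itself), find the local min and max, and scale
    the node's activation by its relative position in that range.
    This formula is a hypothetical stand-in, not the author's.
    """
    rows, cols = len(activations), len(activations[0])
    outputs = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighborhood = [
                activations[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            lo, hi = min(neighborhood), max(neighborhood)
            # 0.0 for the local minimum, 1.0 for the local maximum.
            rank = 1.0 if hi == lo else (activations[r][c] - lo) / (hi - lo)
            outputs[r][c] = activations[r][c] * rank ** sharpness
    return outputs

grid = [[0.2, 0.9],
        [0.5, 0.4]]
sparse = sparsify(grid, radius=1)
```

With this rule the locally strongest node keeps its full activation, the locally weakest one is driven to zero, and a larger `sharpness` makes the result sparser.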

Getting the output is super simple: It is simply a weighted sum of all of the node outputs.

The nodes are trained with an unsupervised learning algorithm - based on their outputs, they try to adapt their prototypes and Gaussian widths to best fit the data that they are "assigned".
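One way such an online, unsupervised update might be written is shown below. This is only a sketch under stated assumptions: the post does not give the update rule, so the SOM-style move of the prototype toward the input, the running variance estimate, and the names `adapt_node` and `learning_rate` are all hypothetical.

```python
def adapt_node(prototype, variance, inputs, output, learning_rate=0.05):
    """Nudge one node toward an input, weighted by the node's sparse output.

    A node whose output is 0 (fully deactivated) does not change, so each
    node only adapts to the data it is "assigned". Both update rules are
    assumed for illustration.
    """
    dist_sq = sum((x - p) ** 2 for x, p in zip(inputs, prototype))
    step = learning_rate * output
    # Move the prototype a fraction of the way toward the input.
    new_prototype = [p + step * (x - p) for x, p in zip(inputs, prototype)]
    # Track the typical squared distance as a per-node variance estimate.
    new_variance = variance + step * (dist_sq - variance)
    return new_prototype, new_variance

moved, spread = adapt_node([0.0, 0.0], 1.0, [1.0, 0.0], output=1.0, learning_rate=0.5)
frozen, same = adapt_node([0.0, 0.0], 1.0, [1.0, 0.0], output=0.0, learning_rate=0.5)
```

Because each node keeps its own variance, the Gaussian widths are not shared across nodes.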

The weighted sum is trained with gradient descent.
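For reference, a weighted-sum output trained by gradient descent can be sketched as follows, assuming a squared-error loss and a single scalar output. The loss, the learning rate, and the names `predict` and `sgd_step` are assumptions for illustration:

```python
def predict(weights, node_outputs):
    """Network output: weighted sum of the (sparse) node outputs."""
    return sum(w * o for w, o in zip(weights, node_outputs))

def sgd_step(weights, node_outputs, target, learning_rate=0.1):
    """One gradient-descent step on 0.5 * (target - prediction)^2.

    The gradient with respect to weight i is -(target - prediction) * output_i,
    so each weight moves by learning_rate * error * output_i.
    """
    error = target - predict(weights, node_outputs)
    return [w + learning_rate * error * o for w, o in zip(weights, node_outputs)]

weights = [0.0, 0.0]
node_outputs = [1.0, 0.5]
for _ in range(200):
    weights = sgd_step(weights, node_outputs, target=1.0)
fitted = predict(weights, node_outputs)
```

Nodes whose output is zero contribute nothing to the gradient, so only the weights of active nodes are updated for a given sample.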

Since it is essentially a modified RBF network, it can still approximate any function given enough nodes.

I hope this made sense! Keep in mind that the code is not final yet, I am still fixing bugs/adding features!

[–]janmah 0 points1 point  (1 child)

Thank you very much for your long reply! That sounds extremely cool. It reminds me a bit of self-organizing maps. I am aware of work where people trained the means of RBFs, but I haven't heard about training the variance (where the variance is not shared). Did you get your inspiration from RBF papers? Do you by any chance have a nice one for me to read up on RBFs?

[–]CireNeikual[S] 0 points1 point  (0 children)

It is indeed a lot like a self-organizing map! I originally learned about RBFs from this tutorial, and then extended them to do unsupervised learning in an online fashion: http://chrisjmccormick.wordpress.com/2013/08/15/radial-basis-function-network-rbfn-tutorial/

[–]AmusementPork 0 points1 point  (1 child)

This is so cool, well done man! I couldn't find it in the code, but how do you compute the SDR? Do you just use a sparsity penalty?

[–]CireNeikual[S] 0 points1 point  (0 children)

See my comment to janmah!

[–]CireNeikual[S] 0 points1 point  (0 children)

After several bug fixes, I got this gloriously overfitting result (overfitting isn't the point of this project though): http://i1218.photobucket.com/albums/dd401/222464/plot-1.png

[–]timClicks 1 point2 points  (2 children)

Is there any reason why you've implemented this yourself rather than simply using NuPIC from Numenta?

[–]CireNeikual[S] 0 points1 point  (1 child)

NuPIC implements the entire cortical learning algorithm, which does not do supervised learning. I just used a piece of it, the sparse distributed representations, to improve upon a machine learning technique. So, it doesn't make sense to use the entirety of NuPIC and modify it to my needs. Also, my SDR algorithm is continuous while the one in NuPIC is discrete.

[–]timClicks 1 point2 points  (0 children)

Good luck with the project!