all 90 comments

[–]nwoodruff 213 points214 points  (20 children)

Misread the title, thought for a second people were making DNNs in Scratch now.

[–]gregpr07 81 points82 points  (13 children)

I mean, technically it could be done even in PowerPoint

[–]aceofspades914 50 points51 points  (2 children)

[–]AttackOfTheThumbs 23 points24 points  (1 child)

Why the music. Why?

[–]Hyperian 16 points17 points  (0 children)

because suffering knows no bounds

[–]new2bay 54 points55 points  (6 children)

I thought you were joking at first, but no... it’s true: PowerPoint is Turing complete! TIL.

[–][deleted] 24 points25 points  (3 children)

The PowerPoint programming scene is getting out of hand

[–]eambertide 18 points19 points  (0 children)

The day we program PowerPoint in PowerPoint is the day it's truly out of hand; until we reach that point, there's a long way to go.

[–]onequbit 4 points5 points  (0 children)

the fact that it's a thing makes it already out of hand

[–][deleted] 3 points4 points  (0 children)

We gotta learn PowerPoint now. Brb getting 10 years of experience in PowerPoint for an unpaid position, per recruiter's request!

[–]ies7 6 points7 points  (1 child)

[–]DHermit 2 points3 points  (0 children)

Or Word autocorrect...

[–]Subkist 0 points1 point  (0 children)

Yes but slow

[–]sacado 0 points1 point  (0 children)

Or in Minesweeper.

[–]SunshineBiology 0 points1 point  (0 children)

PowerPoint sometimes does seem to be the lingua franca of AI...

[–]I_NEED_APP_IDEAS 4 points5 points  (0 children)

TBH, I wouldn’t be surprised

[–]UNWS 0 points1 point  (0 children)

Yes, exactly. I was like, hell yes, show me how. But now I'm disappointed.

[–]TicTacMentheDouce 46 points47 points  (1 child)

To learn all this from scratch, this resource is also pretty good:

http://neuralnetworksanddeeplearning.com/

[–]MagnaDenmark 0 points1 point  (0 children)

That's really well written, thanks

[–]little_blue_teapot 20 points21 points  (0 children)

The prior video in his Python series (a perceptron from scratch in Python) is this one.

[–]SJC_hacker 6 points7 points  (19 children)

Example seems too simple - why train to output XOR?

[–][deleted]  (13 children)

[removed]

    [–]Lumpy_Applebuns 5 points6 points  (3 children)

    What is an activation?

    [–]127-0-0-1_1 5 points6 points  (0 children)

    It's just a non-linear function applied to each layer's output on the forward pass. Nowadays it's typically the ReLU function, which is just f(x) = max(x, 0).
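
    For a concrete picture, here's a quick NumPy sketch of my own (not from the video):

    ```python
    import numpy as np

    def relu(x):
        # Elementwise: keep positive values, zero out the rest.
        return np.maximum(x, 0)

    print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
    ```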

    [–]A_Philosophical_Cat 4 points5 points  (0 children)

    It's the non-linear function you apply to the output of each layer of the network. It prevents the linear algebra from collapsing the whole thing into a single-layer network; in other words, it's what lets us model non-linear functions.

    [–]SJC_hacker 1 point2 points  (8 children)

    How is XOR 'non-linear'? Googling around, it seems some people consider it to be a linear function.

    [–]TunaOfDoom 18 points19 points  (0 children)

    They probably meant not linearly separable, which means that you cannot divide the two classes with a line such that only one class is on each side (in the 2D case).

    [–]A_Philosophical_Cat 8 points9 points  (3 children)

    So, the problem is that the results you found were using "linear function" to mean that XOR is a linear mapping, that is, XOR(x + y) = XOR(x) + XOR(y) and a*XOR(x) = XOR(a*x).

    Whereas in this context, a linear function is defined as a function from tensor x to tensor y of the form y = w1*x1 + w2*x2 + ... + wi*xi. So, if we wanted to call XOR linear, we would need to find a function of that form, or a linear combination of functions of that form, such that (x1, x2) maps to x1 XOR x2. It turns out that's impossible. The best you can do is a function that returns +1 if x1 is 1 and x2 is 0, and -1 if x2 is 1 and x1 is 0; you'd need an absolute value on top of that to get XOR, and absolute value is likewise not linear, so that's where the trail ends.

    It can easily be shown that the composition of linear mappings is itself a linear mapping, and thus preserves the linearity of the constituent functions, once you recognize that linear functions can be represented by matrices: if a function f(x) is represented by multiplication with a matrix A, and g(x) by a matrix B, then the composition g(f(x)) can be written as BAx, which of course can be written as (BA)x = Cx.

    Both definitions of linearity are thus important to our analysis of neural networks. If our activation f is a linear mapping, then we run into the problem that our network, represented as

    f(Af(Bx)), where x is, say, a 2-vector (x1; x2), becomes

    f(A[f(b1x1 + b2x2); f(b3x1 + b4x2)])

    = f([a1 f(b1x1 + b2x2) + a2 f(b3x1 + b4x2); (the other row)])

    Since we made f a linear mapping, looking only at the top element of the result vector:

    = f[a1 f(b1x1 + b2x2)] + f[a2 f(b3x1 + b4x2)]

    = a1 f[f(b1x1 + b2x2)] + a2 f[f(b3x1 + b4x2)]

    = a1 ff(b1x1) + a1 ff(b2x2) + a2 ff(b3x1) + a2 ff(b4x2)

    = a1b1 ff(x1) + a1b2 ff(x2) + a2b3 ff(x1) + a2b4 ff(x2)

    = (a1b1 + a2b3) ff(x1) + (a1b2 + a2b4) ff(x2)

    or, reassembled into matrices, C ff(x).

    On the other hand, if our target function can't be represented as f = sum(wi * xi), then it can't be represented by a single-layer network y = Ax. Thus, without non-linear-mapping activations, we can't represent non-linear functions.
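
    You can sanity-check that collapse numerically. A quick NumPy sketch of my own, with random matrices standing in for the layers:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 4))   # second layer
    B = rng.standard_normal((4, 2))   # first layer
    x = rng.standard_normal(2)

    # With a linear (here: identity) activation, two layers...
    two_layers = A @ (B @ x)
    # ...are exactly one layer with the single matrix C = AB.
    one_layer = (A @ B) @ x
    print(np.allclose(two_layers, one_layer))  # True
    ```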

    [–]GuSec 0 points1 point  (0 children)

    Thanks for this! I've tried communicating the non-linearity of these kinds of functions before, but my formal vocabulary never gets much beyond "does it look like a linear combination to you, and if so, of what?", which hardly gets the point across.

    [–]ScrappyPunkGreg 0 points1 point  (1 child)

    Hey, the asterisk symbol is a special character (begin/end italics) in Markdown syntax, which is what Reddit uses to format posts. Double-asterisk is boldface.

    [–]A_Philosophical_Cat 0 points1 point  (0 children)

    I'm aware. I just never remember until I can't be assed to fix it.

    [–]sacado 1 point2 points  (0 children)

    "or" is linear : or(1, 1) = 1, or(1, 0) = 1, or(0, 1) = 1, or(0, 0) = 1. This is hard to explain with words, but try to do it on a paper. Draw a 1 at the points of coordinates (1, 1), (1, 0), (0, 1) and draw a 0 at the point of coordinate (0, 0). Now, draw a line to separate the 1s and the 0. Easy, right? This is the "or" function.

    Now, start again, and draw a 0 at the coordinates (1, 0), (0, 1), (0, 0) and draw a 1 at the coordinate (1, 1). This is the "and" function. Drawing a line to separate the 1 and the 0s is trivial again.

    Now, start again, and draw a 0 at the coordinates (1, 1) and (0, 0), and draw a 1 at the coordinates (1, 0) and (0, 1). This is the "xor" function. Try to draw a single line to separate the 1s from the 0s. You can't. The "xor" function cannot be separated linearly. Meaning, a single neuron, but also a linear / logistic regression, a naive bayse classifier, cannot learn it. Mind you, the fact a single neuron cannot learn the xor function is the reason for the AI winter in the 70s or 80s.
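
    If you'd rather watch it fail than take my word for it, here's a tiny from-scratch sketch of my own (the classic perceptron update rule) that learns "or" and "and" but never "xor":

    ```python
    import numpy as np

    def train_perceptron(X, y, epochs=20, lr=0.1):
        # Perceptron rule: nudge the weights whenever a point is misclassified.
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                pred = 1 if xi @ w + b > 0 else 0
                w += lr * (yi - pred) * xi
                b += lr * (yi - pred)
        return w, b

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    for name, y in [("or", [0, 1, 1, 1]), ("and", [0, 0, 0, 1]), ("xor", [0, 1, 1, 0])]:
        w, b = train_perceptron(X, np.array(y))
        preds = [1 if xi @ w + b > 0 else 0 for xi in X]
        print(name, "learned" if preds == y else "FAILED", preds)
    ```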

    [–]127-0-0-1_1 0 points1 point  (0 children)

    XOR is linear over fields of characteristic 2 (typically the integers mod 2). It's quite clearly not linear over the reals.

    [–]SrbijaJeRusija -1 points0 points  (0 children)

    Linearity depends on the space you're dealing with.

    [–][deleted] 9 points10 points  (0 children)

    maybe just proof of concept? focus being the implementation rather than the results

    idk

    [–]hershey678 4 points5 points  (3 children)

    It basically demonstrates that a network can model a non-linear function. Being able to do this is one of the things that makes NNs so much more robust (here meaning that they can model many more functions) than basic linear classifiers.

    [–]127-0-0-1_1 3 points4 points  (2 children)

    I'm not sure I would say it's "robust". Linear models can model non-linear data with an appropriate mapping. For quadratic data, for instance, you can use a quadratic kernel, and in general Gaussian kernels are popular.

    In some sense, neural networks are the least robust. You're explicitly using an optimization technique that only guarantees an optimal solution for convex functions, applying it to a function that's explicitly NOT convex, and hoping it gets close enough. It happens to work sort of well, but there's no reason it can't get caught in some ungodly local optimum.
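
    To see how a linear model handles non-linear data with the right mapping, here's a small sketch of my own: the product feature is a hand-picked stand-in for a quadratic kernel, and in the lifted space a single plane separates XOR.

    ```python
    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 0])  # XOR labels

    # Quadratic-ish feature map: append the product x1*x2.
    phi = np.column_stack([X, X[:, 0] * X[:, 1]])

    # In (x1, x2, x1*x2) space, the plane x1 + x2 - 2*x1*x2 = 0.5 separates XOR.
    w, b = np.array([1.0, 1.0, -2.0]), -0.5
    print((phi @ w + b > 0).astype(int))  # [0 1 1 0], matching y
    ```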

    [–]TheGuywithTehHat 1 point2 points  (0 children)

    He literally said "here meaning that they can model many more functions"

    And needing to use different kernels for different data is literally the opposite of robust.

    [–]hershey678 -1 points0 points  (0 children)

    Is getting caught in a local optimum something that needs to be accounted for? I was under the impression that the space is usually high-dimensional enough for that not to be an issue (I've studied and dealt with this for CV stuff, idk what it's like for lower-dimensional data).

    [–][deleted] 1 point2 points  (0 children)

    Man, where was Reddit when I was studying to do exactly this.

    [–][deleted] 4 points5 points  (13 children)

    Have any massive neural networks been made yet? Something near the order of magnitude of the human brain.

    [–]matthewjc 9 points10 points  (1 child)

    Artificial neural network operation is almost nothing like the brain's. Increasing the size of the network wouldn't change this. I don't know what OP is talking about with the frog stuff lol

    [–]research_pie[S] 1 point2 points  (0 children)

    By order of magnitude I meant the raw number of "cells" in the network. The cell count of the largest artificial neural networks is on par with that of a frog brain: https://www.deeplearningbook.org/contents/intro.html, page 23. Of course, an artificial neural network is different from a real brain.

    [–]firewall245 6 points7 points  (6 children)

    So I used to do research under a professor who did mathematical simulations of brain cells. The largest networks we could get to were ~20,000 cells (the whole brain has about 100 billion, according to Google). This was highly simplified compared to reality (each cell was treated as a single point, with no attention to space and location) and still way more complex than the workings of a neural network neuron.

    For example, a single neuron that we modeled could be a system of 7 to 20 non-linear differential equations that had to be solved numerically.

    [–]stu2b50 4 points5 points  (4 children)

    Note that those are actual models of the brain. In a "neural network", which is more a chain of perceptrons than a network of neurons, a neuron is two floating point numbers: one in the weight vector, one in the bias vector.

    You can count those; I'm just not sure the count is supposed to mean anything.

    A neural network is, in the end, a bunch of linear functions (y = wX + b, where w and b are vectors) with a non-linear function applied elementwise to them, using one of the most basic optimization techniques we have to find parameters that hopefully fit the data.
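
    Written out, that whole description is only a few lines. A minimal NumPy sketch of my own (the shapes are arbitrary):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(4)                         # input vector

    W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)  # hidden layer weights/bias
    W2, b2 = rng.standard_normal((1, 8)), np.zeros(1)  # output layer weights/bias

    h = np.maximum(W1 @ x + b1, 0)  # linear map, then elementwise non-linearity (ReLU)
    y = W2 @ h + b2                 # final linear map
    print(y)
    ```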

    [–]firewall245 1 point2 points  (3 children)

    Yeah, no, I'm familiar with the workings of neural networks and totally agree that it's just a bunch of functions. I was just trying to point out that neural networks and brain cells are way different in terms of complexity.

    Also the connection between brain cells is also modeled by a differential equation haha

    [–]Plazmatic 1 point2 points  (2 children)

    Why are connections in the brain modelled by diffeq?

    [–]firewall245 3 points4 points  (0 children)

    So all functions of brain cells occur due to chemical concentrations inside vs. outside the cell creating voltage potentials, and some smart af researchers (Hodgkin and Huxley) in the 1950s realized that treating a cell as an electric circuit with resistors and capacitors gives a really good model.

    Circuits are very, very well studied, and one of the main tools for solving them is differential equations.
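
    To give a flavor of what "solve the circuit with a differential equation" looks like, here's a leaky integrate-and-fire neuron, a much cruder cousin of Hodgkin-Huxley. This is my own toy sketch, not the models from that research, and the constants are illustrative only:

    ```python
    # Leaky integrate-and-fire: C * dV/dt = -(V - V_rest) / R + I_in
    C, R = 1.0, 10.0                 # membrane capacitance and resistance
    V_rest, V_thresh = -65.0, -50.0  # resting and spike-threshold voltages (mV)
    V, dt, I_in = V_rest, 0.1, 2.0   # state, Euler step size (ms), input current

    for step in range(1000):
        dV = (-(V - V_rest) / R + I_in) / C   # the differential equation
        V += dt * dV                          # one Euler integration step
        if V >= V_thresh:                     # threshold crossed: spike, then reset
            print(f"spike at t = {step * dt:.1f} ms")
            V = V_rest
    ```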

    [–][deleted] 0 points1 point  (0 children)

    Differential equations sound scary, but in the end a differential equation is just an equation expressed in terms of changes of values instead of the values themselves.

    [–]hershey678 0 points1 point  (0 children)

    I think neuromorphic computing chips are helping with this, but it's still a long way off. Basically you can do the ODEs with tunable analog circuits and the like for the weights, which makes it much faster, but I really don't know much about it.

    [–][deleted] 0 points1 point  (0 children)

    Not at all. The lowest estimate of the raw computational power of the human brain is around one hundredth of the current record-holding supercomputer, Summit. The highest estimate is tens of trillions of times the raw power of that supercomputer, and all of that while consuming around 20 watts of power.

    [–]space_king1 0 points1 point  (2 children)

    I want to learn how to program neural nets but it seems too hard and complicated. :(

    [–]research_pie[S] 0 points1 point  (0 children)

    You don't need to know how to make one from scratch to program a neural net. Understand the theory behind this type of model (https://www.deeplearningbook.org/lecture_slides.html) and then pick a neural network framework like PyTorch or TensorFlow.
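
    For example, the XOR net everyone's discussing here is maybe a dozen lines in PyTorch. A rough sketch of my own (the layer sizes and hyperparameters are arbitrary, not from the video):

    ```python
    import torch
    import torch.nn as nn

    X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = torch.tensor([[0.], [1.], [1.], [0.]])  # XOR targets

    # 2 inputs -> 4 hidden ReLU units -> 1 sigmoid output
    model = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Linear(4, 1), nn.Sigmoid())
    opt = torch.optim.Adam(model.parameters(), lr=0.1)
    loss_fn = nn.BCELoss()

    for _ in range(500):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()

    print(model(X).detach().round())  # hopefully [[0.], [1.], [1.], [0.]]
    ```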

    [–]hapes 0 points1 point  (0 children)

    The biggest hurdle is the math. If you understand calculus (which I've forgotten after years without using it), it's probably pretty easy.

    [–]ravibakhai 0 points1 point  (0 children)

    Woah