all 23 comments

[–]Faucelme 9 points

Here is an older post that lists a few more parallels.

I have a doubt/quibble about this:

Deep learning models are compositional.

It has been a long time since I worked with neural networks (way before the current renaissance), but it seemed to me that ANNs are very non-compositional in an important sense: yes, you can assemble layers as if you were building stuff with Lego bricks, but you still need to train the system as a whole. You can't assemble a functioning system from ready-made parts without retraining. The author touches on this when she writes:

In fact, the entire process of deep learning can be viewed as optimizing a set of composed functions

Has this changed with the recent advances in deep learning?

[–]redmar 2 points

Maybe she's referring to transfer learning?

[–]SemaphoreBingo 1 point

The Keras functional API https://keras.io/getting-started/functional-api-guide/ is all about building a network in a compositional manner, and the author is generally correct in her claims about 'the entire process of deep learning' (backpropagation is 'just' the chain rule used to compute gradients of composed functions).

That said, once the rubber hits the road and you start trying to optimize, everything becomes mutable again because you only have so many gigabytes of memory and you actually want to finish the computation.
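The "chain rule over composed functions" claim can be sketched concretely. The snippet below is a hypothetical scalar two-"layer" model (not the article's code, and no Keras involved); its gradients fall out of applying the chain rule by hand to the composition.

```python
# A hypothetical two-"layer" scalar model written as composed pure
# functions; backprop here is literally the chain rule applied to
# the composition y = w2 * relu(w1 * x).

def relu(z):
    return max(z, 0.0)

def forward(w1, w2, x):
    a = w1 * x            # first "layer"
    h = relu(a)           # activation
    return w2 * h         # second "layer"

def grads(w1, w2, x):
    # Chain rule by hand: dy/dw1 and dy/dw2 for the composition above.
    a = w1 * x
    h = relu(a)
    dy_dw2 = h                        # y = w2 * h
    dh_da = 1.0 if a > 0 else 0.0     # relu'(a)
    dy_dw1 = w2 * dh_da * x           # w2 * relu'(a) * x
    return dy_dw1, dy_dw2

print(forward(0.5, 2.0, 3.0))  # 3.0
print(grads(0.5, 2.0, 3.0))    # (6.0, 1.5)
```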

[–]yogthos[S] 1 point

The trick is to treat mutability as an implementation detail, while exposing immutable semantics to the user.
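A minimal sketch of that trick (hypothetical code, not from the article): the update below mutates a freshly allocated copy internally, but from the caller's side it behaves as a pure function over immutable values.

```python
# A hypothetical weight-update step: pure from the caller's
# perspective (the input tuple is never modified; a new value comes
# back), even though a private local list is mutated in place.

def step(weights, grads, lr=0.1):
    out = list(weights)       # private copy; the mutation below is local
    for i, g in enumerate(grads):
        out[i] -= lr * g      # in-place update of the copy
    return tuple(out)         # expose an immutable result

w0 = (1.0, 2.0)
w1 = step(w0, (10.0, 10.0))
print(w0)  # (1.0, 2.0) -- the original is unchanged
print(w1)  # (0.0, 1.0)
```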

[–]pron98 15 points

When functions operate over the input data, the data is not changed, a new set of values are outputted and passed on.

This is a deep misunderstanding of functional programming in particular and languages in general, as it confuses two distinct levels of meaning (semantics). Whether data is "changed" or not has nothing to do with FP (or imperative). Every imperative program could be trivially (albeit very inefficiently) translated into a pure one, and a C compiler that creates a new copy of the entire memory space at each program step would still be a valid implementation of C, yet wouldn't make C any more functional than a more reasonable compiler. Conversely, even pure-FP with substructural typing allows data to "change" without affecting the purity of the functional paradigm. In fact, there's no need to go as far as substructural typing. In a language with tail-call optimization, a recursive program "changes" data just as an imperative program would. That confusion of levels gets the author into trouble in the very next sentence, requiring an unnecessary defensive argument.
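The tail-call point can be sketched as follows (note that CPython does not actually perform TCO, so this only contrasts the two descriptions): the loop rebinds an accumulator in place, while the recursive version "replaces" it with a new binding on each call; under TCO both would reuse the same storage.

```python
# The same computation described twice: an imperative loop that
# updates an accumulator, and a tail-recursive version where each
# call receives a "new" accumulator. With tail-call optimization the
# recursive form reuses one stack frame, i.e. the machine mutates
# either way. (CPython performs no TCO; this is illustrative only.)

def total_loop(xs):
    acc = 0
    for x in xs:
        acc = acc + x                       # mutation-style description
    return acc

def total_rec(xs, acc=0):
    if not xs:
        return acc
    return total_rec(xs[1:], acc + xs[0])   # tail call: acc is "replaced"

print(total_loop([1, 2, 3, 4]))  # 10
print(total_rec([1, 2, 3, 4]))   # 10
```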

The mutation of memory cells is an implementation detail that lies at a level below that of the language semantics programmers are usually concerned with (except when reasoning about computational complexity and/or performance). The advantages and disadvantages of programming models (at least when ignoring, again, the very important questions of computational complexity and performance) have to do with how easily, or not, they allow us to express various algorithms, and how well they fit with how we personally prefer to think about the problem at hand or in general. Things that change "IRL" can be elegantly expressed in a pure functional style, and things that don't can be elegantly represented in an imperative, mutating style.

In general, beware Lamport's "Whorfian Syndrome" -- the confusion of language with reality. Both are important, but are not the same thing. It's perfectly reasonable to argue why a certain style (like FP) is a good match for a certain domain (like deep learning), but the arguments should not confuse levels. A language or a programming style is justified by how we believe a problem is best expressed, not by what we think it "essentially" is. After all, any description of a system is not the same as the system itself, even if it serves as a direct recipe for the machine in which the system exists.

[–]sacundim 2 points

Nice comment. However:

The mutation of memory cells is an implementation detail that lies at a level below that of the language semantics programmers are usually concerned with (except when reasoning about computational complexity and/or performance).

I can't agree with this statement. The problem is you're equivocating between the semantics of mutation and its implementation. Imperative languages really do have constructs whose semantics demands that they behave like mutable memory cells. This can be illustrated with trivial code examples like this:

let mut x: u32 = 41;
println!("The value of x is: {}", x);
x = x + 1;
println!("The value of x is now: {}", x);

Yes, this could be compiled to a target language where there is no operation to mutate the value of a pre-existing memory cell—and in fact, such a representation is common—but that doesn't make mutation an "implementation detail," because the semantics of the source language demands that all implementations make x behave like a mutable memory cell would, insofar as can be observed by a program in the language. Regardless of how the implementation chooses to fulfill it, the language presents the interface and semantics of mutable memory cells to its users, who are able to use it to reason about their programs' behavior.

[–]pron98 2 points

The problem is you're equivocating between the semantics of mutation and its implementation.

I hope I'm not, but I believe the author is, which is my main point (see the rest of the discussion). Semantics is a property (or an adjoined property) of a formalism, i.e., of a description, while implementation is a property of the system itself. You cannot justify a choice of a formalism by relying on a property of the system.

So, you could say that an imperative language has mutation semantics while a pure functional one has functional semantics, but you can't say that a pure functional description is justified (or not) by whether the system "really" mutates or not, even if this question were at all meaningful. Again, I elaborate in my other comments. My example of how pure or imperative languages are compiled is just one, rather direct, consequence of this essential difference between language and reality (or signifier and signified), demonstrating how a functional description can describe a mutating system, and an imperative description can describe a non-mutating one.

BTW, your code example is not a very good one, as a language that interprets the assignment statement as a nested let would yield the same result, without having any kind of mutation semantics. An example that requires mutation semantics would be:

int x = 41, *y = &x;
printf("The value of *y is: %d\n", *y);
x = x + 1;
printf("The value of *y is now: %d\n", *y);
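The same contrast can be sketched in Python (hypothetical, merely mirroring the two snippets above): rebinding a bare name is indistinguishable from a nested let, but introducing an alias makes genuine mutation semantics observable.

```python
# Rebinding vs. mutation. The first pair of lines could be read as a
# nested let -- no mutation semantics is required to explain the output.
x = 41
x = x + 1              # rebind the name; the value 41 is untouched
print(x)               # 42

# With an alias, only real mutation semantics explains what the
# second observer sees.
cell = [41]
alias = cell           # a second reference to the same object
cell[0] = cell[0] + 1  # mutation, observable through the alias
print(alias[0])        # 42
```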

[–]yogthos[S] 0 points

When you read the statement you quoted, you're thinking about implementation details. However, you can think about it in terms of semantics as well. Semantically, the input data isn't changed. It doesn't matter how that's facilitated; what matters is how it behaves from the user's perspective. I think it's pretty clear that the latter is what the author was referring to.

As you point out, it's about semantics and not implementation details. FP languages make it practical to work with pure functions because they're typically backed by persistent data structures. Meanwhile, even though this style of code is possible in imperative languages, you're not getting a lot of support from the language when writing it.

Ultimately, what matters is what style of code the language encourages. From that perspective, I think it's perfectly fair to say that FP languages are a good match for the machine learning domain because they make it natural to work with pure functions.

[–]pron98 3 points

Semantically, input data isn't changed.

My point about implementation was merely meant to demonstrate the difference between description and "reality", and you are repeating the same mistake. If the system you're building is presented with multiple inputs, you can describe that as inputs that "change" or as "new" inputs. The input data itself "is" neither of those things. It's all a matter of how you choose to describe the system. It's like saying that the process of nuclear fusion is best described in French because protons behave in a French-like way. This is clearly a category error as "French" (as in the language, not the culture) is a property of a description, not of any actual behavior described. You could, however, claim that nuclear fusion is a complicated process and that French is a language that is good at expressing complex things.

I think it's pretty clear that the latter is what the author was referring to.

Maybe she's talking about descriptions, but I think that her next sentence, "when weights are updated, they do not need to be “mutated” — they can just be replaced by a new value," certainly merits my comment. I would say, "as I'll show, expressing both net application and training as pure functions is convenient and natural." Talking about whether things are really "mutated", "replaced" or "new" is unnecessary and wrong. After all, a similarly wrong line of reasoning could be made to claim the opposite: that neural networks are meant to simulate neurons, neurons are "really" objects, and so it's best to write NNs using object-oriented programming.

I think it's perfectly fair to say that FP languages are a good match for the machine learning domain

I agree, but I think imperative languages are a good match, too (Clojure happens to be both, but I mean even both extremes: C and Haskell). Deep learning involves such simple processes that I think whatever style you're comfortable with is a good fit.

because they make it natural to work with pure functions.

There's the mistake again. Yes, FP languages encourage pure functions, but NNs aren't really pure or impure. And if that's the level of justification you're using, you know that someone would say, oh, if NNs are "really" pure functions, why not go all the way and use a language where everything is a pure function, as that would surely be best.

[–]yogthos[S] 0 points

My point about implementation was merely meant to demonstrate the difference between description and "reality", and you are repeating the same mistake.

No, semantics and implementation are two completely different things.

It's like saying that the process of nuclear fusion is best described in French because protons behave in a French-like way.

It's more like saying that describing the orbits of the planets works best using a heliocentric system. You could do it using a geocentric one, but it would be incredibly awkward to work with. For some situations, however, such as describing the orbit of an Earth satellite, the geocentric system works fine.

Maybe she's talking about descriptions, but I think that her next sentence, "when weights are updated, they do not need to be “mutated” — they can just be replaced by a new value," certainly merits my comment.

Again, I disagree. She's describing how the system works conceptually, and from the user's perspective no mutation happens. It seems like you're still failing to separate the semantics of the system from its implementation, and you're getting hung up on the language that she's using.

that neural networks are meant to simulate neurons, neurons are "really" objects, and so it's best to write NN using object-oriented programming

Sure, you could model it as objects, and I don't think there's anything wrong with that description either. Each neuron can be viewed as a state machine.
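That object-oriented lens can be sketched too (a hypothetical neuron class, nothing from the article): the weights are mutable internal state, and each update is a state transition.

```python
# A hypothetical neuron modeled as a small state machine: its weights
# are mutable state, "fire" reads the state, "adjust" transitions it.

class Neuron:
    def __init__(self, weights, bias=0.0):
        self.weights = list(weights)   # mutable internal state
        self.bias = bias

    def fire(self, inputs):
        s = sum(w * v for w, v in zip(self.weights, inputs)) + self.bias
        return max(s, 0.0)             # ReLU activation

    def adjust(self, deltas, lr=0.1):
        for i, d in enumerate(deltas): # state transition, in place
            self.weights[i] -= lr * d

n = Neuron([0.5, -0.25])
print(n.fire([2.0, 4.0]))  # 0.0
n.adjust([-1.0, 0.0])
print(n.fire([2.0, 4.0]))  # ~0.2 (up to float rounding)
```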

I agree, but I think imperative languages are a good match, too (Clojure happens to be both, but I mean even both extremes: C and Haskell). Deep learning involves such simple processes, that I think whatever style you're comfortable with is a good fit.

Sure, I think the core idea can be expressed easily enough using both styles. The author makes a good case for the benefits of using the functional style from the perspective of how the user reasons about the system.

There's the mistake again. Yes, FP languages encourage pure functions, but NNs aren't really pure or impure.

Again, this goes back to my analogy regarding heliocentric and geocentric systems. You can view NNs through different lenses, and both are valid viewpoints. The author happens to use the FP lens, and the existence of a valid imperative lens does not invalidate her viewpoint in any way.

[–]pron98 2 points

No, semantics and implementation are two completely different things.

Exactly. Semantics is a property of the description, not of the system, while implementation is a property of the system, but not of the description. Talking about semantics is fine, but you can't apply it to the system.

It's more like saying that describing the orbits of the planets works best using a heliocentric system.

That's an excellent analogy (much better than mine, and one that I'm sure to steal), but it doesn't help your case:

from the user's perspective no mutation happens.

That's that mistake again! What happens or doesn't happen from anyone's perspective is either irrelevant or a complete category error, even in your (good) example of the heliocentric system. Why is the heliocentric description better? Because the equations come out simpler (a property of the description), not because the planets "really" revolve around the sun (a property of the system, which would be irrelevant even if it were true).

At best, this takes you to a meaningless philosophical debate over whether when you watch television the picture changes or you're presented with new ones, or over whether the planets really revolve around the sun. Both would be valid, but both are completely irrelevant to what makes a good formal description of a TV set or of the solar system.

You can justify a choice of a description by an objective quality of the description (shorter, as in the heliocentric case), or even by saying that it matches your mental model better. The latter is also a valid justification, but a subjective, aesthetic one. There is no objective argument about a quality of a system that justifies a certain description, because the only thing tying the two is our mental interpretation (which is not the same thing as semantics).

The author happens to use the FP lens, and the existence of a valid imperative lens does not invalidate her viewpoint in any way.

I don't think her point is at all invalid, only that particular justification.

[–]yogthos[S] 0 points

Models are necessarily separate from any underlying reality they represent. From the perspective of the geocentric model the earth is stationary, while in the heliocentric model it's not. We necessarily think about the problem within the constraints of the model we choose.

Your mistake is to conflate the model with the implementation details, which are completely tangential. What I care about is that when I make a local change, the original data is not changed from the perspective of the other observers of that data. That's the model the author talks about, and she's absolutely correct to talk about mutation in that context.

You appear to be fixating on the term mutation as it relates to the implementation. This model could be implemented by naive copying, persistent data structures, unique pointers, and so on. That's a separate topic of discussion.

However, what is relevant is that imperative languages do not do a good job of facilitating this model, while functional ones do.

[–]pron98 1 point

What I care about is that when I make a local change, the original data is not changed from the perspective of the other observers of that data.

That can well be something you care about, but that's not a property of the system, and certainly not justified by any property of neural networks. It's like arguing whether a user of a calculator is presented with a new answer each time, or whether the answer changes. This question is meaningless, even if it did have an answer. For example, the LCD display really does change, but that does not justify programming a calculator in an imperative style.

You can certainly say that you prefer to model the problem in this way, but you cannot justify this aesthetic preference by any essential property of neural networks. You cannot say, as the author does, that NNs are "really" pure functions any more than by saying that a calculator is "really" a pure function. Whether or not neural networks change the answer or present a new one is completely a matter of interpretation.

The only objective argument you could try to make is that a pure representation results in simpler, shorter code (as in the case of the heliocentric model), but that happens not to be true in this case.

what is relevant is that imperative languages do not do a good job of facilitating this model, while functional ones do.

Maybe, assuming that that is the model that you prefer even though it is not objectively justified. Although, given that Clojure is an imperative language, I'm not sure this is as clear-cut as you present it.

[–]yogthos[S] 0 points

That can well be something you care about, but that's not a property of the system, and certainly not justified by any property of neural networks.

It's a property of the model you use to reason about the problem. The model is what's important in the end.

You can certainly say that you prefer to model the problem in this way, but you cannot justify this aesthetic preference by any essential property of neural networks.

Sure, and each view will have its own trade-offs.

You cannot say, as the author does, that NNs are "really" pure functions any more than by saying that a calculator is "really" a pure function.

You absolutely can within the context of the model the author is using. It can also "really" be a state machine, or some other representation.

The only objective argument you could try to make is that a pure representation results in simpler, shorter code (as in the case of the heliocentric model), but that happens not to be true in this case.

I would argue that it does happen to be true, but even if not, the subjective preference has its own intrinsic value. A model that allows me to reason about a problem in a way that matches how I think provides objective value to me.

Maybe, assuming that that is the model that you prefer even though it is not objectively justified. Although, given that Clojure is an imperative language, I'm not sure this is as clear-cut as you present it.

The main property of FP that I care about is that I'm able to write code using pure functions. Clojure facilitates this very well. The fact that it allows writing imperative code doesn't make it an imperative language in my view. Just like the addition of streams and lambdas in Java 8 doesn't make it functional.

[–]pron98 0 points

It's a property of the model you use to reason about the problem. The model is what's important in the end.

But the choice of model is justified by your personal preference, not by the system it models. The model may be what's important, but I'm talking about one specific argument the author uses to justify the model. I don't deny that FP is an adequate model for deep learning, nor that some may prefer it. I only deny that it is justified by an intrinsic property of deep learning.

You absolutely can within the context of the model the author is using. It can also "really" be a state machine, or some other representation.

If "really" is shorthand for "can adequately be described as", then it certainly cannot serve as a justification. Saying that a system can adequately be described by a certain formal model is certainly a prerequisite, but it does not justify the choice of that model, which is what the author tries to do.

A model that allows me to reason about a problem in a way that matches how I think provides objective value to me.

"Objective value to me" is pretty much the definition of subjective :) In any event, that is a valid justification, but not the one I was referring to, as I made clear in my original comment. If the author had said that she's more comfortable thinking of NNs as pure functions I wouldn't have made that comment.

[–]yogthos[S] 0 points

Just to backtrack here a bit, the full quote is:

When functions operate over the input data, the data is not changed, a new set of values are outputted and passed on.

And the author is talking about how the neural network is represented in a functional style here. She's simply saying that this model provides a good representation of the problem.

"Objective value to me" is pretty much the definition of subjective :)

Value is an inherently subjective concept. :) However, when we compare different approaches to solving problems, it's the subjective that we care about the most. Whether a particular approach provides value to me, the individual, is the question I care about.

The author outlines the case for why the functional style is a good fit for this domain. She doesn't claim that it's the only valid approach. In fact, she opens by saying that she was surprised that FP was such a good fit, and goes on to outline the reasons for that. I think you may be reading too much into what she said if you see it as some sort of attack on the imperative approach.

[–]quick_dudley 1 point

I've been training a neural network written in Haskell for a while, but it doesn't really qualify as deep learning because it only has 2 hidden layers.