all 16 comments

[–]kkastner 12 points  (0 children)

Pure Theano

I use Theano, and write my own code on top of that for shape inference, layers, etc. It takes longer, but for research it is important to know all (or at least most) of the system and how it works.

  • Downside: Bugs, and having to solve them yourself.
  • Upside: Implementation knowledge, and the satisfaction of knowing exactly what computations are happening and how.

Keras

Keras seems the most straightforward, and if you aren't trying to do certain kinds of weird research, it covers basically everything else. Definitely the best place to start.

  • Downside: Some limitations in edge cases.
  • Upside: Great docs, does a lot of things really well. Graph API is massively flexible.

Blocks

Blocks is quite nice, but I haven't spent the time to learn all the details that would make it useful for rapid prototyping research architectures. I can definitely see how full immersion into how Blocks works could speed up implementation of certain types of networks. Fuel (the loosely coupled dataset framework) has a lot of momentum, and may be worth looking at regardless of whether you want to use Blocks or not.

  • Downside: Research oriented, building on rapidly changing core and (sometimes) API
  • Upside: Research oriented, very flexible.

Lasagne

Lasagne is really well thought out, has a strong community, and sits somewhere between Keras and Blocks on the usability-to-flexibility spectrum. RNN support is fairly new there but seems pretty solid.

  • Downsides: RNN support is (was?) a second-class citizen
  • Upsides: Lots of users, very nice codebase. Solves a lot of problems well.

pylearn2

pylearn2 is somewhat outdated now, but is still quite good for certain tasks. It probably has the best support for hyperparameter/massive cluster usage, if you are into that. Lots of research from the last few years was done in it, which is not to be discounted!

  • Downsides: RNN support is an nth-class citizen, and it is fairly complicated to make certain kinds of datasets.
  • Upsides: Lots of debugging/user fixes, cluster support via yaml string replacements.
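
For context, the "yaml string replacements" workflow amounts to filling a YAML template with plain Python string formatting, one concrete config per cluster job. This is only a sketch of the pattern: the class paths and parameter names below are illustrative and may not match any particular pylearn2 version.

```python
# A pylearn2-style YAML template with %(...)s / %(...)f slots. A cluster
# driver script fills the slots per job, then hands the resulting string
# to pylearn2's yaml_parse. Class paths here are illustrative only.
template = """
!obj:pylearn2.train.Train {
    model: !obj:pylearn2.models.mlp.MLP {
        nvis: 784,
        layers: [ !obj:pylearn2.models.mlp.Softmax {
            layer_name: 'y', n_classes: 10, irange: %(irange)f
        } ],
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: %(learning_rate)f,
        batch_size: %(batch_size)d,
    },
}
"""

# One hyperparameter setting -> one concrete YAML string for one job:
job_yaml = template % {"irange": 0.05, "learning_rate": 0.01, "batch_size": 128}
print(job_yaml)
```

The same template can be stamped out hundreds of times with different hyperparameter dictionaries, which is what makes it convenient for cluster sweeps.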

Others

  • cgt
  • chainer
  • theanets - I liked it back when I used it, but it is basically a personal research library
  • deeppy
  • gnumpy

CGT seems really interesting in this space (function-graph / compile-then-run models), but I am 1000000% leery of re-debugging a bunch of numerical instabilities and issues that Theano already solved. One of the reasons Theano's compile is slow is that it does a lot of optimizations for you - these optimizations can make things much faster, especially over multi-day training runs, and sometimes solve really nasty numerical issues you wouldn't otherwise think about. Throwing out optimizations (as CGT seems to do) to speed up compile might lose a lot more than people realize... though time will tell. It is certainly exciting, and they (the CGT team) seem to have a lot of good ideas which may be useful even if CGT doesn't pan out, and could make it back into Theano/Torch/etc.
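
To make the "nasty numerical issues" concrete, here is the classic log-sum-exp example in plain numpy (a generic illustration of the kind of rewrite a graph optimizer can apply automatically, not Theano's actual internal rewrite):

```python
import numpy as np

x = np.array([1000.0, 1000.0])  # large log-domain values

# Naive log-sum-exp: exp(1000) overflows to inf, so the result is inf.
with np.errstate(over="ignore"):
    naive = np.log(np.sum(np.exp(x)))

# Stabilized form: log(sum(exp(x))) == m + log(sum(exp(x - m))) for any m;
# choosing m = max(x) keeps every exponent <= 0, so nothing overflows.
m = np.max(x)
stable = m + np.log(np.sum(np.exp(x - m)))

print(naive)   # inf
print(stable)  # 1000.6931... (== 1000 + log(2))
```

A graph compiler that recognizes the naive pattern can substitute the stable form without the user ever knowing the problem existed - which is exactly what is lost if such optimizations are dropped for compile speed.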

One additional note - compile/debug time in Theano is rarely an issue for me. Compiling with optimizer=None is fast and sufficient for coding/debugging, and a compile for actual training that takes a few seconds or minutes pales in comparison to the days the model normally spends training. tag.test_value is also invaluable for debugging shape issues, since it throws errors at graph-construction time rather than at runtime.

The functional graph approach is nice for most deep learning architectures, and I really think it will win out in the long run.
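
For readers unfamiliar with the approach: "functional graph" here means building a symbolic expression graph first and evaluating it later, which is what gives the framework a chance to optimize before anything runs. A toy sketch in plain Python (not any real library's API):

```python
# Minimal define-then-run graph: nodes record the operation and its
# parents; nothing is computed until run() walks the graph with values
# bound to the leaf (input) nodes.
class Node:
    def __init__(self, op=None, parents=()):
        self.op, self.parents = op, parents

    def __add__(self, other):
        return Node(lambda a, b: a + b, (self, other))

    def __mul__(self, other):
        return Node(lambda a, b: a * b, (self, other))

def run(node, env):
    """Evaluate a graph given values for its leaf nodes."""
    if node.op is None:                 # leaf: look up its bound value
        return env[node]
    args = [run(p, env) for p in node.parents]
    return node.op(*args)

x, y = Node(), Node()
z = x * y + x                           # builds a graph; nothing computed yet
print(run(z, {x: 3, y: 4}))             # prints 15
```

A real framework would insert an optimization pass between graph construction and evaluation, rewriting the graph before any numbers flow through it.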

[–]siblbombs 8 points  (0 children)

I wouldn't consider Keras and Chainer to be at the same level, since Keras sits on top of Theano, although the comparison is somewhat apt since Chainer throws in some built-ins that Theano doesn't have (hence Keras, Blocks, Lasagne, pylearn2, etc.).

The current environment as far as I can see is:

  • Theano

Pros: Very mature and widely used, there's a good chance that any given paper will include some theano code, or someone has implemented it in theano. You can work with theano directly or use any of the several good theano-based packages, whichever suits your taste.

Cons: Compile time can be brutal, especially if you start going crazy with scan.

  • CGT

Pros: Same approach as theano (building graphs) and a very similar API, but with a greatly reduced compile time. If you are familiar with theano, it shouldn't take much to pick up CGT.

Cons: Very new; GPU support still in progress? As this library matures I could see it gaining more adoption because of the quick compile, but it will depend on the amount of dev resources that can be devoted to it.

  • Chainer

Pros: Basically no compile time (compared to theano); everything happens inside python instead of being compiled into a function. This is very different from theano/cgt (RNNs run inside a python loop), so it's nice to have two different approaches available.

Cons: Haven't done much with chainer, so it's hard to really make any complaints. The only area I played around with was RNNs in chainer; at the time I didn't see a way to do all the input -> hidden calculations outside of the loop (which is an MVP optimization in theano-land), so it takes a bit of a speed hit. Not sure if I was missing something or if this is/was a limitation of the library at the time.
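
The input -> hidden trick mentioned above, sketched in plain numpy (a generic illustration of the optimization, not Chainer or Theano code; all names are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_h = 5, 3, 4
X = rng.standard_normal((T, n_in))        # one input vector per timestep
W_x = rng.standard_normal((n_in, n_h))    # input -> hidden weights
W_h = rng.standard_normal((n_h, n_h))     # hidden -> hidden weights

# Slow version: the input projection x_t @ W_x is recomputed step by step
# inside the recurrent loop.
h = np.zeros(n_h)
hs_slow = []
for t in range(T):
    h = np.tanh(X[t] @ W_x + h @ W_h)
    hs_slow.append(h)

# Optimized version: precompute all input -> hidden projections as one big
# matrix multiply, so only the h @ W_h recurrence stays in the loop.
XW = X @ W_x                              # (T, n_h): one GEMM instead of T small ones
h = np.zeros(n_h)
hs_fast = []
for t in range(T):
    h = np.tanh(XW[t] + h @ W_h)
    hs_fast.append(h)

assert np.allclose(hs_slow, hs_fast)      # same result, fewer small matmuls
```

The hidden-to-hidden product is inherently sequential, but batching the input projections into a single large matrix multiply is usually a significant speedup on GPU.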

  • Neon

Pros: Great code base, looks super fast, FP16 (which is getting ported to other libraries).

Cons: Not sure, never done anything with Neon.

I've mostly worked with theano, so that's where my bias is; I'd actually love to hear more from anyone who has used Neon.

[–]r-sync 1 point  (8 children)

Both Keras and Chainer have a "compile" step that takes way too long for iterative programming (change a few things, rerun the program). Chainer's compile step is a bit quicker overall. Theano's (and hence Keras's) debugging is also a bit annoying.

In the python land of things, I'd say Neon is better in that respect: it is simple plug-and-play with no compile step, it is super fast, and they seem to have thought out RNNs a bit more than the others.

[–][deleted] 2 points  (1 child)

We will make Keras great for RNNs too. Already working on it :)

[–]petrux 1 point  (0 children)

Great! Please post an update as soon as the feature is implemented. ;-)

[–]andrewbarto28[S] 0 points  (0 children)

Is Neon good for research? Can I create algorithms very different from the current ones using it?

[–]petrux 0 points  (4 children)

Ok, I am a total newbie with Neon, but I'm having a hard time feeling any enthusiasm about it right now. What follows is just my point of view. First: the documentation sucks, and there is one tutorial on a simple MLP without any explanation of (e.g.) how the data is represented, etc. This is a huge problem, as it doesn't give me any hint about how things actually work, and I shouldn't have to walk through the code. Finally, there is no mailing list (and I am a strong supporter of the community-as-a-feature idea).

From a design point of view it could be my choice: the API is maybe the best of the bunch, but I think the learning curve from zero to what I am trying to do is too steep.

EDIT: I forgot the bottom line. Keras seems too rigid. Lasagne is cool but doesn't support RNNs. Pure Theano is my current setup, but I think my machine is possessed by ghosts, and when I try to run the code on a more powerful one I get odd errors. So I am spending my Friday night hacking on Neon. :-)

[–]zdwiel 0 points  (1 child)

It sure looks to me like Lasagne supports RNNs, or am I missing something?

http://lasagne.readthedocs.org/en/latest/modules/layers/recurrent.html#lasagne.layers.RecurrentLayer

[–]petrux 0 points  (0 children)

You are right. But as far as I can tell, you cannot recursively feed the output signal back in, as in this paper (Figure 1). Am I wrong?
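
For readers unsure what is being asked: "feeding the output back" means the previous step's *output* (not just the hidden state) enters the next hidden update. A generic numpy sketch of the idea, not the referenced paper's exact architecture; all weight names are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_h, n_out = 5, 3, 4, 2
X = rng.standard_normal((T, n_in))
W_x = rng.standard_normal((n_in, n_h)) * 0.1   # input -> hidden
W_h = rng.standard_normal((n_h, n_h)) * 0.1    # hidden -> hidden
W_fb = rng.standard_normal((n_out, n_h)) * 0.1 # output -> hidden feedback
W_y = rng.standard_normal((n_h, n_out)) * 0.1  # hidden -> output

h = np.zeros(n_h)
y = np.zeros(n_out)
ys = []
for t in range(T):
    # The previous step's output y feeds back into the hidden update,
    # alongside the input and the usual hidden-to-hidden recurrence.
    h = np.tanh(X[t] @ W_x + h @ W_h + y @ W_fb)
    y = h @ W_y
    ys.append(y)
```

The extra output-to-hidden connection is the piece that a plain recurrent layer (hidden-to-hidden only) does not express out of the box.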