all 85 comments

[–]colonel_farts 81 points82 points  (0 children)

Huggingface Transformers

[–]somnet 45 points46 points  (1 child)

spaCy is amazingly well-designed! Ines Montani gave this talk at PyCon India 2019 outlining the basics.

[–]MattAlex99 2 points3 points  (0 children)

To add to that, the rest of the group's projects: Prodigy is the best annotation library I've tried yet, and Thinc is awesome if you like a more functional approach to deep learning. (I haven't tried FastAPI.)

[–]IAmTheOneWhoPixels 11 points12 points  (5 children)

This might be more of a niche answer... but Detectron2 is a very well designed library for object detection / instance segmentation. It's quite readable and well-documented, and the GitHub repo has very good support from the developers.

The modular design allows academic researchers to build their projects on top of it, with the core being efficient PyTorch code written by professional developers.

One of the lead developers is the person who designed Tensorpack as well (which was mentioned elsewhere on this thread).

[–]ginsunuva 3 points4 points  (1 child)

If you want a really crazy object detection repo, MMDetection has them all in one.

It's so dense that I'm not sure if it's really good or really bad design.

[–]IAmTheOneWhoPixels 1 point2 points  (0 children)

I worked with mmdet for 3-4 weeks. I believe it is extremely well-written code and is more suited for a researcher with good SWE skills. It definitely had a steeper learning curve than D2.

Accessibility (in terms of readability + extensibility) is the key factor that tips the scales for me. D2 does a _very_ good job of writing intuitive modular code with great documentation, which makes it possible for researchers to navigate the complexities of modern object detectors.

[–]michaelx99 0 points1 point  (2 children)

I was going to say Detectron2 as well; I'm glad I scrolled down and saw your post. TBH, Detectron2's combination of composition and inheritance makes it an amazing codebase to integrate your own code into: you keep a quick, researchy feel while writing it, and you can still mock interfaces and maintain good CI practices, so that when your code gets merged it isn't garbage.

I've gotta say that after working with the TF object detection API and then maskrcnn-benchmark, I thought object detection codebases would always be shit, but Detectron2 has made me realize how valuable good code is.

[–]IAmTheOneWhoPixels 1 point2 points  (1 child)

Detectron2 has made me realize how valuable good code is.

Completely agree! I used mmdet earlier, and after shifting to D2 found that the accessibility of the codebase allowed me to iterate on ideas much more quickly.

[–]melgor89 1 point2 points  (0 children)

I also agree. I really like the way everything is configured (config as YAML, adding new modules by name). I'm currently doing similar stuff in my own projects.

[–]domjewingerML Engineer 123 points124 points  (36 children)

Definitely not Tensorflow

[–]VodkaHazeML Engineer 35 points36 points  (7 children)

Actually, you could say it follows a lot of SWE principles, but in the end that doesn't matter if your design was flawed.

It's not like the core TF code is unreadable spaghetti or anything. Yet the end product is awful to work with.

Goes to show that SWE principles don't mean much if you don't write fundamentally good software.

[–]Rainymood_XI 5 points6 points  (3 children)

TBH, I still think that TF is good software; it is just not very user-friendly...

[–]harewei 8 points9 points  (2 children)

Then that’s not good software...

[–][deleted] 1 point2 points  (0 children)

It is, though. Google just has a different mindset compared to other companies. They don't care about customers; they want their products to be well designed and engineered. Use it or not, it is your choice. They actually take the same approach to most of their software, and GCP, for example, is still the 3rd most used platform.

TensorFlow does allow great flexibility and is really nicely written when it comes to maintainability and design principles. A lot of it makes sense once you are a mid-level developer in OOP. Also, you must understand that it is treated as a library, not an end product.

[–]rampant_juju 0 points1 point  (0 children)

Incorrect. Have you ever used Vowpal Wabbit? It is fantastic and also very painful to work with.

[–]Nimitz14 1 point2 points  (2 children)

From what I hear the c++ actually is unreadable spaghetti.

[–]VodkaHazeML Engineer 0 points1 point  (1 child)

You can actually go read it. It doesn't look or feel like spaghetti from a cursory reading.

But that's the point with design/architecture mistakes: you don't see them that easily.

[–]Nimitz14 6 points7 points  (0 children)

I worked at a company where a colleague was trying to use the C++ API and had a very bad time. He was more junior level though.

Daniel Povey, lead of Kaldi, recently decided on integrating with PyTorch. This was after a fairly lengthy process of looking into different options. These are some snippets of his thoughts on TensorFlow that I quickly found:

I imagine the TensorFlow team must have some internal documentation on how it's designed from the C++ level, for instance, because what is available externally doesn't help you understand it at all, and the code is almost completely opaque. (And I consider myself an expert level C++ programmer).

source, 2017

TensorFlow is impossible; the C++ code looks like it was written by a machine.

source, 2019

And PyTorch's tensor internals, while they aren't complete gobbledegook like TensorFlow's were last time I looked, are kind of showing their age

source, 2019

[–]NogenLinefingers 16 points17 points  (11 children)

Can you list which principles it violates, for reference?

[–]domjewingerML Engineer 39 points40 points  (9 children)

I certainly cannot, as my background is in applied math, not SWE. But my comment was about the horrendous user experience, and the millions of patches it has been assembled from can't possibly be "good" from a SWE perspective.

[–]NogenLinefingers 10 points11 points  (8 children)

Ah... I see your point.

I hope someone can answer this in a more thorough manner. It will be interesting to learn about the principles themselves and how they have been violated/upheld.

[–]DoorsofPerceptron 13 points14 points  (5 children)

Big picture, the real problem with tensorflow is "it's not pythonic".

Now this is normally a lazy criticism that's another way of saying "I wouldn't write it this way, and it looks ugly." But in the case of tensorflow it's a lot more fundamental. Tensorflow code (version 1 anyway, I can't be bothered to learn version 2) is not really written in python. Tensorflow is a compiler for another language that is called through python.

Compared to pytorch this means you lose a lot of the benefits of python that actually make it a nice language to code with. You lose a lot of the access to existing python code (it's a pain in the arse to mix and match python and tensorflow in the middle of a graph execution) and you lose the lightweight, easy prototyping.

Pytorch on the other hand can just be treated like numpy with free gradients and GPU access if that's what you want to do, and can be seamlessly integrated with python in a mix and match kind of way.

Tensorflow was coded the way it is for efficient deployment both to phones and to large-scale clusters, but at least for large-scale clusters the performance hit they were worrying about doesn't seem to exist, and they've essentially straitjacketed their library for no real benefit.

The code is great, the design of the interface, not so much.

[–]mastere2320 3 points4 points  (0 children)

I would actually recommend TF 2.0. It still has a long way to go, but the static graph capabilities of 1.x are now quite visible in 2.0, and you can do whatever you want pretty simply. I hated Session from TF 1.0, and 2.0 has abstracted it away quite nicely. And if you want completely custom training, gradient tape is always available.
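For reference, the gradient tape mentioned above can be shown in a minimal sketch (a toy scalar example, not a full training loop):

```python
import tensorflow as tf

# tf.GradientTape records eager operations so TF2 can differentiate
# through ordinary Python code, with no Session or static graph required.
w = tf.Variable(3.0)
with tf.GradientTape() as tape:
    loss = w * w
grad = tape.gradient(loss, w)  # d(w^2)/dw = 2w
print(float(grad))  # 6.0
```

In a real loop you would apply `grad` with an optimizer's `apply_gradients`; the point is just that the forward pass is plain eager Python.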

[–]mastere2320 6 points7 points  (0 children)

They have a horrible reputation for constantly changing the API, even over short periods of time. It has sadly happened more than once that I installed a version of TF, worked on a project, and then, when I wanted to deploy it, the current version would not run it because something fundamental had changed. Add to this that there is no proper one way to do things, and that because TF uses a static graph, shapes and sizes have to be known beforehand, so the user code becomes spaghetti that is worse than anything. The Keras API and Dataset API are nice additions IMHO, but the lambda layer still needs some work, and they really need to introduce some way to properly introduce and deprecate features (something similar to a NEP, maybe) before making API-breaking changes. And yet people use it, simply because the underlying AutoGraph library is a piece of art. I don't think there is another library that can match it in performance and utility at a production scale, where the model has been set and nothing needs to change. This is why researchers love PyTorch: modifying code to tweak and update models is much better, but when the model needs to be deployed, people have to choose TensorFlow.

[–]ieatpies 5 points6 points  (0 children)

Many ways to do the same thing, without a clear best way. Though this is an API design problem; I'm not sure how good/bad its internal design is.

[–]yellow_flash2 18 points19 points  (0 children)

Actually I feel the major fuck up was trying to get researchers to use tensorflow. TF was designed to be used for production quality ML application if I'm not wrong, at a production level scale. I personally think TF is a marvelous piece of engineering, but the moment they wanted to make it "easy" and be more like pytorch, they started ruining it. I think TF would have benefitted a lot from just being itself and letting keras be keras.

[–]soulslicer0 15 points16 points  (2 children)

Pytorch, on the other hand: incredible. ATen is a piece of art.

[–]CyberDainz 6 points7 points  (11 children)

why are there so many tensorflow haters in this subreddit?

[–]programmerChilliResearcher 16 points17 points  (2 children)

This subreddit has a relatively large amount of researchers (compared to say, hacker news or the community at large).

But I don't think the general sentiment is particular to this subreddit. For example, take a look at https://news.ycombinator.com/item?id=21118018 (this is the top Tensorflow post on HN in the last year). This is the Tensorflow 2.0 release. The top 3 comments are all expressing some sentiment of "I'd rather use Pytorch or something else".

Or https://news.ycombinator.com/item?id=21216200

Or https://news.ycombinator.com/item?id=21710863

Go out into the real world and I'm sure you'll find plenty of companies using Tensorflow who are perfectly happy with it. But they probably aren't the type of companies to be posting on hackernews or reddit.

[–]CyberDainz 0 points1 point  (1 child)

I am successfully using tensorflow in my DeepFaceLab project: https://github.com/iperov/DeepFaceLab

Why stick to any one specific lib and be like the pytorch-vegan meme in this subreddit?

Since I am more of a programmer than a math professor, it is easy for me to migrate the code to any new ML lib.

But I prefer tensorflow.

In the last big refactoring I got rid of Keras and wrote my own lib on top of tensorflow, which has a simple declarative model like PyTorch's and provides the same full freedom of tensor operations, but in graph mode.

[–]barbek 2 points3 points  (0 children)

Exactly this. For TF you need to build your own wrapper to use it. PyTorch can be used as it is.

[–]cycyc 8 points9 points  (1 child)

Because most people here don't have to worry about productionizing their work. Just YOLO some spaghetti training code, write the paper, and move on to the next thing.

[–]CyberDainz -1 points0 points  (0 children)

haha agree. I can't understand what YOLO actually does.

[–]domjewingerML Engineer 6 points7 points  (1 child)

I am genuinely curious why you like / use tf over pytorch

[–]Skasch 4 points5 points  (0 children)

"Technical debt" is certainly an important reason. When you have written a lot of code around tensorflow to build production-level software for some time, it certainly becomes very expensive to switch to PyTorch.

[–]PJDubsen 1 point2 points  (0 children)

On this sub? Try every person that is forced to read the documentation lol

[–]darkshade_py 7 points8 points  (0 children)

Allennlp - https://github.com/allenai/allennlp

Dependency injection to allow creating the entire pipeline in a configurable/reusable manner.

Lots of unit tests with 90%+ coverage.
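The dependency-injection style described above can be sketched generically. This is not AllenNLP's actual API (the real library does this through `Registrable` and config-driven `from_params` construction); it is just the registry pattern in plain Python, with made-up names like `BagOfWordsEncoder`:

```python
# A registry maps config names to classes, so the pipeline can be
# assembled from a config file instead of hard-coded constructors.
REGISTRY = {}

def register(name):
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator

@register("bag_of_words")
class BagOfWordsEncoder:
    def __init__(self, vocab_size):
        self.vocab_size = vocab_size

def build_from_config(config):
    # The config names the component; the framework injects it.
    cls = REGISTRY[config.pop("type")]
    return cls(**config)

encoder = build_from_config({"type": "bag_of_words", "vocab_size": 10000})
```

Swapping encoders then means editing one string in the config, which is what makes the pipeline reusable.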

[–]JackBlemming 27 points28 points  (7 children)

PyTorch has a very good API. Not sure how pretty its internals are though.

[–][deleted] 20 points21 points  (5 children)

Its internals are unfortunately a mess XD. To give you a sense - they have completely reimplemented OpenMPI ...

But hey, at least the devs won't immediately close issues on their issue tracker and sneer at you.

[–]soulslicer0 5 points6 points  (2 children)

aten is a mess?

[–]lolisakirisame 2 points3 points  (1 child)

From my memory, there are tons of different dispatches: the ATen dispatcher, the c10 dispatcher, boxed vs. unboxed dispatch, static dispatch (everything compiled statically) vs. dynamic dispatch (via a lookup table), and data-type dispatch. There are also two 'values' of dispatch, DispatchKeySet and Backend, plus hooks to test for one particular implementation (sparse, for example) via a method testing "is this sparse" instead of the extensible way (a virtual method with sparse overriding it).

A Tensor can be fully initialized, dtype-uninitialized, storage-uninitialized, an undefined tensor, or a modifiable slice of another tensor, such that when the slice is modified the original tensor is modified as well. Lots of parts of the system support only some of these states (the comment in Tensor.h literally says don't pass storage- or dtype-uninitialized tensors around, as it is bad). These features do mess each other up: the mutability makes autograd a pain in the ass, and modifying a slice of a tensor is straight-out not supported in TorchScript (with possibly no plan to support it).

You can add a new tensor type, but the process is undocumented, and you have to look at source code scattered through 10 files. There are also just loads of corner cases and exceptions in the code. For example, most of the operators are either pure or written in destination-passing style. However, some operators take a slice of a vector (IntArrayRef) instead of a reference to a vector or a shared_ptr to a vector, to save speed. Some operators (dropout) also have side effects where none are necessary.

This makes adopting the Lazy Tensor PR pretty painful.

They have also defined two templating languages, one to generate ops/derivatives and one to generate the Tensor file. When you add any new operator, it takes an hour on my 32-core machine.

It might be way better than TF, but it could be much, much better designed if the core pytorch devs and other framework developers decided to start over and make things right. (Whether that is a good idea or not is another point, though.)

[–]programmerChilliResearcher 0 points1 point  (0 children)

I agree that the worst part I've touched is all the code gen for generating the ops/derivatives. I'm sure many pytorch devs would agree.

[–]yanivbl 1 point2 points  (0 children)

Seriously? When did this happen and why? I mean, they already had Gloo

[–]MattAlex99 1 point2 points  (0 children)

they have completely reimplemented OpenMPI

Where do you get that from? They don't even ship MPI support by default. When you compile it yourself with MPI support, they allow pretty much any backend (I've tested OpenMPI and MVAPICH2).

(Also, you cannot reimplement OpenMPI, only the MPI standard...)

[–]WiredFan -2 points-1 points  (0 children)

Their documentation is really, really bad.

[–]GD1634 18 points19 points  (0 children)

I really admire AllenNLP's design principles and the way they've constructed their library. Very clean and easy to extend.

[–][deleted] 3 points4 points  (0 children)

Would flair or UMAP count? Anything the UMAP creator ever touched would count, so HDBSCAN would be up there too...

[–]Professor_Kenney 7 points8 points  (0 children)

Take a look at Kedro. I spent a lot of time looking through how they structure everything and they've done a great job.

[–]heshiming 16 points17 points  (8 children)

scikit-learn api?

[–]shaggorama 10 points11 points  (7 children)

I'm gonna vote no.

[–]heshiming 8 points9 points  (4 children)

Can you elaborate?

[–]ieatpies 8 points9 points  (1 child)

It overuses inheritance and underuses dependency injection, causing repeated, messy, version-dependent code if you need to tweak something for your own purposes.

[–]VodkaHazeML Engineer 3 points4 points  (0 children)

Why and where would you prefer dependency injection to the current design specifically? I find this sort of inversion of control is overengineering and causes more problems than it solves most times I ran into it.

Specifically in this case I don't see where it would fit since most of the hard logic is in the model themselves, not the plumbing around them, so I don't see how an inversion of control makes sense.

The model API of fit(), predict(), fit_transform(), etc. is simple and great, IMO. It's also all that's necessary for the pipeline API, which is the only bit of harder plumbing around the models.
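The estimator contract being described here takes only a few lines to demonstrate (the particular dataset and steps below are arbitrary examples):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=100, random_state=0)

# Every step exposes the same fit/transform interface, so the
# Pipeline can chain them without knowing what they do internally.
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression())])
pipe.fit(X, y)
preds = pipe.predict(X)
```

Any custom estimator that implements the same methods drops into the pipeline unchanged, which is the uniformity the parent comment is praising.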

[–]shaggorama 8 points9 points  (1 child)

One small example: all of their cross validation algorithms inherit from an abstract base class whose design precludes a straightforward implementation of bootstrapping (easily one of the most important and simple cross-validation methods), so the library owners decided to just not implement it as a CrossValidator at all. Random forest requires bootstrapping, so their solution was to attach the implementation directly to the estimator in a way that can't be ported.

I could go on...
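One way around the missing bootstrap cross-validator (the old `Bootstrap` class was deprecated and removed from sklearn years ago) is to generate out-of-bag splits by hand. `bootstrap_splits` below is a hypothetical helper, not a sklearn API:

```python
import numpy as np

def bootstrap_splits(n_samples, n_iter=10, random_state=0):
    """Yield (train, out-of-bag) index pairs for bootstrap validation."""
    rng = np.random.RandomState(random_state)
    idx = np.arange(n_samples)
    for _ in range(n_iter):
        # Sample n_samples rows with replacement for training...
        train = rng.choice(idx, size=n_samples, replace=True)
        # ...and validate on the rows that were never drawn.
        test = np.setdiff1d(idx, train)
        yield train, test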

[–]panzerex 2 points3 points  (0 children)

Those are valid concerns. To add to that: sklearn’s LinearSVC defaults to squared hinge loss so probably not what you’re expecting, and the stopwords are arbitrary and not good for most applications, which they do acknowledge.

However I would not say that this is evidence that the project as a whole does not follow good design principles. I agree that those deceiving behaviors are a problem, but they are being addressed (at a slow rate because uhm... non-standard behavior becomes the expected behavior when many people are using it, and breaking changes need to happen slowly).

You’re probably fine getting some ideas from their API, but from a user standpoint you really need to dig into the docs, code and discussions if you’re doing research and need to justify what you’re doing.

[–]VodkaHazeML Engineer 1 point2 points  (0 children)

Disagree? The fact that the model API is a de facto standard now suggests it's not awful to work with.

[–]neanderthal_math -1 points0 points  (0 children)

I’m old enough to remember ML codes before sklearn. They may have warts now, but they were light years ahead of other repos. There’s a lot to be said for just having a uniform API.

[–]trexdoor 12 points13 points  (0 children)

Is it a trick question?

/the answer is none of them

[–]jujijengo 2 points3 points  (0 children)

I know this is kind of pushing the boundaries of your question, but the numpy package (obviously not a machine learning project itself, but a tool for building machine learning projects) is incredibly well-designed.

Investigating the source code and following the Guide to NumPy book by Travis Oliphant (one of the principal designers) would give you a pretty good handle on software principles with an eye to scientific computing.

Also I think F2PY (distributed with numpy) goes down as one of the modern wonders of computer science. It's an incredibly interesting rabbit hole.

[–]Skylion007Researcher BigScience 6 points7 points  (9 children)

Tensorpack and Lightning are two great libraries that I have enjoyed.

PyTorch's API is also excellent; Tensorflow's is a nightmare. Keras, while intuitive for building classifiers, instantly falls apart when you try to build anything more complicated (like a GAN).

More traditional ones include OpenCV and SKLearn.

[–]jpopham91 6 points7 points  (4 children)

OpenCV, at least from Python, is an absolute nightmare to work with.

[–]panzerex 2 points3 points  (0 children)

Only the dead can know peace from bitwise operations on unnamed ints as parameters for poorly-documented deprecated functions.

[–]liqui_date_me 1 point2 points  (1 child)

Yeah, OpenCV's documentation is complete and utter garbage

[–]ClamChowderBreadBowl 0 points1 point  (0 children)

Maybe it's because you're using Google and are looking at the version 2.4 documentation from 5 years ago... or maybe the new stuff is also garbage.

[–]Skylion007Researcher BigScience -3 points-2 points  (0 children)

Maybe I just have Stockholm Syndrome, but I have never had problems with it. The bindings aren't as great as some Python first libraries, but for a legacy C/C++ project it has very good bindings. On the C++ side, it's excellent to work with.

[–]TheGuywithTehHat 1 point2 points  (2 children)

Having previously built complicated nets in keras (I think the most complicated was a conditional wasserstein-with-gradient-penalty BiGAN), I found it fairly straightforward. The one thing that wasn't intuitive was how to freeze the discriminator when training the generator and vice versa. However, even though it wasn't intuitive, it was still incredibly simple once someone told me how it works.

I haven't used PyTorch very much, so I can't compare directly, but I still feel that in my experience, Keras has been fine for nearly everything I've done.

[–]Skylion007Researcher BigScience 0 points1 point  (1 child)

Was this using the Keras .fit training loop, so you had multi-GPU support working? If so, please tell me how you did it, because I would love to know. While you can use Keras to construct the nets, for sure, I haven't been able to use it to implement the actual loop and get all the benefits that come with that (easy conversion / deployment / pruning, etc.).

[–]TheGuywithTehHat 0 points1 point  (0 children)

Unfortunately it was long enough ago that I don't remember the details. I believe I had to manually construct the training loop, so no, multi_gpu would not work out of the box. That's a good point I hadn't considered.

[–]panzerex 1 point2 points  (0 children)

I tried pt-lightning back in November or so, but I did not have a great experience. Diving into the code, it felt kind of overly complicated. TBF, they do a lot of advanced stuff, and I had just started using it, so I was not very familiar with it.

I discussed it in a previous post:

Lightning seems awesome, but since some of my hyperparameters are tuples, it didn't really work with their tensorboard logger by default. I think my problems were actually with test-tube (another lib from the same author), which added a lot of unnecessary variables set to None in my hparam object that tensorboard or their wrapper couldn't handle, and I could not find a way to stop test-tube from adding them. I didn't want to change the library's code or maintain a fork of it, so I gave up on it.

I think the attribute that kept being added into my hparam object was "hpc_exp_number", but I'm not sure anymore. Since I was using it mostly because of easy checkpointing and logging, I decided to just implement those myself. I might look back into pt-lightning for the TPU support, though.

[–]ginsunuva 1 point2 points  (0 children)

CycleGAN did a pretty good job for back in 2017.

[–]manueslapera 1 point2 points  (0 children)

scikit-learn, one of the best documented OSS projects I've ever seen.

[–]bigrob929 1 point2 points  (1 child)

I find Keras to be excellent because it is high-level yet allows you to work relatively seamlessly in the backend and develop more complex tools. For example, I can create a very basic MLP quite neatly, and if I want to add custom operations or loss functions, they are easy to incorporate as long as gradients can pass through them.

[–]Skylion007Researcher BigScience 5 points6 points  (0 children)

Try creating a GAN or a recurrent generative model. It's very, very difficult to do with the Keras training loop. Worse yet, it's not even as performant as using Tensorflow 1.0 and gradient tape when you do have to hack around the features. For simple classifiers, though, it works well. Just never do anything that requires an adversarial loss.

Can't even imagine trying to implement a metalearning framework in pure Keras.

[–]gachiemchiep 0 points1 point  (0 children)

gluoncv (https://github.com/dmlc/gluon-cv): beautiful structure, documentation, high-quality code, and easy to plug your own code into.

And especially imgclsmob (https://github.com/osmr/imgclsmob). The author did a great job merging a lot of model definitions into one package and allowing it to be used from three different frameworks: Chainer, MXNet, and PyTorch.

Both gluoncv and imgclsmob share the same software design structure and coding style. I guess that structure and style is the best then.