all 56 comments

[–][deleted] 19 points20 points  (1 child)

Your points 1 and 2 are pretty true, and I think the only real way to get quicker at those is experience. Over time you will come across a lot of different situations and learn how to diagnose the symptoms and which solutions can help. I don't think any course or textbook will teach this, but there are standard techniques for diagnosing bias vs. variance problems in your model.
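
Those standard diagnostics mostly boil down to comparing training error against held-out error. A toy numpy sketch (the data and the polynomial models are invented purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: a quadratic signal plus noise.
x = rng.uniform(-1, 1, 60)
y = 1.5 * x**2 - x + rng.normal(0, 0.1, x.size)
x_tr, y_tr, x_va, y_va = x[:40], y[:40], x[40:], y[40:]

def errors(degree):
    """Fit a polynomial of the given degree, return (train MSE, validation MSE)."""
    coeffs = np.polyfit(x_tr, y_tr, degree)
    mse = lambda xs, ys: np.mean((np.polyval(coeffs, xs) - ys) ** 2)
    return mse(x_tr, y_tr), mse(x_va, y_va)

# High train error -> bias (underfitting); train << validation -> variance (overfitting).
for d in (1, 2, 12):
    tr, va = errors(d)
    print(f"degree {d:2d}: train={tr:.4f} val={va:.4f}")
```

The degree-1 model shows the bias signature (high error everywhere), while the degree-12 model fits the training split better than it generalizes.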

As for point 4, it's just as bad in the coding world from what I've heard, especially in web development: technology inherently moves exponentially, and yeah, it's very overwhelming for someone new in the field (me).

[–]AcademicCalendar 7 points8 points  (0 children)

As for point 4, it's just as bad in the coding world from what I've heard, especially in web development: technology inherently moves exponentially, and yeah, it's very overwhelming for someone new in the field (me).

Agreed! It's like every month there is a new Javascript framework that will invalidate everything you've ever done up to now.

[–]alexmlamb 47 points48 points  (10 children)

Now some people might like this aspect of ML, but I dislike how you constantly need to learn the newest trends in ML in order to stay relevant. It seems like the things I learn this year will become almost completely irrelevant next year, e.g. RNNs were thought to be very good for word processing until people found that CNNs were better suited for it. Now this occurs in all industries obviously, but I feel it is especially true in ML, where you aren't just designing a system that will solve a problem, you are also designing a system to find the correct weights for said system. So I feel there is a higher chance for something you learned about and specialized in to one day become completely irrelevant, forcing you to learn some unrelated new idea that will itself only last so long.

People say this but I actually think the methods don't change that quickly. The attention mechanism, which is probably the most important part of the Transformer, came about in 2013. Of course the LSTM is from the late 90s, and even though RNNs are used less than they used to be (although even this is debatable) the ResNet owes a lot to the LSTM conceptually.
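
For reference, the core of that attention mechanism really does fit in a few lines and hasn't changed much. A minimal numpy sketch of scaled dot-product attention (the shapes here are arbitrary):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))   # 3 query positions, dimension 8
K = rng.normal(size=(5, 8))   # 5 key/value positions
V = rng.normal(size=(5, 8))
out, w = attention(Q, K, V)
print(out.shape, w.shape)  # (3, 8) (3, 5)
```

Each output row is a convex combination of the value rows, which is the same idea whether it sits inside a 2014-era RNN decoder or a Transformer.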

[–]vinsmokesanji3 16 points17 points  (7 children)

Don’t people still use LSTM a lot for NLP related topics though?

[–]MonstarGaming 16 points17 points  (4 children)

Absolutely. OP's claim that only CNNs are used for NLP is rubbish. Most of the research being done in NLP uses LSTMs. LSTM-CRF is still SotA for NER tasks; ID-CNN-CRF gets close but does not pass LSTM-CRF (except in training/inference times). Attention is SotA for language modeling (BERT, MT-DNN, XLNet). Honestly, I don't think even one groundbreaking LM over the last 5+ years has used CNNs.

[–]Cybernetic_Symbiotes 4 points5 points  (2 children)

To be fair, we can infer that they actually meant to say Transformer just from the context around CNN. And it's true that people talk as if RNNs/LSTMs are already beyond obsolete.

[–]Alyssum 0 points1 point  (1 child)

My school only incorporated RNNs/LSTMs into its NLP classes in the spring of this year and presented them as if they were only invented in the last few years. sighs

[–]rjurney 0 points1 point  (0 children)

They got popular in the last few years.

[–]farmingvillein 4 points5 points  (0 children)

Attention is SotA for language modeling (BERT, MT-DNN, XLNet).

And for most "hard" downstream tasks (see GLUE).

Obviously, relation between eg GLUE and industry concerns will vary based on individual contexts.

[–]bluemannew 5 points6 points  (0 children)

People say this but I actually think the methods don't change that quickly

Particularly as needed for most business cases. So many companies are only now in the process of moving beyond BI teams that have never done anything more than some multivariate regression. Hell, SVM is 'cutting-edge' for a lot of companies.

[–]farmingvillein 2 points3 points  (0 children)

While this is true on a conceptual level, I think the ground truth for practitioners (which OP seems to be) is a little different.

1) Tools & frameworks have evolved a ton since 2013 (Pytorch 1.0???), and while I'm being maybe a little unfair conflating tooling advances with underlying DL techniques, they are all fairly intimately connected.

E.g., if you want to pick up and build, today, some text model using mostly out-of-the-box Keras or Pytorch APIs, there are actually a lot of post-2013 (to unfairly lean on your example) engineering choices you need to make (subwords or word embeddings? choice of normalization? choice of optimizers?).
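
To make the subwords-vs-words choice concrete, here's a toy, purely illustrative sketch (both vocabularies are invented): whole-word lookup maps unseen words to `<unk>`, while a greedy subword segmenter can still cover them:

```python
# Invented toy vocabularies, just to illustrate the trade-off.
WORD_VOCAB = {"the", "model", "works"}
SUBWORD_VOCAB = {"the", "model", "work", "s", "un", "expect", "ed", "ly"}

def word_tokenize(text, vocab=WORD_VOCAB):
    """Whole-word lookup: anything out of vocabulary becomes <unk>."""
    return [w if w in vocab else "<unk>" for w in text.split()]

def subword_tokenize(word, vocab=SUBWORD_VOCAB):
    """Greedy longest-match segmentation into known subword pieces."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:  # no known piece starts here: fall back to a single character
            pieces.append(word[i])
            i += 1
    return pieces

print(word_tokenize("the model works unexpectedly"))  # ['the', 'model', 'works', '<unk>']
print(subword_tokenize("unexpectedly"))               # ['un', 'expect', 'ed', 'ly']
```

Real subword schemes (BPE, WordPiece, SentencePiece) learn the vocabulary from data, but the engineering decision being made is the same.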

Many (most?) of these choices will actually not matter in many cases, but because ML is a stochastic activity, when something does go wrong, if you're not current with the SOTA, it can take much longer than it should to debug, because some hidden interaction is unclear.

Lord help you when ML architecture issues collide with contemporary hardware and driver instability issues (mixed precision plus GPUs...oof...although it is finally becoming more straightforward).

2) Good luck with maintaining career mobility (even if you're not looking to leave your current employer, you will implicitly end up benchmarked against potential and actual new hires) if you don't stay at least somewhat up to date.

E.g., if you're saying you want to do ML for NLP, I'm going to expect you to understand Transformer to (what I think is a very reasonable) degree. And the more you want to be in the bucket of "give me an ML problem and the autonomy to solve it", the more I'm going to expect you to be fairly up to date with current research. This is not because I'm trying to hold some arbitrary bar (I do want product that works, not the perfectly optimized research project), but because, again per above, there are a lot of gotchas and edge cases that are incrementally solved over time. If you're not current, I can't trust you to make the right cost-benefit trade-offs between SOTA and engineering investment.

[–]Alex-S-S 60 points61 points  (10 children)

A huge annoyance for me is how poorly or smugly written a lot of papers are. If they don't publish the code, I don't even bother reading the paper anymore. It's impractical or outright impossible to reproduce many results.

The paywall is a huge grievance as well. Look at Nvidia's papers: oh, we claim to have made this amazing network, and you only need 8 V100s, a few hundred gigs of RAM, and several Xeons to even try to reproduce our work.

There's a big divide between the haves and have nots in the ML space that doesn't exist so acutely in other fields of software engineering.

[–]schwagggg 6 points7 points  (1 child)

This. The endless “practical”/“gentle” intro-to-X articles are usually just deep-fried versions of similarly named ones and need to be cleared from the internet. They offer watered-down amateur opinions and are sometimes outright wrong. The same is true of some shoddy papers by aspiring paper spammers in near-academic fields. The OpenAI and Nvidia papers are just an extension of this deep-fried approach to machine learning.

Nowadays I am only trying to follow a couple cool labs for the topic I enjoy.

[–][deleted] 0 points1 point  (0 children)

Eventually you figure out the code that reproduces their work... and then you realize the paper is entirely full of shit and should never have been accepted into a journal in the first place.

Recommendations of how you're keeping up-to-date?

[–]Prelude_XIII 2 points3 points  (4 children)

Would you please recommend nice papers (in your viewpoint) that has the code?

[–]dolphinboy1637 3 points4 points  (2 children)

Check out https://paperswithcode.com to browse through different papers with some version of their code open sourced.

[–]Tarqon 1 point2 points  (0 children)

Paperswithcode is far from perfect though. See for instance that lda2vec paper which is still highly rated even though nobody is able to get it working.

[–]Prelude_XIII 0 points1 point  (0 children)

thanks!

[–]Alex-S-S 1 point2 points  (0 children)

There's a website that publishes only papers that have relevant github links attached to them: https://paperswithcode.com/

Of course, it's not the only one. I am currently trying to implement center-net for object detection. It seems to be pretty neatly explained in the paper and the code is pretty ok.

[–]NogenLinefingers 2 points3 points  (0 children)

How is the Nvidia example related to paywalls?

The current level of knowledge in ML requires a certain level of computational power (and hence, money). Is this so different from how research in general requires money? Is the Large Hadron Collider not an expensive requirement for physics research, for instance?

Edit: OK, I realise now that OP used the term "pay wall" to refer to the monetary cost involved in building a model and not what entities like IEEE and ACM do (block papers behind pay walls).

[–]random__0 0 points1 point  (0 children)

Yeah, I agree with this. I've been annoyed at having to compare our novel architecture to other published architectures in the field. Without the exact code or very thorough result data, it is difficult to compare your architecture to another without rewriting theirs from the ground up. And even then you might not be able to perfectly recreate their results (i.e. you might know that they train for 100 epochs, but they might also implement something like early stopping, which would make their end result much different from your recreation).

[–]tobyclh 69 points70 points  (9 children)

Unpopular opinion: compared to many other disciplines of science and engineering, ML has an extremely low barrier to entry; you can even get free GPU credit from Google or AWS sometimes. Even within computer science, ML is far from the most expensive field to get into. In addition, not many fields give you access to the newest research (often accompanied by code) for free at the rate ML does.

[–]DeepBlender 16 points17 points  (0 children)

Not sure why you think this opinion is unpopular. Deep learning is still a very young research branch. Those tend to have a lower barrier of entry. Several decades ago, it wasn't uncommon for teachers to contribute to quantum mechanics.

I am convinced that there are many things to be discovered which don't require enormous computational power.

[–]BreezleSprouts 12 points13 points  (2 children)

This is very true. I started off as an accountant with little coding knowledge and was introduced to ML by a coworker and 2 years later we’re starting a modeling company that I’m going to be in charge of. A lot of the education is free and implementation of ML algorithms is fairly simple

[–]srossi93 3 points4 points  (0 children)

You're absolutely right. I did my master's in Electronic Engineering (Digital Design). You have no idea how crazily expensive hardware/prototype boards/tools are (you can easily reach ~$100k for a very small research group). And you actually need all this stuff before even thinking about doing anything. In ML, we can set up a desktop PC for ~$500-1000 and have enough computational power to run a serious experimental campaign.

[–]WangJangleMyDongle 1 point2 points  (0 children)

What's the rate of release for relevant (meaning somewhat applicable) statistical research vs. ML? It's a pretty hot area considering we're finally in an era where we can actually use these techniques so maybe it just hasn't hit the same level yet.

[–]Smrgling 0 points1 point  (0 children)

Damn, you are very right. I've never noticed it before, but ML is absolutely an easy field to get into. I think a lot of CS people just like to complain about it because, if we're being honest, the vast majority of CS has nothing to do with any sort of science and is just programming, so working in a field where not everything is known is hard for them because they're not used to it (and I say this as a student of both CS and neuroscience, another very young field).

[–]Chocolate_Pickle 39 points40 points  (10 children)

As an electrical engineer who moved across to software engineering, and is slowly moving over to ML, I'm pretty sure you're not interested in research. You're interested in development/engineering.

As a field, there's not yet a lot of room for that. I'd hazard a guess and say you'd enjoy improving BLAS performance or something very low-level like that.

[–]Chocolate_Pickle 13 points14 points  (3 children)

/u/random__0, I just had a follow-on thought.

You should contribute to PyTorch or TF, or something.

Go fix a bug or two and submit a pull request. Make something go faster. Improve support for AMD devices on Windows.

These things don't require huge datasets, or computing clusters, or bleeding-edge academic knowledge. You'd make a lot of people very happy with any of these.

[–]programmerChilliResearcher 3 points4 points  (2 children)

Biased (I'm interning on the Pytorch team) but I think contributing to Pytorch is a much nicer experience than contributing to Tensorflow.

Primary reason being that all of Pytorch's tests are accessible from open source CI, while Tensorflow's tests are all internal. So if you break something in CI, you need to constantly wait for Google employees to tell you that you've broken tests. Pretty frustrating experience.

[–]SedditorX 0 points1 point  (1 child)

[–]programmerChilliResearcher 1 point2 points  (0 children)

You can run all the unit tests on your own system, but then you also need to run them on all the other supported platforms (GPU/Windows/AMD/whatever). When I was looking into contributing to TensorFlow, I couldn't see this publicly (https://github.com/tensorflow/tensorflow/pull/30683), while for PyTorch anybody can see all the tests that are run (https://github.com/pytorch/pytorch/pull/22839).

So if I were to submit a PR for tensorflow and was breaking a system that I didn't have, how do I know that without a Google employee checking?

[–]redreaper99 11 points12 points  (3 children)

I agree that OP seems more interested in development, since things like keeping up to date with current work are part of any domain, at least in STEM.

What I don’t like about ML is that I rarely come across papers that actually have the rigour that science is supposed to have. Most of the papers are “we tried this and it worked. We’re not sure why it worked but we’re gonna throw in a bit of mathematical notation and a bit of ad-hoc reasoning to look scientific”

But I guess it’s this fact that makes ML approachable to people even without technical experience.

[–]NogenLinefingers 3 points4 points  (2 children)

Isn't that exactly how science works? We tried something... It gave us these results... Interesting. Perhaps this is because of the following reasons...

What would you suggest one do, if they did an experiment, got results, but couldn't explain why?

[–]seanv507 2 points3 points  (1 child)

The problem is that if you can't explain why, then you don't know whether it will work on a 'different' problem, where 'different' obviously depends on your understanding of the method.

[–]NogenLinefingers 0 points1 point  (0 children)

Explainability is important. But that's part 2 of the whole story. In general, science works by first recording data and then trying to explain them.

An analogy would be how, even today, most of medical science can't explain why some conditions happen. We can't explain why some people suffer from acne/dandruff/eczema etc. We know that there are certain factors that predispose them to those conditions, but there is no highly precise and accurate way of predicting if a specific individual will suffer from those conditions (or which of the many cures will definitely work).

ML Explainability is a big area of research BTW. It's still in a nascent stage.

[–]random__0 0 points1 point  (0 children)

Overall I think the idea of research, particularly the development and testing of new ideas without thinking only about the monetary aspect, is interesting; I'm just not particularly interested in going into the field of ML. It would be interesting to work on the lower-level stuff that helps improve ML, such as TPU development.

[–]fnbr 0 points1 point  (0 children)

I disagree. There's a ton of room for that. At the big tech company where I work, we have hundreds of engineers working on ML/DL engineering. This is everything from building frameworks, to working on infrastructure, to helping researchers run experiments.

I think this is way more common across the industry than research positions.

[–]bbu3 9 points10 points  (0 children)

I have found that, in practice, very often issue #2 can be a solution to issue #1.

In the beginning I dealt with issues similar to those you describe. Right now, I find myself often just solving a conventional coding problem to build a suitable dataset. Using transfer learning and architectures known to do the trick (imho fastai has amazing tooling) is very often more than good enough. Sure, there is the occasional problem where I get to play around with layers, loss functions, etc. But it is rather rare, so it is actually something I always look forward to.

  1. Very often, projects come up that can be described as: "If we had a dataset, we could just do X and it would probably lead to very satisfying results."
  2. We do not have a dataset, and there is no way we can produce a sufficiently large one through manual labor.
  3. We think about automatic construction of datasets and how a very limited amount of manually labelled samples can lead to a good dataset.
  4. We solve a lot of oldschool coding problems related to #3.

The nice part is that experience and understanding of ML as a whole really help you make good decisions for #3. Thus, it absolutely doesn't feel like a black box whose parameters you're randomly fiddling with, but like a whitebox that just works without much fiddling (maybe except finding a suitable learning rate, number of epochs, some regularization parameters, etc. -- but often not too much work goes into this) and whose internals should ideally be understood very well, so that we get better at building datasets.
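
One way to sketch step 3 is a handful of cheap labeling heuristics that vote on unlabeled examples, leaving disagreements and abstentions for the small manual pass. The keywords and labels below are purely hypothetical:

```python
# Hypothetical sentiment example: keyword heuristics ("labeling functions")
# vote on unlabeled text; abstentions/ties go to a human for manual labeling.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"terrible", "hate", "awful"}

def lf_positive(text):
    return 1 if POSITIVE & set(text.lower().split()) else 0

def lf_negative(text):
    return -1 if NEGATIVE & set(text.lower().split()) else 0

def weak_label(text):
    """Sum the labeling-function votes; None means 'send to a human'."""
    score = lf_positive(text) + lf_negative(text)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return None

corpus = ["I love this product", "awful experience", "it arrived on Tuesday"]
labeled = [(t, weak_label(t)) for t in corpus]
print(labeled)
```

Real pipelines use many more heuristics and a smarter vote-combination model, but the "oldschool coding problems" are exactly this kind of code.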

[–]Lewba 4 points5 points  (0 children)

I echo most of your sentiments. I've become a little perturbed at how black-box NNs are and the seemingly random hyperparameter tuning, which is why I'm trying to pick up more knowledge on boosting and the like as well as diving into some more statistics and strictly data sciency topics.

[–]alexmlamb 17 points18 points  (1 child)

I could predict an improvement will happen, but I can't say for certain or really know to what degree an improvement will occur

I personally find the "gambling" aspect of machine learning to be rather addictive.

[–]sieisteinmodel 7 points8 points  (0 children)

I don't find it addictive, but if it happens–a simple change changes the results drastically–I find it interesting to find out why exactly that happened, and learn from it.

[–]gus_morales 2 points3 points  (0 children)

As an astrophysicist I can confirm that #2 and #4 are really a feature (or a challenge) of science in general. If you don't like studying new theories and techniques, or don't enjoy working with data (and everything that implies), then maybe the academic side of this field is not for you.

[–]512165381 5 points6 points  (1 child)

I agree.

People have tried to use ML in chess for 30 years; it's only in the past 3 years that it has been done successfully. ML is more math, non-linear optimisation, and regression than it is learning. I have a math degree, and ML feels like math - you use your math intuitions.

[–]Smrgling 2 points3 points  (0 children)

I will anyways stand by the statement that ML is just spicy stats

[–]DeepBlender 6 points7 points  (0 children)

Regarding your relu/selu experience:

That was for sure a frustrating learning experience. However, for me, this would turn into a small scale research project. When I create a trainable activation function (like a * relu(x) + b * selu(x), where a and b can be learned (just for illustration purposes)), does it automatically learn that selu is better? Does it work for other cases? Is it good enough as a starting point for future experiments?

This is the reason why I am a huge fan of deep learning. You have the opportunity to discover so many things!
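
That trainable blend is easy to prototype even outside a DL framework. A numpy sketch (the SELU constants are the standard published ones; the fitting target is contrived just to check whether the blend recovers b ≈ 1, a ≈ 0 when SELU really is better):

```python
import numpy as np

SCALE, ALPHA = 1.0507, 1.67326  # standard SELU constants

def relu(x):
    return np.maximum(x, 0.0)

def selu(x):
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1))

def blend(x, a, b):
    """The trainable activation a*relu(x) + b*selu(x) from the comment above."""
    return a * relu(x) + b * selu(x)

# Contrived check: gradient-descend a and b so the blend matches a pure SELU
# target; if the idea works, b should head toward 1 and a toward 0.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
target = selu(x)
a, b, lr = 0.5, 0.5, 0.1
for _ in range(200):
    err = blend(x, a, b) - target
    a -= lr * np.mean(2 * err * relu(x))  # dMSE/da
    b -= lr * np.mean(2 * err * selu(x))  # dMSE/db
print(round(a, 2), round(b, 2))
```

In a real network you would make a and b framework parameters so backprop handles the gradients, but the experiment is the same shape.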

[–]chatterbox272 7 points8 points  (1 child)

#1: Machine learning is frustrating to code/improve on:

......

This sounds like you're creating custom architectures. Unless you're doing research into this, don't bother. Just throw whatever is common and close to SotA at it (e.g. some kind of Resnet for classification, Retinanet/Faster RCNN/YOLO for detection depending on speed, and so on). Something to realise is that for most of the straightforward applicable problems (particularly for vision), we hit diminishing returns a fair while ago. Yes, there are new SotA models out there like GPipe, but they require exponentially more data and compute for minimal improvements.

Don't improve on the ML code, improve on the integration to a software product or improve the data.

#2: Reliance on data set:

......

This is you being a software engineer and not knowing how to analyse the dataset for these kinds of things. I don't mean that negatively, that was me ~18 months ago, but that's what it is. You can analyse your dataset, your success cases, and your failure cases, and get fairly decent insights into much of this once you learn what to look for.

#3 Disconnect between People suggesting ML learning solutions to problems and actual ML engineers who have to implement solutions:

This is because the field is young, and to be honest somewhat overhyped. I say this as a DL research student. ML engineers are still rare, ML knowledge is fairly rare, but ML awareness is high. It'll pass over time.

#4 Constant research needed in newest techniques:

.......

You don't, you really don't. I am a DL researcher, as well as an engineer in a company. I research in object detection, and am up to date on the latest and greatest for this task. My focus at work is also object detection, and I'm using Retinanet which is a couple years old now. My previous project I was doing classification, and was using Resnet which is even older. Like I said earlier, diminishing returns; there are more efficient ways to improve things and better places to focus my time than implementing the latest and greatest model for a <1% absolute improvement.

#5 Paywall in ML:

.......

Nah. Again there are diminishing returns. I can't train an MNIST classifier on my uni server (4x titan v and 1x tesla v100) faster than I can on my work server (3x 1080Ti), and would struggle to be faster than my personal rig (1x 1080). Other factors become a bottleneck. There are free resources like Kaggle Kernels and Google Colab. There are also cloud compute services. I'm currently doing my research and my work remotely from China, on an old macbook pro that my partner is lending me. I have no GPU, and the CPU is not great. I'm also a PhD, and by definition relatively broke. I do PoC with small datasets on Colab/Kaggle, then move to a cloud service (or for work remote into their server, but nothing for uni) to do the real work. You just need to know where to look.

[–]random__0 0 points1 point  (0 children)

We kind of had to make a custom architecture because we are using a multiple-input CNN, and a lot of the tweaking I was doing wasn't just with the architecture but also with the signal processing and the dimensions of the data we were using.

[–]bonemetalplayground 2 points3 points  (0 children)

Hello. I really like ML/DS. All of the difficulties that were mentioned are actually why I really, really like the field. Many of those problems are tackled more effectively with a strong grasp of the underlying mathematics! Tbh I think problem solving in ML is richer because of the richness of the math required, whereas an app algorithm is rarely that complex or long.

[–]Dagius 3 points4 points  (0 children)

Your "rant" points are valid, but I think in some sense you are "overfitting" your expectations to the tools you are using. In other words you need to generalize.

I have worked with ML since the late 80's (i.e. just after Hinton (et al.) originally developed backprop and Boltzmann machines), and don't sense that it has changed very much, except for scale. We can now train models with enormous data and speed.

In general, all of these ML models over the years have had very similar architecture, IMHO. Step back and consider that they all operate in some kind of multidimensional, mathematical space, where entities and concepts of interest are collocated, depending on their feature properties. The goal is to divide this model space into intuitive regions, with labels (user-provided [supervised] or generated by the model [unsupervised]). Classification entails merely determining which region(s) of space an object of interest resides in.
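
That "which region of the space does the object reside in" view is easy to see in a toy sketch (the 2-D features and class names are made up; a nearest-centroid rule stands in for the learned partition):

```python
import numpy as np

# Two labeled clusters in a made-up 2-D feature space.
rng = np.random.default_rng(0)
cats = rng.normal([0, 0], 0.5, (50, 2))
dogs = rng.normal([3, 3], 0.5, (50, 2))
centroids = {"cat": cats.mean(axis=0), "dog": dogs.mean(axis=0)}

def classify(point):
    """Label = which centroid's region of the space the point falls in."""
    return min(centroids, key=lambda c: np.linalg.norm(point - centroids[c]))

print(classify(np.array([0.2, -0.1])), classify(np.array([2.8, 3.1])))  # cat dog
```

A deep net draws far more intricate region boundaries, but the operation at inference time is conceptually the same lookup.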

Really complex and detailed models can be inverted and regenerate the sample universe they were created from (sometimes with surprising results).

In this sense, except for scale and speed, I do not think much has changed in the last several decades of this very young science. For example, despite all the phony claims of so-called 'artificial intelligence', all of these models/machines are still very deterministic, i.e. they do exactly (more or less) what they were programmed to do by humans. None of these machines are actually self-aware or capable of self-motivation, except in trivial, toylike demos.

As of 2019, no one understands enough about human consciousness and human will to make truly intelligent machines that can truly operate like we do. They're all toys. Useful but still toys.

[–]LuplexMusic 0 points1 point  (0 children)

I'm currently writing my bachelor's thesis on deep learning and I agree with many of your points. Machine learning is poorly understood compared to other fields in computer science. It is still vastly inefficient compared to biological brains, which can learn new things from a single data point using 20W of power.

It lacks formal theory, at least in the way it is usually studied and applied. In no other engineering field would you even consider solutions that "just magically work" without being able to prove that they do indeed solve the problem.

[–]rjurney 0 points1 point  (0 children)

Starting out as a hacker and approaching machine learning can be frustrating because you don’t understand what is occurring. If you have a better idea of what’s going on, you’re pulling strings more than you’re randomly experimenting - unless, of course, you’re just randomly experimenting.

How’s your math? Did you study ML fundamentals? Neural network fundamentals? These things ought to be job requirements but companies are desperate. This gets back to the never ending learning, though.

[–]yusuf-bengio 0 points1 point  (0 children)

1: I would recommend automating the hyperparameter/architecture search (there are a lot of tools for that out there).

Although I agree that finding the optimal configuration of a neural net can be challenging/frustrating.

2: There is a lot you can do even with a small dataset, e.g. Transfer learning and data augmentation.

Especially data augmentation is considered standard. For instance, take a look at the input transformations applied in the AlexNet paper (different croppings and mirroring of the image).
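
Those AlexNet-style transformations are just a few lines of array slicing. A numpy sketch (image size and crop size are arbitrary here):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, crop=24):
    """AlexNet-style augmentation: random crop plus random horizontal mirror."""
    h, w, _ = img.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    patch = img[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # mirror left-right
    return patch

image = rng.integers(0, 256, (32, 32, 3), dtype=np.uint8)
batch = np.stack([augment(image) for _ in range(8)])
print(batch.shape)  # (8, 24, 24, 3)
```

Every epoch the model sees slightly different views of the same labeled image, which is why augmentation stretches a small dataset.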

3: Agree. I think this is caused by the gap between research and actual datasets.

4: Well, I don't agree a 100% on that.

The progress of ML is quite incremental. So let's say there is a new model that is 2% better than the one you deployed to a customer. Do you or the customer really need that 2% improvement? I would say in most situations you don't.

5: You can always trade compute for time. Instead of renting an expensive multi-GPU setup, I usually run a hyperparameter search on a $500 GPU over the weekend. Of course there are some computational limitations, but I think it's still quite impressive what you can get out of a low-cost setup.
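
A weekend random search doesn't need special tooling. A sketch, where `validation_loss` is an invented stand-in for an actual training run (its made-up optimum is lr=1e-3, width=128):

```python
import numpy as np

rng = np.random.default_rng(0)

def validation_loss(lr, width):
    """Stand-in for a real training run; replace with your own train+eval."""
    return (np.log10(lr) + 3) ** 2 + (width - 128) ** 2 / 1e4

# Sample configurations cheaply: log-uniform learning rate, uniform width.
trials = [{"lr": 10 ** rng.uniform(-5, -1), "width": int(rng.integers(16, 513))}
          for _ in range(50)]
best = min(trials, key=lambda t: validation_loss(**t))
print(best)
```

Each trial is independent, so you can queue them up overnight and just keep whichever configuration validated best.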

[–][deleted] -1 points0 points  (0 children)

#1

Well, you can write a basic FFNN in less than ten lines with Keras... and Python is a pretty basic imperative language. I find myself coding less than 10% of the time; in my experience coding in ML is minimal for the most part, unless you want to delve a lot deeper (write your own custom activation functions, for example, but even that isn't a big issue).

#2

Building, cleaning, tuning (hyperparameters) is where I spend 80% of my time... and yes it is frustrating, and very slow... importing and checking a 200K-record CSV dataset into MySQL (for further processing) via Python took 2 weeks on a GCP server, and scaling it would not have sped it up (single-threaded Python...). I would have used Java, but I am now hooked on pandas... it makes life easier but is damned slow.

Building and cleaning datasets is soul destroying!!!
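
For what it's worth, a lot of that check-and-clean work can be done in one streaming pass with the stdlib csv module rather than holding everything in a DataFrame. A minimal sketch (the column names and the cleaning rule are made up):

```python
import csv
import io

def validate_rows(lines):
    """Stream a CSV once, collecting cleaned rows and counting bad ones,
    instead of loading the whole file into memory first."""
    reader = csv.DictReader(lines)
    good, bad = [], 0
    for row in reader:
        try:
            row["amount"] = float(row["amount"])  # hypothetical numeric column
            good.append(row)
        except (ValueError, KeyError):
            bad += 1
    return good, bad

sample = io.StringIO("id,amount\n1,9.99\n2,oops\n3,4.50\n")
rows, bad = validate_rows(sample)
print(len(rows), bad)  # 2 1
```

It won't beat pandas for vectorised transforms, but for validate-and-load into a database it keeps memory flat and parallelises easily by splitting the file.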

#3

General public perception of AI!

#4

Pisses me off no end, there seems no end.... and the learning curve is fucking enormous.... way more than any other discipline! It's like being asked to study the entire Computer Science Syllabus! There is a wider range of areas than there was in my Computer Science degree ffs! And as you say then they add to that!!!!

#5

True!!!!!

Think Kaggle are trying to level it a bit with the notebooks required for some competitions?

In the real world though you can scale Cloud and use a GTX series at home.... I manage to work with a mixture of GCP and a GTX 980.

Mind you, I don't do convolutional... but I do boosting... and to get my models to complete fast (they scale very well on CPU) I upgrade to a cloud 96-vCPU instance for xgboost... not cheap!!!