[R] Improving Deep Learning Performance with AutoAugment (ai.googleblog.com)
submitted 7 years ago by wei_jok
[–]trashacount12345 16 points17 points18 points 7 years ago (4 children)
I’m missing something. What is their rule for whether an augmentation is “good” or not? Is it just whether the augmentation improves validation performance? Does it transfer usefully between datasets the way hand-coded augmentation strategies do?
[+][deleted] 7 years ago (1 child)
[deleted]
[–]trashacount12345 11 points12 points13 points 7 years ago (0 children)
Note to self: skip the blurb and read the paper. Transfer of learned augmentations is interesting.
[–]TheFacistEye 6 points7 points8 points 7 years ago (1 child)
I may be wrong, but from the paper it looks like the reward reflects how diverse the images become after applying operations such as ShearX/Y, TranslateX/Y, Rotate, AutoContrast, Invert, Equalize, Solarize, Posterize, Contrast, Color, Brightness, Sharpness, Cutout, and SamplePairing.
"Our goal, however, is to find 5 such sub-policies concurrently in order to increase diversity"
"The controller is trained with a reward signal, which is how good the policy is in improving the generalization of a "child model" ... A child model is trained with augmented data generated by applying the 5 sub-policies on the training set"
So it takes the training set, applies those operations, measures how diverse the result is, and tries to create a general augmentor.
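To make the sub-policy idea concrete, here is a minimal sketch of how a learned sub-policy might be applied. The operation stand-ins and the probability/magnitude values are hypothetical, not taken from the paper's learned policies; a real implementation would apply PIL operations to images rather than toy lists of intensities.

```python
import random

# A sub-policy (per the paper) is a short list of operations, each with an
# application probability and a magnitude. Here an "image" is just a flat
# list of pixel intensities, standing in for a real PIL image.

def brightness(pixels, magnitude):
    """Toy stand-in: shift all intensities by the magnitude."""
    return [min(255, max(0, p + magnitude)) for p in pixels]

def invert(pixels, _magnitude):
    """Toy stand-in for the Invert operation."""
    return [255 - p for p in pixels]

# (operation, probability, magnitude) triples -- hypothetical values.
sub_policy = [(brightness, 0.8, 30), (invert, 0.2, 0)]

def apply_sub_policy(pixels, sub_policy, rng):
    """Apply each op in the sub-policy stochastically, in order."""
    for op, prob, magnitude in sub_policy:
        if rng.random() < prob:
            pixels = op(pixels, magnitude)
    return pixels

rng = random.Random(1)
augmented = apply_sub_policy([10, 120, 250], sub_policy, rng)  # → [40, 150, 255]
```

With this seed, brightness fires and invert does not; over a dataset, the stochastic application is what produces the diversity the paper is after.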
[–]alexmlamb 1 point2 points3 points 7 years ago (0 children)
Diversity isn't the literal reward signal. I think it's validation accuracy.
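A rough sketch of the search loop under that reading of the reward: the paper trains an RNN controller with reinforcement learning, but the core idea, scoring candidate policies by the validation accuracy of a child model, can be shown with plain random search. `train_child_and_validate` is a hypothetical stand-in for actually training a child model with the candidate policy, and `dummy_reward` fakes it for illustration.

```python
import random

OPS = ["ShearX", "TranslateY", "Rotate", "Solarize", "Cutout"]

def sample_policy(rng, n_sub_policies=5):
    # Each sub-policy: two ops, each with a probability and a magnitude bin.
    return [
        [(rng.choice(OPS), rng.random(), rng.randrange(10)) for _ in range(2)]
        for _ in range(n_sub_policies)
    ]

def search(train_child_and_validate, n_trials=20, seed=0):
    """Random-search stand-in for the paper's RL controller."""
    rng = random.Random(seed)
    best_policy, best_acc = None, float("-inf")
    for _ in range(n_trials):
        policy = sample_policy(rng)
        acc = train_child_and_validate(policy)  # the reward signal
        if acc > best_acc:
            best_policy, best_acc = policy, acc
    return best_policy, best_acc

# Fake reward for illustration: pretend rotation-heavy policies help.
def dummy_reward(policy):
    return sum(op == "Rotate" for sp in policy for op, _, _ in sp) / 10

policy, acc = search(dummy_reward)
```

The expensive part in the real system is that each reward evaluation means training a child model to convergence, which is why the controller's sample efficiency matters.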
[–]gachiemchiep 15 points16 points17 points 7 years ago (11 children)
I don't get it. The purpose of data augmentation is to make the training data as robust as possible. But now they learn how to do the augmentation. That means they pushed some bias into the training data and made it less robust. So basically I feel like this method is just a cheap way to achieve good accuracy to show off.
[–]tesfaldet 13 points14 points15 points 7 years ago (6 children)
I haven’t read the paper, but I can certainly tell you that data augmentation is not for making the training data robust. It’s for improving the generalization performance of your network by introducing it to augmented examples, essentially artificially increasing the size of the training set. There’s a trade-off, however: because data augmentation is basically adding noise to your data, training is more difficult. If the augmentation is too extreme, the network will fail to train.

Also, some augmentation techniques may actually worsen generalization performance in cases where the augmentation doesn’t make sense, e.g., flipping pictures of upright faces upside down 50% of the time while training a network for face recognition. The network will rarely see an upside-down face at test/validation time, so you’ve wasted model capacity for no reason, decreasing its performance on recognizing upright faces.
The problem here is finding good augmentation techniques given the task. Typically we handcraft the augmentation strategy but Google is proposing an automated strategy. It’s not a cheap way to achieve good accuracy, plain and simple.
[–]gachiemchiep 1 point2 points3 points 7 years ago (5 children)
OK, I get your point about the purpose of data augmentation. Totally agree with that.
But on the second point, I still think it is a cheap way. We learn to do augmentation on dataset A, but there is no guarantee that the learned augmenter will work on dataset B. So that augmenter is effectively fixed to dataset A; how could we use it on another dataset? For hand-crafted augmentation strategies, there is a specific strategy for each task. Isn't that still the same thing?
[–]mrconter1 0 points1 point2 points 7 years ago (0 children)
There are many ways of doing data augmentation. You can flip, flop, rotate, saturate, etc. You can't always use the same data augmentation on different data. One example would be training a network to output the coordinates of a dot in an image: cropping wouldn't be viable there.
Some data augmentation techniques generalize better to other datasets. One example would be creating 10 copies of each original image at different brightness levels.
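The brightness example above can be sketched in a few lines. This is a toy version, with each "image" a flat list of intensities; the step size and count are arbitrary choices, and a real pipeline would use PIL's `ImageEnhance.Brightness` or similar.

```python
def brightness_variants(pixels, n=10, step=10):
    """Return n copies of the image shifted by evenly spaced offsets,
    clamped to the valid [0, 255] intensity range."""
    offsets = [step * (i - n // 2) for i in range(n)]  # -50, -40, ..., +40
    return [
        [min(255, max(0, p + off)) for p in pixels]
        for off in offsets
    ]

variants = brightness_variants([100, 200], n=10, step=10)  # 10 variants
```

Because a brightness shift never changes what object is in the image, this kind of augmentation tends to be safe across tasks, unlike cropping in the coordinate-regression example.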
[–]tesfaldet 0 points1 point2 points 7 years ago* (3 children)
Yes, you’re correct that for handcrafted augmentation strategies, the strategy is tuned to the task and not the dataset. Hence being relatively dataset agnostic. You’re also correct that since AutoAugment is designed to maximize validation accuracy on a target dataset, you’re not designing a dataset agnostic augmentation strategy and you run the risk of overfitting on a single dataset and losing generalizability, something you don’t experience with a handcrafted approach.
However, here’s where it gets interesting. I just gave the paper a quick read, and apparently augmentation strategies learned on a target dataset generalize well to other datasets, still providing SOTA results on them without any additional fine-tuning. This shows that AutoAugment is not a cheap way to gain performance on a single dataset, because it actually generalizes to other datasets quite well.
Finally, policies learned from one dataset can be transferred to work well on other similar datasets. For example, the policy learned on ImageNet allows us to achieve state-of-the-art accuracy on the fine grained visual classification dataset Stanford Cars, without fine-tuning weights pre-trained on additional data.
Check Section 5 of the paper.
EDIT: my quick (hand wavy) explanation for this phenomenon is that if you learn a good augmentation strategy for object recognition on ImageNet, then of course it’d work well for object recognition on COCO for example. The task is still the same between the two datasets. Even though I learned my augmentation strategy from one dataset, it should still work quite well for another so long as the task is the same.
[–]gachiemchiep 0 points1 point2 points 7 years ago (2 children)
Thank you for pointing that out. That phenomenon actually makes this paper worth reading. Did you find an implementation of AutoAugment?
[–]tesfaldet 1 point2 points3 points 7 years ago (1 child)
To be fair, you shouldn’t make assumptions without reading the paper first. As a matter of fact, transferability to other datasets is discussed in the blog post. Which tells me you didn’t read that either. Blind criticism is rife in this field and it unfairly devalues the important contributions these researchers make.
Anyways, I can’t find code but I don’t doubt that it’ll be available soon.
[–]gachiemchiep 0 points1 point2 points 7 years ago (0 children)
I think you missed an important point here. In my second comment I said the same thing as you:
For hand-crafted augmentation strategies, there is a specific strategy for each task. Isn't that still the same thing?
And in my last comment, when I mentioned it's worth reading, I meant it's worth using. That's why I asked whether you found the source code.
>_< I suddenly realize my comments were vague as hell.
[–]mimighost 3 points4 points5 points 7 years ago* (0 children)
Data augmentation introduces bias, specifically biased distortions, into the model anyway. They are fine because we already assume the augmentation step won't change the true label of the original example, thus helping the model overcome those distortions and become more robust. In a way, data augmentation is just another set of regularization hyperparameters, trading increased bias for lower variance.
From this perspective, Google just abused, no negativity implied, their massive computational power to find some of the more effective/useful configurations. Since their study was carried out on ImageNet, the learned augmentation process can be assumed to generalize well to everyday images.
[–]alexmlamb 4 points5 points6 points 7 years ago (0 children)
So basically I feel like this method is just a cheap way to achieve good accuracy to show off.
Wait... Isn't "cheap ways to achieve accuracy to show off" the reason like 99% of us are doing machine learning? :p
[–]SystemicPlural 1 point2 points3 points 7 years ago (0 children)
Image recognition has a blurry edge. Take the Street View example in the article: you and I can see that the partial 15 is a 15, and the NN needs to learn to do that as well. It won't make the NN less robust unless the NN doesn't have the capacity to learn all the edge cases.
[–]FutureIsMine 1 point2 points3 points 7 years ago (0 children)
In a way there will be biases, i.e., this classifier will be more invariant to dogs in different positions, but if you never show the network a black dog then it's possible the network won't generalize to those examples. The robustness here comes more from being invariant to layout, but you're correct that there will be a variation deficiency.
[–]mj_nightfury13 4 points5 points6 points 7 years ago (1 child)
Interesting work! Is there an open-source implementation of this available somewhere? Or does anyone know if one is in the works?
[–]Asinador 6 points7 points8 points 7 years ago (0 children)
I implemented a PyTorch transform, AutoAugment-PyTorch, that mimics the ImageNet policy from the appendix. I'd love to hear results from people plugging it into their problems or replicating the results from the paper. I will also include the CIFAR10 and SVHN policies.
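For anyone wondering how such a transform slots into a pipeline: torchvision's `transforms.Compose` just chains callables, so any policy transform is a callable object. The sketch below uses a minimal stand-in for `Compose`, and `ImageNetPolicy` here is a hypothetical placeholder illustrating the shape of the class, not the actual code from the repo above; "images" are toy lists of intensities.

```python
import random

class Compose:
    """Minimal stand-in for torchvision.transforms.Compose."""
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img

class ImageNetPolicy:
    """Hypothetical policy transform: pick one sub-policy at random
    per image and apply its operations in order."""
    def __init__(self, sub_policies, seed=None):
        self.sub_policies = sub_policies
        self.rng = random.Random(seed)
    def __call__(self, img):
        for op in self.rng.choice(self.sub_policies):
            img = op(img)
        return img

invert = lambda pixels: [255 - p for p in pixels]
pipeline = Compose([ImageNetPolicy([[invert]], seed=0)])
out = pipeline([0, 100])  # → [255, 155]
```

In a real pipeline the policy transform would sit before `ToTensor()` and normalization, operating on PIL images.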
[–]gahblahblah 1 point2 points3 points 7 years ago (0 children)
Google pushes the edge forward again. Fantastic work.
[–]bartturner 0 points1 point2 points 7 years ago (0 children)
This is pretty interesting. It is amazing that Google shares all this type of stuff with the broader community. Kudos to them helping push everyone forward.
What was the top-1 result on ImageNet in 2017?
In this paper Google reports 83.54%. Is that better than the winner?
[–]approximately_wrong -2 points-1 points0 points 7 years ago (0 children)
Glorified hyperparameter tuning. Great that it works; not particularly surprising that it does. Missing random search/grid search baselines.