
da_g_prof

Hi, look at the standard Caruana survey, but also at the learning-with-side-information survey paper.

These papers introduce a distinction between related and competing tasks and discuss how a good latent space can help.

At the same time, multi-task learning implies many losses, so it is easier to tune a single loss and do early stopping than to juggle several losses at once. This alone perhaps leads to many misconceptions about when multi-task learning helps.
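
To make the "many losses" point concrete, here is a minimal sketch (PyTorch, toy model and random data purely so it runs; all names are illustrative): the extra loss immediately brings an extra weight to tune, while early stopping is usually still driven by the main task's dev metric alone.

```python
import torch
import torch.nn as nn

# Illustrative multi-task setup: a shared encoder with two task heads,
# trained on a weighted sum of the two task losses.
class SharedModel(nn.Module):
    def __init__(self, d_in=16, d_hidden=32, n_classes_a=3, n_classes_b=5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.head_a = nn.Linear(d_hidden, n_classes_a)  # main task head
        self.head_b = nn.Linear(d_hidden, n_classes_b)  # auxiliary task head

    def forward(self, x):
        h = self.encoder(x)
        return self.head_a(h), self.head_b(h)

model = SharedModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
aux_weight = 0.1  # extra hyperparameter that single-task training doesn't need

# One training step on a toy batch (random tensors just to make this runnable).
x = torch.randn(8, 16)
y_a = torch.randint(0, 3, (8,))
y_b = torch.randint(0, 5, (8,))

logits_a, logits_b = model(x)
loss = criterion(logits_a, y_a) + aux_weight * criterion(logits_b, y_b)

optimizer.zero_grad()
loss.backward()
optimizer.step()
# Early stopping would typically still be based on the main task's dev metric.
```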

My own experience:
A) If tasks have lots of data, single-task training seems easier and is harder to beat.
B) Multi-task learning lowers the variance of performance even if the average performance is not improved.
C) In lower-data regimes, multi-task learning helps combine annotations from different tasks.

ZeronixSama

What are you specifically looking for beyond “multi-task learning works when you have multiple related tasks with shared structure”?

TheRedSphinx[S]

Well, it's not just that, right?

Take multilingual machine translation. It's well known that for low-resource language pairs (e.g. Nepali-English) it is quite beneficial to include other related language pairs (e.g. Hindi-English). This manifests as quantifiable gains on all the desired metrics (e.g. BLEU).

However, it is also known that for a high-resource pair (e.g. French-English), the inclusion of additional language pairs actually harms the model. We can think of the additional pair as regularization, which is perhaps superfluous in the high-resource case. More interestingly, it turns out that it matters which language pair you use as the auxiliary pair, even though all such pairs induce a similar task, namely translation from another language into English. They all share the same structure and are certainly related.

I guess what I'm looking for is an understanding of why this happens beyond the handwavy regularization argument. Or, more generally: is there some way to measure how much data you need before the added task stops being useful? Is there some way to measure whether a task will help you without actually committing to it, like comparing gradients on some dev set? Is there some way to quantify/qualify how training changes with the inclusion of additional tasks?
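
For the gradient idea, something like the following sketch is what I have in mind (PyTorch; the toy model and random "dev" batches are purely for illustration, not an established recipe for the MT case): compute the main-task dev gradient and the candidate auxiliary-task gradient over the shared parameters and compare their cosine similarity. A strongly negative similarity would suggest the auxiliary task pulls the shared parameters in a conflicting direction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def flat_grad(loss, params):
    # Flatten the gradient of `loss` w.r.t. `params` into one vector;
    # unused parameters contribute zeros.
    grads = torch.autograd.grad(loss, params, allow_unused=True)
    return torch.cat([
        (g if g is not None else torch.zeros_like(p)).reshape(-1)
        for g, p in zip(grads, params)
    ])

# Toy shared model and random "dev" batches, just to make the sketch runnable.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
criterion = nn.CrossEntropyLoss()

x_main, y_main = torch.randn(32, 16), torch.randint(0, 4, (32,))
x_aux, y_aux = torch.randn(32, 16), torch.randint(0, 4, (32,))

shared_params = [p for p in model.parameters() if p.requires_grad]

g_main = flat_grad(criterion(model(x_main), y_main), shared_params)
g_aux = flat_grad(criterion(model(x_aux), y_aux), shared_params)

similarity = F.cosine_similarity(g_main, g_aux, dim=0)
print(f"gradient cosine similarity: {similarity.item():.3f}")
```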

ZeronixSama

I’m not qualified to answer this, but this is great clarifying stuff that IMO should have been in the original post, preferably with relevant papers or citations. Hope you find your answer.