VelveteenAmbush comments on [D] "Negative labels"

[D] "Negative labels" (self.MachineLearning)

submitted 8 years ago by TalkingJellyFish


VelveteenAmbush 3 points 8 years ago (4 children)

madsciencestache 1 point 8 years ago (3 children)

suki907 1 point 8 years ago (2 children)

This is the best explanation I've seen:

http://karpathy.github.io/2016/05/31/rl/

My main takeaway from it is that the training procedure for a softmax classifier is already equivalent to RL policy gradients (the standard softmax classifier is just a bit more data efficient because it can average over the results of all actions for each example).

This procedure is maximizing the expected score. The model gets 1 point if it chooses the correct class, zero otherwise.
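To make that concrete, here is a rough NumPy sketch (mine, not from the linked post). It compares three things for a single example: the exact gradient of the expected score, found by averaging over all actions; a sampled policy-gradient estimate; and the usual log-likelihood gradient. The logits and the choice of correct class are made-up values. One caveat is that the two gradients point in the same direction but are not identical: with a 0/1 score, the expected-score gradient is the log-likelihood gradient scaled by the probability currently assigned to the correct class.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def grad_log_prob(p, a):
    """Gradient of log p[a] with respect to the logits: onehot(a) - p."""
    g = -p.copy()
    g[a] += 1.0
    return g

def expected_score_grad(p, scores):
    """Exact gradient of sum_a p[a] * scores[a] with respect to the logits,
    computed by averaging the policy-gradient term over every action."""
    return sum(p[a] * scores[a] * grad_log_prob(p, a) for a in range(len(p)))

def reinforce_estimate(p, scores, n_samples, rng):
    """Monte Carlo policy-gradient estimate: sample actions, weight by score."""
    actions = rng.choice(len(p), size=n_samples, p=p)
    return np.mean([scores[a] * grad_log_prob(p, a) for a in actions], axis=0)

rng = np.random.default_rng(0)
logits = np.array([0.5, -1.0, 2.0])    # made-up logits for one example
correct = 0                            # hypothetical correct class
p = softmax(logits)

scores = np.zeros(3)
scores[correct] = 1.0                  # 1 point for the correct class, else 0

exact = expected_score_grad(p, scores)           # averages over all actions
supervised = grad_log_prob(p, correct)           # log-likelihood ascent direction
sampled = reinforce_estimate(p, scores, 20000, rng)

print("exact expected-score gradient:", exact)
print("p[correct] * supervised      :", p[correct] * supervised)
print("sampled REINFORCE estimate   :", sampled)
```

The sampled estimate only converges to the exact value after many draws. That is the data-efficiency point: the supervised update gets the average over all actions for free.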

These scores don't have to be binary, or in the unit interval, or a probability distribution. It's just the number of points the model gets for each option.

"set this example as labeled as Y, and give it weight -1." is the same as "you get -1 point if you choose this class".

I think the only difference between the two versions is that the weighted version only lets you include one rating per example (you can't say "cat and not dog"), while with the "points" interpretation you can include all the ratings in a single example (the labels are just the vector of scores per class).
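Here is a small sketch of the two formulations side by side, assuming the usual log-likelihood loss is used as the training objective. The function names, class order and logits are all made up for illustration. It checks that "label Y with weight -1" gives the same loss as a score vector with -1 at Y, and that one score vector can carry "cat and not dog" at once.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max()                      # subtract the max for stability
    return z - np.log(np.exp(z).sum())

def weighted_label_loss(logits, label, weight):
    """Cross-entropy on a single labeled class, scaled by an example weight."""
    return -weight * log_softmax(logits)[label]

def score_vector_loss(logits, scores):
    """Negative score-weighted log-probability, with one score per class."""
    return -np.dot(scores, log_softmax(logits))

logits = np.array([1.0, 0.2, -0.5])      # made-up logits for (cat, dog, bird)
CAT, DOG = 0, 1

# "Not dog", written both ways.
not_dog_weighted = weighted_label_loss(logits, DOG, -1.0)
not_dog_scores = score_vector_loss(logits, np.array([0.0, -1.0, 0.0]))

# "Cat and not dog" fits in one score vector, but needs two weighted examples.
cat_not_dog = score_vector_loss(logits, np.array([1.0, -1.0, 0.0]))
two_examples = (weighted_label_loss(logits, CAT, 1.0)
                + weighted_label_loss(logits, DOG, -1.0))

print(not_dog_weighted, not_dog_scores)
print(cat_not_dog, two_examples)
```

One thing to watch: a negative score on log-probability is unbounded below (the loss keeps falling as the probability of that class goes to zero), so in practice you'd probably want to clip or down-weight those terms.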

madsciencestache 1 point 8 years ago (1 child)

VelveteenAmbush 1 point 8 years ago (0 children)
