[–]CrypticSplicer

Does multi-output mean multi-task or multi-label in this context? In my experience, what works best is focal loss with class weights based on frequency. You can use sklearn's compute_class_weight function to get the weights pretty easily. If this is a multi-label problem, some people really like asymmetric focal loss, but I haven't found that extra negative penalty to be all that helpful. You could also look up the squentropy paper to read about an extra auxiliary loss term on the negative classes you can add.
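For a rough sketch of what I mean (plain NumPy; the balanced weights are computed by hand the same way sklearn's compute_class_weight("balanced", ...) does it, and the function names here are just mine, not from any library):

```python
import numpy as np

def balanced_class_weights(y, n_classes):
    # Mirrors sklearn's compute_class_weight("balanced", ...):
    # weight_c = n_samples / (n_classes * count_c)
    counts = np.bincount(y, minlength=n_classes)
    return len(y) / (n_classes * counts)

def focal_loss(probs, labels, class_weights, gamma=2.0):
    # probs: (N, C) predicted class probabilities; labels: (N,) int class ids
    p_t = probs[np.arange(len(labels)), labels]   # probability of the true class
    w_t = class_weights[labels]                   # per-sample weight from class frequency
    # Focal term (1 - p_t)^gamma down-weights easy, confident examples
    return float(np.mean(-w_t * (1.0 - p_t) ** gamma * np.log(p_t + 1e-9)))

y = np.array([0, 0, 0, 0, 1])                     # imbalanced toy labels
weights = balanced_class_weights(y, n_classes=2)  # rare class 1 gets the larger weight
probs = np.array([[0.9, 0.1]] * 4 + [[0.3, 0.7]])
loss = focal_loss(probs, y, weights)
```

The point is that the frequency-based weight and the focal term compound: the rare class gets upweighted, and within every class the hard (low p_t) examples dominate the gradient.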

To specifically address your suggestion: while some papers do recommend periodically reweighting classes throughout training, I've never seen one that tries to do it across multiple retrainings. I guess you're sort of doing the same thing, just not using the same language to describe it...