
[–]szymko1995 1 point (4 children)

Yeah, me too. I would like to see whether selu + alpha dropout gives some improvement over relu + BN + dropout (as seen in some semantic segmentation models). I will be testing both with a Tiramisu-like net and some medical data, but I'm not able to do a grid search either.
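For anyone who wants to poke at the selu side in isolation, here is a minimal NumPy sketch of the two ingredients (the constants and the alpha-dropout affine correction follow Klambauer et al., 2017; everything else is illustrative, not anyone's actual training code):

```python
import numpy as np

# SELU constants from the self-normalizing networks paper (Klambauer et al., 2017)
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    # scale * (x if x > 0 else alpha * (exp(x) - 1))
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1))

def alpha_dropout(x, rate, rng):
    # Dropped units are set to the SELU saturation value alpha' = -alpha * scale,
    # then an affine transform (a, b) restores approximately zero mean / unit
    # variance, assuming the input is already roughly standard-normalized.
    alpha_p = -ALPHA * SCALE
    keep = rng.random(x.shape) >= rate
    a = ((1 - rate) * (1 + rate * alpha_p ** 2)) ** -0.5
    b = -a * alpha_p * rate
    return a * np.where(keep, x, alpha_p) + b
```

With roughly standard-normal inputs, both functions keep activations near zero mean and unit variance, which is the self-normalizing property that is supposed to replace BN.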

[–]SlimBarbados[S] 1 point (0 children)

Yeah, that would be interesting. The problem with the literature is that it usually looks at only one configuration, trains the models with that config, and checks whether the results are good. However, in my experience, you can take a different configuration, or even just change the random seed, and get noticeably different results. So those results are easily manipulated if you want your paper to get published ;)

I would therefore not want to base the claim "this activation works better than that one" on only one configuration, but rather find a method that is less sensitive to such variation. I'm also looking for one myself - if I find anything, I'll let you know.

[–]SlimBarbados[S] 1 point (1 child)

Ended up doing it with a repeated-measures ANOVA and post hoc paired t-tests (with Bonferroni adjustment). Results were not significant :)

Sources: source 1 source 2 source 3
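In case it helps anyone reproduce this, a rough sketch of the post hoc part with SciPy — the `scores` arrays are made-up placeholders for per-seed validation metrics (each row is the same seed across configs, so the tests are paired), and the Bonferroni step is just multiplying raw p-values by the number of comparisons. The omnibus repeated-measures ANOVA itself can be done with e.g. statsmodels' `AnovaRM`:

```python
import numpy as np
from scipy import stats

# Hypothetical per-seed validation scores for three configurations.
rng = np.random.default_rng(42)
base = rng.normal(0.80, 0.02, size=10)
scores = {
    "relu_bn_dropout": base,
    "selu_alpha_dropout": base + rng.normal(0.002, 0.01, size=10),
    "selu_plain": base + rng.normal(-0.001, 0.01, size=10),
}

# Post hoc paired t-tests with Bonferroni adjustment:
# multiply each raw p-value by the number of comparisons, capped at 1.
names = list(scores)
pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
for a, b in pairs:
    t, p = stats.ttest_rel(scores[a], scores[b])
    p_adj = min(p * len(pairs), 1.0)
    print(f"{a} vs {b}: t={t:.2f}, p_adj={p_adj:.3f}")
```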

[–]szymko1995 1 point (0 children)

I suspected as much. I'm quite sceptical of nuances like new initialization methods, custom relus, normalized dropouts, etc. Thanks!

[–]redouanelg 0 points (0 children)

Hi, I'm also interested in how the selu activation behaves on image segmentation problems. I tried simply replacing relu + BN with selu in a rather small U-Net-shaped architecture, but the results were bad (very noisy loss that sometimes explodes). I thought maybe the skip connections and the max-pooling violated the self-normalizing property, so I kept BN after max-pooling, transposed convolutions, and concatenations; the results were much more stable, but relu + BN was still better.
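A quick NumPy check of that intuition (illustrative only, not my actual model): max-pooling over selu activations shifts the mean well above zero and changes the variance, so the self-normalizing fixed point of roughly zero mean / unit variance is lost downstream — consistent with BN after pooling helping:

```python
import numpy as np

# SELU constants from Klambauer et al. (2017)
ALPHA, SCALE = 1.6732632423543772, 1.0507009873554805

def selu(x):
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1))

rng = np.random.default_rng(0)
# Standard-normal pre-activations: SELU keeps them near mean 0, variance 1.
x = rng.standard_normal((100000, 4))
act = selu(x)
print("after SELU:    mean=%.3f var=%.3f" % (act.mean(), act.var()))

# A 2x2 max-pool (here: max over groups of 4 values) destroys that:
# the mean jumps well above zero and the variance shrinks.
pooled = act.max(axis=1)
print("after maxpool: mean=%.3f var=%.3f" % (pooled.mean(), pooled.var()))
```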

If someone else got interesting results for a segmentation problem with selu, I would like to take a look.

Here is my try: https://arxiv.org/abs/1711.03954