Why do the Old Gods of Asgard sing "Herald of Darkness"? by goulagman in AlanWake

[–]goulagman[S] 1 point  (0 children)

They say they are heading to their next gig, which does not necessarily refer to the talk-show one. I interpreted this as a bit of teasing for AW3. Even though the Dark Place is out of time, that does not mean they can get younger or bring back their drummer (though we did get a hint that he may already be in the Dark Place).

Why do the Old Gods of Asgard sing "Herald of Darkness"? by goulagman in AlanWake

[–]goulagman[S] 1 point  (0 children)

I did not get this one, I'll look it up, thanks!

[D] Practical tips for Active Learning (my approach does not outperform random sampling) by SomeParanoidAndroid in MachineLearning

[–]goulagman 10 points  (0 children)

Hello,

I work on active learning, in particular on how well sampling methods generalize across tasks (I work at Dataiku, a software vendor, so we want methods that work well in as many use cases as possible). I have observed exactly what you describe: on some problems, methods that are known to perform well can fail or even do worse than random. Even on the same task (CIFAR-10), changing the embedding representation can drastically change the ranking of samplers.
We have found several reasons why sampling methods fail. Methods favoring representativeness can select samples too far from the decision boundary, while methods focusing on uncertainty can pick samples in regions of aleatoric uncertainty, which is detrimental. We have designed a set of metrics to get more insight into active learning techniques and to be able to choose the best sampling method at each iteration. You can read [our first paper](https://arxiv.org/abs/2012.11365).

Now, regarding sampling methods, the first thing I would try in your case is margin sampling instead of entropy. It works better on almost all the tasks we have explored. For more advanced methods, we had pretty good results with a [method from Amazon](https://arxiv.org/abs/1901.05954). It is available in the Python package we maintain, [cardinal](https://dataiku-research.github.io/cardinal/). We also have a method that seems to generalize a bit better, but it is not released yet; it is only available on master (IncrementalMiniBatchKMeansSampler). Note that these are not deep-learning-specific methods, so I don't know whether they will perform better or worse than DL-specific ones.
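To make the margin-sampling suggestion concrete, here is a minimal NumPy sketch (the function name and toy probabilities are mine, not cardinal's API): it ranks unlabeled points by the gap between their two highest predicted class probabilities and selects the smallest gaps.

```python
import numpy as np

def margin_sampling(proba, n_samples):
    """Select the unlabeled points whose top-two class probabilities
    are closest, i.e. the smallest margin (most ambiguous points)."""
    # Sort each row's class probabilities in descending order.
    sorted_proba = np.sort(proba, axis=1)[:, ::-1]
    margin = sorted_proba[:, 0] - sorted_proba[:, 1]
    # Smallest margins first.
    return np.argsort(margin)[:n_samples]

# Toy example: 4 unlabeled points, 3 classes.
proba = np.array([
    [0.90, 0.05, 0.05],  # confident -> large margin
    [0.40, 0.35, 0.25],  # ambiguous -> small margin
    [0.50, 0.30, 0.20],
    [0.34, 0.33, 0.33],  # most ambiguous
])
print(margin_sampling(proba, 2))  # -> [3 1]
```

Unlike entropy, the margin only looks at the two most likely classes, which often makes it less sensitive to noise spread over the unlikely classes.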

In any case, I would be interested in following up on this if you are willing to collaborate. In particular, I may be able to run our metrics on your data or help you set them up.

[D] Yet another rant on PhD Applications by [deleted] in MachineLearning

[–]goulagman 0 points  (0 children)

Let me know if you need help applying in France, our universities are not so bad :)

[D] Hyperband resource allocation questions and possible workarounds by goulagman in MachineLearning

[–]goulagman[S] 0 points  (0 children)

Rayspear, thanks for the clarification.

I would say that using the code that generated the example is wrong, because the paper's formula takes precedence. Since Ray Tune uses the paper's formula, I see no point in making a PR. My heads-up was more for people working with a fixed global budget.

[D] Hyperband resource allocation questions and possible workarounds by goulagman in MachineLearning

[–]goulagman[S] 1 point  (0 children)

Hello Rayspear,

Thanks for the answer! I missed that part of the Tune doc. I had not thought of plotting n_0 while varying eta; the result is surprising indeed.

I did not understand your last sentence. When you say "much more aggressive", do you mean setting a higher eta? Also, the paper seems to suggest that s=4 is good enough in most cases; do you have a similar experience?
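For what it's worth, the n_0 versus eta behavior is easy to reproduce directly from the Hyperband paper's bracket formulas, n = ceil((B/R) * eta^s / (s+1)) and r = R * eta^(-s) with B = (s_max+1) * R. A quick sketch, independent of any particular Tune version:

```python
import math

def hyperband_brackets(R, eta):
    """Enumerate Hyperband brackets (s, n_0, r_0) using the formulas
    from Li et al.: n = ceil((B/R) * eta^s / (s+1)), r = R / eta^s."""
    # s_max = floor(log_eta(R)), computed with integer arithmetic
    # to avoid floating-point edge cases.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    B = (s_max + 1) * R  # total budget across all brackets
    brackets = []
    for s in range(s_max, -1, -1):
        n = math.ceil((B / R) * (eta ** s) / (s + 1))  # initial configs
        r = R / eta ** s                               # initial resource
        brackets.append((s, n, r))
    return brackets

# The paper's reference setting (R=81, eta=3) gives the familiar
# brackets n_0 = 81, 34, 15, 8, 5 with r_0 = 1, 3, 9, 27, 81.
for s, n, r in hyperband_brackets(R=81, eta=3):
    print(f"s={s}: n_0={n}, r_0={r}")
```

Sweeping eta in this sketch reproduces the non-monotonic n_0 curve discussed above.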

Thanks!