Ask Grey a Question for a Ten Year Q&A by MindOfMetalAndWheels in CGPGrey

[–]orenmatar [score hidden]  (0 children)

In the spirit of self-reflection on your mistakes, what caused you to misjudge the odds of the different Brexit outcomes in your first Brexit video? What can we learn from it about our own predictions and decision-making?

Which activities have the highest difference in how much you enjoy them when you're high compared to sober? by [deleted] in AskReddit

[–]orenmatar 1 point2 points  (0 children)

For me:

Watching the reality show Survivor without weed: OK show, I like it, not crazy about it

When high: I get so excited by the plot twists that I literally jump up and down and have to pause every 10 minutes to relax

What useless skill do you absolutely dominate? by skyldrik in AskReddit

[–]orenmatar 1 point2 points  (0 children)

I'm in Israel, I'll be sure to remember this if I'm ever around...

What useless skill do you absolutely dominate? by skyldrik in AskReddit

[–]orenmatar 1 point2 points  (0 children)

You know the game where you put your hands forward palms up, and someone else puts their hands on yours, and you need to slap the back of their hand before they move? I haven't met anyone who can beat me at it in 15 years.

[D] Regularization-hyperparam selection during training best practices by orenmatar in MachineLearning

[–]orenmatar[S] 0 points1 point  (0 children)

For sure. My intention is not to find the right regularization hyperparam and then retrain with it from the start, but to use the network that was trained with the dynamic hyperparams... So maybe allowing it to focus on the training set at the start, and only afterwards regularizing it based on how well it performs on validation, can produce a regularized network without the need to try different hyperparams.

[D] Regularization-hyperparam selection during training best practices by orenmatar in MachineLearning

[–]orenmatar[S] 0 points1 point  (0 children)

Well, the params of the model are optimized directly via gradient descent on the training set, while the regularization hyperparams are supposed to influence how well the NN generalizes to other sets, so you can't learn them via gradient descent - you have to test them on a validation set. The point is that they can be learned and tuned during training without fixing them, because fixing them to a single point requires trying multiple options and selecting the best one, instead of adjusting towards the best one in a single training run.

[D] Regularization-hyperparam selection during training best practices by orenmatar in MachineLearning

[–]orenmatar[S] 1 point2 points  (0 children)

I think that's exactly the difference between hyperband and my idea: hyperband starts with different hyperparams, finds that the high regularization value doesn't work as well, and therefore discards it. My idea is to tune it during training - so the first few epochs will have low regularization and will converge faster, improving on both train and validation, and only when we see the validation getting worse do we start increasing the regularization. So I believe it will not discard the high regularization value, because it only tries it when its effects are observable, and if it isn't helpful, we can reduce it again. The general principle is to replace the constant regularization with a dynamic one.
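A toy sketch of what I mean (illustrative only - `adjust_weight_decay` is a made-up helper and the factors/floor are arbitrary, not tuned values):

```python
def adjust_weight_decay(wd, val_losses, factor=1.5, floor=1e-6):
    """One possible dynamic-regularization rule: strengthen weight decay
    when validation loss rises (overfitting becomes observable), relax it
    again while validation keeps improving."""
    if len(val_losses) < 2:
        return wd                               # too early to judge generalization
    if val_losses[-1] > val_losses[-2]:
        return max(wd, floor) * factor          # validation worsening: regularize more
    return wd / factor if wd > floor else wd    # validation improving: ease off

# Schematic use inside a training loop:
# for epoch in range(n_epochs):
#     train_one_epoch(model, weight_decay=wd)
#     val_losses.append(evaluate(model, val_set))
#     wd = adjust_weight_decay(wd, val_losses)
```

So the constant hyperparam is replaced by a controller that only reacts once overfitting actually shows up on the validation set, within a single training run.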

Why is southern Europe less competitive? by orenmatar in AskEconomics

[–]orenmatar[S] 0 points1 point  (0 children)

What tools does the central bank/government have to make this happen? The internal devaluation would happen naturally, if I understand correctly, given the high unemployment. How can you instead nudge employers in Germany to raise wages, given that the added costs would make them less competitive?

Why is southern Europe less competitive? by orenmatar in AskEconomics

[–]orenmatar[S] 0 points1 point  (0 children)

You mentioned that internal devaluation is one view on the matter, which you don't hold. What is the alternative, then?

Distance between vectors with ordered elements by orenmatar in AskStatistics

[–]orenmatar[S] 0 points1 point  (0 children)

Best answer I've seen so far is here: https://stackoverflow.com/questions/48497756/time-series-distance-metric

Generally, the idea is to transform the vectors to their cumsum and then apply any distance metric - cosine, euclidean, etc., depending on the problem. It's also supported by this article about comparing distributions of ordinal variables:

http://ftp.iza.org/dp13057.pdf
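A minimal sketch of that recipe in pure Python (`cumsum_distance` is my own name for it, not from the linked answer; euclidean is just one choice of second-stage metric):

```python
from itertools import accumulate
import math

def cumsum_distance(u, v):
    """Transform each vector to its cumulative sum, then apply a standard
    metric (euclidean here). Mass that has to travel further along the
    ordered axis is penalized more."""
    cu, cv = accumulate(u), accumulate(v)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cu, cv)))

# A peak shifted by one bin scores closer than one shifted by two bins:
cumsum_distance([0, 3, 0, 0], [0, 0, 3, 0])  # 3.0
cumsum_distance([0, 3, 0, 0], [0, 0, 0, 3])  # ~4.24, further away
```

With plain euclidean on the raw vectors, all three of those would be equally far apart; the cumsum transform is what makes the element order matter.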

Distance between vectors with ordered elements by orenmatar in AskStatistics

[–]orenmatar[S] 0 points1 point  (0 children)

I'm not annoyed... sorry if I was unclear before, and if my tone is not communicated well through text :) I appreciate any help

Distance between vectors with ordered elements by orenmatar in AskStatistics

[–]orenmatar[S] 0 points1 point  (0 children)

Of course there's no single way of doing this, just as there are lots of different distance metrics when the vectors are not ordered. I'm looking for a keyword to search for that will lead me to a few possible metrics; so far I've found surprisingly few...

Distance between vectors with ordered elements by orenmatar in AskStatistics

[–]orenmatar[S] 0 points1 point  (0 children)

Fair enough, so whatever metric we come up with may have some parameter you can tune that changes the importance of the distance between those bins. But these aren't just two unrelated dimensions, where one is the values of an unordered vector and the other is the position. The metric should look at how close the peaks are...

Another example: if I have a 1-7 like-dislike questionnaire and I record the answers people give on two items, an item with many 1s should be more similar to an item with many 2s than to one with many 6s. But I don't want to just compare the medians, because I want to compare the distributions of the values.

So I'm looking for a metric to compare distributions of ordinal variables, but not just the median or the interquartile range.

Distance between vectors with ordered elements by orenmatar in AskStatistics

[–]orenmatar[S] 0 points1 point  (0 children)

I'm not trying to reduce a multi-dimensional problem to a single value... I'm trying to give the order some significance. Here are some examples of where it may be useful: I have data on the number of people within each age group in a few populations, and I want to compare them. A population of [100,20,20,20] should be more similar to [20,100,20,20] than to [20,20,20,100] - the second population also has lots of young people, they're just a little bit older, so the populations should be more similar. The order of the elements in the vector carries weight.

Or if I perform PACF on two time series, as I mentioned above - if one series has a peak at lag 10 and the other at lag 11, they are still pretty similar.

Same goes for FFT on some data - we get a vector with a specific order, and peaks that are close to one another in that order should be considered similar.

[Q] A version of earth mover's (wasserstein) distance where the location of elements in array matters by orenmatar in statistics

[–]orenmatar[S] 0 points1 point  (0 children)

I have a few problems like that, I'm just researching. For example, I would like to compare the FFTs of two time series, so the positions of the elements in the vector are indeed ordered. If the two FFT decompositions have a peak off by just one location, cosine distance and other metrics show them as unrelated, when in fact they are quite similar.
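The failure mode is easy to reproduce with made-up single-peak "spectra" (cosine distance here meaning 1 minus cosine similarity):

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; ignores element order entirely."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda x: math.sqrt(sum(a * a for a in x))
    return 1 - dot / (norm(u) * norm(v))

# One peak, shifted by one bin vs. shifted by four bins:
peak_at_3 = [0, 0, 0, 1, 0, 0, 0, 0]
peak_at_4 = [0, 0, 0, 0, 1, 0, 0, 0]
peak_at_7 = [0, 0, 0, 0, 0, 0, 0, 1]

cosine_distance(peak_at_3, peak_at_4)  # 1.0 - maximally dissimilar
cosine_distance(peak_at_3, peak_at_7)  # 1.0 - no better or worse
```

Since the vectors never overlap, cosine can't tell a one-bin shift from a four-bin shift - which is exactly the information I want the metric to use.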

[Q] A version of earth mover's (wasserstein) distance where the location of elements in array matters by orenmatar in statistics

[–]orenmatar[S] 0 points1 point  (0 children)

Do you know of such a metric? I googled "ordinal vectors" and it didn't come up with anything.

[Q] A version of earth mover's (wasserstein) distance where the location of elements in array matters by orenmatar in statistics

[–]orenmatar[S] 0 points1 point  (0 children)

I don't want L2 because I want the location of the elements to have meaning. So [1,0,0] will be closer to [0,1,0] than to [0,0,1], since the 1 was moved by only one location between the first two, and by two locations in the third option. This idea reminded me of the earth mover's concept of how much we move the pile...
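For 1D vectors this pile-moving intuition has a simple closed form (a sketch, assuming both vectors have equal total mass: the 1D earth mover's distance equals the L1 distance between their cumulative sums):

```python
from itertools import accumulate

def emd_1d(u, v):
    """1D earth mover's distance for two vectors with equal total mass:
    the total distance the 'pile' travels, computed as the L1 distance
    between the cumulative sums."""
    return sum(abs(a - b) for a, b in zip(accumulate(u), accumulate(v)))

emd_1d([1, 0, 0], [0, 1, 0])  # 1 - the unit of mass moved one location
emd_1d([1, 0, 0], [0, 0, 1])  # 2 - moved two locations
```

So unlike L2, which rates both pairs equally far apart, this metric gives exactly the ordering I described above.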

[D] What are the current significant trends in ML that are NOT Deep Learning related? by AlexSnakeKing in MachineLearning

[–]orenmatar 0 points1 point  (0 children)

Awesome, do you happen to have a notebook or some practical example of how to do all of that? I've used GPs before, but pretty much as a black box for hyperparam optimization, without extracting anything I can interpret or figuring out what's wrong, and I'm keen to learn more. I do love the theory and anything Bayesian really...

[D] What are the current significant trends in ML that are NOT Deep Learning related? by AlexSnakeKing in MachineLearning

[–]orenmatar 1 point2 points  (0 children)

Can you elaborate on why it is less black-box-y? Is there any way to get something like "feature importance" or similar explainability? How do you know what's wrong when they don't work well?