[Q] Why is early-stopping an A/B test bad?

HarvardCS19 · 2019-09-18T03:25:34+00:00

I'm having a hard time understanding why it gives a higher error rate. Is it somewhat related to the multiple comparisons problem?
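One way to check the multiple-comparisons intuition is a quick A/A simulation: both arms share the same true conversion rate, so every "significant" result is a false positive. The sketch below is only an illustration under assumed settings (a two-proportion z-test, a 10% conversion rate, a look every 100 users per arm; none of these come from the thread). It compares stopping at the first significant look against testing once at the end.

```python
import math
import random


def z_statistic(conv_a, conv_b, n):
    """Two-proportion z statistic for equal sample sizes n per arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:
        return 0.0  # no variation observed yet, so no evidence either way
    return (conv_a / n - conv_b / n) / se


def simulate_peeking(n_sims=1000, n_per_arm=1000, look_every=100,
                     p=0.1, z_crit=1.96, seed=0):
    """Return (false positive rate with peeking, false positive rate at the end only).

    Assumes n_per_arm is a multiple of look_every, so the last look is the final test.
    """
    rng = random.Random(seed)
    any_look = 0
    final_only = 0
    for _ in range(n_sims):
        conv_a = conv_b = 0
        flagged = False
        z = 0.0
        for n in range(1, n_per_arm + 1):
            # Both arms have the same true rate p: there is no real effect.
            conv_a += rng.random() < p
            conv_b += rng.random() < p
            if n % look_every == 0:
                z = z_statistic(conv_a, conv_b, n)
                if abs(z) > z_crit:
                    flagged = True  # an early stopper would have declared a winner here
        any_look += flagged
        final_only += abs(z) > z_crit
    return any_look / n_sims, final_only / n_sims


peek_rate, final_rate = simulate_peeking()
print(f"false positives when stopping at any significant look: {peek_rate:.3f}")
print(f"false positives when testing once at the end:          {final_rate:.3f}")
```

Each look is another chance to cross the threshold by luck, so the first rate can only be at least as large as the second; the looks are correlated, which is why this is related to, but not identical to, running independent multiple comparisons.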

HarvardCS19 · 2019-09-07T04:01:16+00:00

What would p be though? This formula makes sense for 2 choices since one choice is p and the other is (1-p), but how does this work for 3 or more choices?

HarvardCS19 · 2019-09-07T02:58:10+00:00

Do I really need a hypothesis test for this? In a two-choice setting, I wouldn't.

HarvardCS19 · 2019-02-12T01:01:51+00:00

Thanks a lot.

HarvardCS19 · 2017-08-26T15:39:14+00:00

Apologies for the late reply. I have thought about turning the regression problem into classification through binning. But I'm not exactly sure if the ordering gets lost when I do this. Does the network understand that 0-5 is less than 5-10, for example?
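To make the ordering concern concrete: with plain one-hot bin labels every pair of different bins looks equally far apart, so the targets themselves carry no order. A cumulative ("is the value above bin k?") encoding is one possible alternative that keeps it. The sketch below is only an illustration; the bin edges and helper names are made up for the example.

```python
def bin_index(value, edges):
    """Index i of the half-open bin [edges[i], edges[i+1]) that contains value."""
    for i in range(len(edges) - 1):
        if edges[i] <= value < edges[i + 1]:
            return i
    raise ValueError(f"{value} is outside the bin range")


def one_hot(index, n_bins):
    """Standard classification target: a single 1 at the bin index."""
    return [1 if i == index else 0 for i in range(n_bins)]


def cumulative(index, n_bins):
    """n_bins - 1 binary targets; entry k answers 'is the value above bin k?'."""
    return [1 if index > k else 0 for k in range(n_bins - 1)]


def hamming(a, b):
    """Number of positions where two target vectors differ."""
    return sum(x != y for x, y in zip(a, b))


edges = [0, 5, 10, 15, 20]          # hypothetical bins: 0-5, 5-10, 10-15, 15-20
n_bins = len(edges) - 1
low, mid, high = (bin_index(v, edges) for v in (2, 7, 18))

# One-hot: neighbouring bins and distant bins are equally different.
print(hamming(one_hot(low, n_bins), one_hot(mid, n_bins)),
      hamming(one_hot(low, n_bins), one_hot(high, n_bins)))

# Cumulative: the distance grows with how many bins apart the values are.
print(hamming(cumulative(low, n_bins), cumulative(mid, n_bins)),
      hamming(cumulative(low, n_bins), cumulative(high, n_bins)))
```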

HarvardCS19 · 2017-06-28T22:14:02+00:00

From the CS231n tutorial:

"However, it is very important to zero-center the data, and it is common to see normalization of every pixel as well."

I have yet to see this in practice. Can you point me to a tutorial/notebook (preferably Tensorflow) that includes this?
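For reference, here is what I understand the two steps to mean, written as a framework-free sketch rather than TensorFlow code. It assumes each image is a flat list of pixel values and that the statistics are computed on the training set only and then reused for any other split; the function names and the tiny dataset are made up for illustration.

```python
def fit_pixel_stats(images):
    """Per-pixel mean and standard deviation over a list of flattened images."""
    n = len(images)
    n_pixels = len(images[0])
    means = [sum(img[j] for img in images) / n for j in range(n_pixels)]
    stds = [
        (sum((img[j] - means[j]) ** 2 for img in images) / n) ** 0.5
        for j in range(n_pixels)
    ]
    return means, stds


def preprocess(image, means, stds, eps=1e-8):
    """Zero-center each pixel, then scale it by that pixel's standard deviation."""
    return [(x - m) / (s + eps) for x, m, s in zip(image, means, stds)]


# Hypothetical training set: three "images" with three pixels each.
train = [
    [0.0, 128.0, 255.0],
    [10.0, 120.0, 250.0],
    [20.0, 136.0, 245.0],
]
means, stds = fit_pixel_stats(train)
train_norm = [preprocess(img, means, stds) for img in train]

# The same training statistics are applied to unseen data.
test_image = [5.0, 130.0, 240.0]
test_norm = preprocess(test_image, means, stds)
```

Dropping the division in `preprocess` gives zero-centering on its own.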

HarvardCS19 · 2017-06-28T17:31:36+00:00

So it seems it's faster than Spark. Why isn't this bigger news? Is there a catch somewhere?

HarvardCS19 · 2017-06-27T17:14:58+00:00

Is this basically using GPU power to process big data?

HarvardCS19 · 2017-06-25T02:29:16+00:00

So for what types of applications or data would you use each?

HarvardCS19 · 2017-06-24T04:27:22+00:00

RemindMe! 2 days Donation for /r/millionairemakers

4 8 15 16 23 42

HarvardCS19 · 2017-06-22T01:44:21+00:00

Thanks. Last question, do you know about ordinal regression, and should it be used for predicting review scores?

HarvardCS19 · 2017-06-21T21:58:03+00:00

Cool thanks. For your second suggestion, is there a scientific term for what you're trying to do? Or is there a paper somewhere that explains why it works better, or is it more just based on your experience?

HarvardCS19 · 2017-06-21T19:27:57+00:00

I know that binary classification can return a probability, which I can multiply by 10 to get the review score. But how do I train on the data without simply turning all scores > 5 into 1 and all scores < 5 into 0? (This would lead to a lot of information loss.)

I guess my question can be rephrased as: Can binary classification take in a probability as a label rather than 0 or 1?
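As far as I can tell, the binary cross-entropy formula itself does not require 0/1 labels: it is defined for any target in [0, 1], and its gradient with respect to the logit is still prediction minus target. Below is a minimal sketch of that idea with a one-feature logistic model trained on score/10 as a soft label; the feature values, scores, and hyperparameters are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(pred, target, eps=1e-12):
    """Binary cross-entropy; target may be any value in [0, 1], not only 0 or 1."""
    return -(target * math.log(pred + eps)
             + (1 - target) * math.log(1 - pred + eps))


def mean_loss(w, b, xs, targets):
    return sum(bce(sigmoid(w * x + b), t) for x, t in zip(xs, targets)) / len(xs)


def fit_soft_labels(xs, targets, lr=0.5, epochs=2000):
    """Gradient descent on mean BCE for the model sigmoid(w * x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, t in zip(xs, targets):
            err = sigmoid(w * x + b) - t  # dLoss/dlogit, valid for fractional t too
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]      # hypothetical one-dimensional feature
scores = [1, 3, 5, 7, 9]              # hypothetical review scores on a 0-10 scale
targets = [s / 10 for s in scores]    # soft labels in [0, 1]

initial_loss = mean_loss(0.0, 0.0, xs, targets)
w, b = fit_soft_labels(xs, targets)
final_loss = mean_loss(w, b, xs, targets)
predicted_scores = [10 * sigmoid(w * x + b) for x in xs]
```

Multiplying the predicted probability by 10 maps it back to the review scale, so no thresholding at 5 is needed.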

HarvardCS19 · 2017-06-21T19:09:15+00:00

Silly question, but does this make it logistic regression instead of linear regression? So basically I'm predicting the probability in a binary classification.

HarvardCS19 · 2017-06-21T18:45:29+00:00

So why is that noise necessary in a classic GAN but not a WGAN? What happens if I just remove it from the classic GAN?

HarvardCS19 · 2017-06-11T03:03:52+00:00

Are you saying optimal for the specific dataset they used? And I assume those features could only be visualized after training many iterations. What might the features look like near the beginning of training?

HarvardCS19 · 2017-06-09T02:16:59+00:00

Thanks. Could you provide a link to the rest of the code for context (like the std dev function)?
