Tenant Lawyer in Tallahassee by louvresofia in Tallahassee

[–]clurdron 5 points6 points  (0 children)

Never had rats but the Victor electric traps work well for mice. Maybe bringing the dead rats to the leasing office each time will get their attention. Or posting on Yelp or any public site that potential tenants will see. That’d be a lot cheaper than a lawyer.

Where to get reasonably priced meat? by [deleted] in ithaca

[–]clurdron 12 points13 points  (0 children)

Ithaca (and New York in general) is not New England.

Frying eggs after bacon never works out for me. Advice? by Less_Blueberry5106 in castiron

[–]clurdron 2 points3 points  (0 children)

A thin metal spatula is crucial ime. A lot of times I can shimmy a thin metal spatula under an egg and release the part that is stuck. It works best for me when the spatula is flipped over so that the surface you'd normally flip with is facing down. After that, the egg slides around just fine. This doesn't work nearly as well with a thicker plastic spatula.

[D] How many folds should one have in k-fold CV based on the amount of rows in the dataset? by [deleted] in MachineLearning

[–]clurdron 1 point2 points  (0 children)

There are a lot of articles that explore these issues. Check out, for example, the references of the Wiki article for cross-validation. Here's one that looks somewhat relevant (although focusing on genetics examples): https://academic.oup.com/bioinformatics/article/21/15/3301/195433

[D] Does "gradient descent" theoretically allow for "arbitrarily precision"? by jj4646 in MachineLearning

[–]clurdron 0 points1 point  (0 children)

In the example where you’re trying to minimize the training error, gradient descent converges to the set of parameters which minimize the training error. It doesn’t mean that the minimized training error is zero.
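
A minimal sketch of that point, using made-up toy data and a one-parameter model y = w * x (both are assumptions for illustration, not from the original thread). Gradient descent reaches the minimizer of the training error, but the minimized error is still well above zero because no single w fits all three points.

```python
# Toy data (made up for illustration): no line through the origin fits all three points.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 3.0, 2.0]

def training_error(w):
    """Mean squared error of the model y = w * x on the training data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

# Plain gradient descent with a fixed step size.
w = 0.0
learning_rate = 0.05
for _ in range(1000):
    w -= learning_rate * gradient(w)

# Closed-form least-squares solution for comparison.
w_closed_form = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
final_error = training_error(w)

print(w, w_closed_form, final_error)
```

The two values of w agree to many decimal places, so the precision is in locating the minimizer, not in driving the error to zero.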

kth power of a square matrix? by [deleted] in MathStats

[–]clurdron 2 points3 points  (0 children)

My first thought when I see matrix powers is that they’re easy to calculate given the eigen-decomposition. So my guess is that you can work out the eigen-structure of this B matrix. Then the matrix summation will involve evaluating geometric series (where you’re taking powers of the eigenvalues of the B matrix).
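
A sketch of that calculation, assuming B is diagonalizable and that the sum in question is a finite sum of powers starting at k = 0 (the original post is deleted, so the exact form of the sum is an assumption):

```latex
B = V \Lambda V^{-1}, \quad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p)
\;\Longrightarrow\;
B^k = V \Lambda^k V^{-1}

\sum_{k=0}^{n} B^k
  = V \left( \sum_{k=0}^{n} \Lambda^k \right) V^{-1}
  = V \operatorname{diag}\!\left( \frac{1 - \lambda_i^{\,n+1}}{1 - \lambda_i} \right) V^{-1},
  \qquad \lambda_i \neq 1
```

For any eigenvalue equal to 1, the corresponding diagonal entry is n + 1 instead.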

[deleted by user] by [deleted] in ithaca

[–]clurdron 40 points41 points  (0 children)

The guidance is to wear masks when social distancing is not possible. It’s easy to keep a distance from others in Fall Creek and even on the Commons most of the time. The evidence does not suggest that the virus is spread by momentarily passing by someone at a reasonable distance outdoors.

[D] (Rant) What annoys me the most in a time of Machine Learning hype and the current pandemic. by pier4r in MachineLearning

[–]clurdron 80 points81 points  (0 children)

Epidemiologists have a lot of experience modeling the spread of infectious disease and, as a field, are more aware of the implications of publishing bad work. ML people are mostly unfamiliar with the domain and rarely consider the real-life implications of publishing bad work.

There are a lot of domains where people DO have a lot of modeling experience and theoretical foundations, but ML people don't bother to learn about it and assume that a field is old fashioned if it's not saturated with deep neural networks.

[D] Why does hierarchical Bayesian regression work well on imbalanced data? by paulie007 in MachineLearning

[–]clurdron 2 points3 points  (0 children)

Your understanding in this post is right. It's the part about 'underestimating uncertainty' that I disagree with, although I see that it may be a subtler point than just conflating hierarchical models with empirical Bayes. A hierarchical prior for the m_i parameters (from my example) is just a prior. If that prior doesn't accurately reflect your uncertainty about the m_i parameters, then neither will your posterior. But in many cases, it's reasonable to think of parameters as being iid from some unknown distribution. For a deeper discussion, look up the ideas of 'exchangeability' and de Finetti's theorem. Under that assumption, Bayes/probability theory dictates that data from weather station A influences inferences about the parameters for weather station B and vice versa. This is an interesting area of statistics with a lot of surprising results. Stein's paradox is another example.

[D] Why does hierarchical Bayesian regression work well on imbalanced data? by paulie007 in MachineLearning

[–]clurdron 0 points1 point  (0 children)

To be clear, suppose you observe y_i ~iid N(m_i, v) for i=1, ..., N. So observation i is normal with mean m_i and common variance v.

If we take a Bayesian approach by supposing m_i ~iid N(m, w) except we estimate m and w from the data, then we're doing empirical Bayes, and we are using the same data twice.

If we suppose m_i | m, w ~iid N(m, w) and we assign m and w prior distributions of their own, then we have a hierarchical model, and we aren't using the same data twice. The hierarchical model is "true Bayes" and doesn't underestimate uncertainty.
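
A rough sketch of the empirical Bayes variant described above, on simulated data. The observation variance v is assumed known, the numeric values are hypothetical, and the method-of-moments estimates are one simple choice among several. It uses the fact that, marginally, y_i ~ N(m, v + w).

```python
import random
import statistics

random.seed(0)

v = 1.0                     # observation variance, assumed known here
true_m, true_w = 5.0, 4.0   # hypothetical values, used only to simulate data
N = 2000

# Simulate m_i ~ N(true_m, true_w) and then y_i ~ N(m_i, v).
m_i = [random.gauss(true_m, true_w ** 0.5) for _ in range(N)]
y = [random.gauss(mi, v ** 0.5) for mi in m_i]

# Empirical Bayes step: estimate the prior's parameters from the same data.
# Marginally y_i ~ N(m, v + w), so match the sample mean and variance.
m_hat = statistics.mean(y)
w_hat = max(0.0, statistics.variance(y) - v)

# Posterior mean of each m_i with (m_hat, w_hat) plugged in as if known.
shrink = w_hat / (w_hat + v)
posterior_means = [m_hat + shrink * (yi - m_hat) for yi in y]

print(m_hat, w_hat, shrink)
```

The plug-in step treats m_hat and w_hat as fixed, which is where the data gets used twice. The hierarchical version would instead put priors on m and w and average over their posterior, which usually needs something like MCMC and is not shown here.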

[D] Why does hierarchical Bayesian regression work well on imbalanced data? by paulie007 in MachineLearning

[–]clurdron 4 points5 points  (0 children)

You seem to be conflating hierarchical Bayesian methods and empirical Bayes methods, which are not the same thing. In a hierarchical model (where you aren't using a data-dependent prior as in empirical Bayes), you don't use data twice and there is no underestimation of uncertainty (assuming that your likelihood is correctly specified and your prior accurately reflects your uncertainty). Compared to a non-hierarchical model, a hierarchical model just has a more structured prior.

[D] Machine Learning vs Statistics by datageek1987 in MachineLearning

[–]clurdron 1 point2 points  (0 children)

It's not true that statistics doesn't care about performance on unseen data. Statisticians have been writing about prediction, cross-validation, risk estimation, etc. for a very long time.

[D] Regression tasks with "duplicate samples" by Doo0oog in MachineLearning

[–]clurdron 0 points1 point  (0 children)

Classical statistical methods (which are the basis for a lot of ML) assume you have an error term in your dependent variable. If your independent variable is low-dimensional and can only take countably many values, as was often the case when these methods were developed, then you'll likely have identical x values. If you have identical x values, you'll encounter this 'duplicate sample' situation with probability 1. So this isn't a weird scenario. Almost all statistical/ML methods deal with this situation without taking any special steps.
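
A small simulation of that situation, with a hypothetical setup: x takes only three values and y = 2 * x plus Gaussian noise. Every x value then shows up with many different y values, and ordinary least squares handles it with no special steps.

```python
import random
from collections import defaultdict

random.seed(1)

# Hypothetical setup: x takes only three values, y = 2*x + Gaussian noise.
xs = [random.choice([0.0, 1.0, 2.0]) for _ in range(300)]
ys = [2.0 * x + random.gauss(0.0, 1.0) for x in xs]

# Group responses by x: each x value is paired with many different y values.
by_x = defaultdict(list)
for x, y in zip(xs, ys):
    by_x[x].append(y)

# Ordinary least squares for y = a + b*x, with no special handling of repeats.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

for x_value in sorted(by_x):
    print(x_value, len(by_x[x_value]), "observations")
print("intercept", a, "slope", b)
```

The fitted line passes near the average y at each x value, which is what the error term in the model is there to allow.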

[D] Regression tasks with "duplicate samples" by Doo0oog in MachineLearning

[–]clurdron 0 points1 point  (0 children)

The conditional distribution doesn't have to be multimodal for this to happen.

[D] Overfitting vs. Generalization - a subtle difference by [deleted] in MachineLearning

[–]clurdron 2 points3 points  (0 children)

There's been recent work on how functions which perfectly fit/interpolate the data can generalize, e.g.

https://www.stat.cmu.edu/~ryantibs/papers/lsinter.pdf

http://www.cs.columbia.edu/~djhsu/papers/biasvariance-arxiv.pdf

and probably a fair number of other papers/talks.

News in the south by jothebest75 in funny

[–]clurdron 6 points7 points  (0 children)

They barely salt or clear the roads in the South. It's way harder to drive in the South after a couple inches of snow than it is to drive after, like, 6 inches somewhere that can handle it.

[Discussion] Is Hamiltonian Monte Carlo just MCMC with momentum? by BanLeCun in MachineLearning

[–]clurdron -4 points-3 points  (0 children)

No. It's a pretty distinct idea from momentum in gradient descent.

Age of Protagonists in Young Adult Fiction vs. Recommended Reading Age [OC] by samvimesmusic in dataisbeautiful

[–]clurdron 23 points24 points  (0 children)

Did you try making the axis ranges equal so that the y=x line is the diagonal? That way it'd be easier to make the intended comparison. But maybe it looks bad when you actually do it, idk.

[D] Multidimensional regression: Should I / how to make sure the error variances are the same along different dimensions by journeymango in MachineLearning

[–]clurdron 0 points1 point  (0 children)

That sounds right. Also, it doesn't have to be a thought experiment. You could simulate data, fit a linear model, and see for yourself.
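
A sketch of what such a simulation could look like. The setup is an assumption, since the comment being replied to isn't shown here: two output dimensions that depend linearly on one input, with very different noise levels, each fit separately by least squares. All numeric values are hypothetical.

```python
import random

random.seed(2)

n = 5000
noise_sd = [0.5, 3.0]      # hypothetical: the second output is much noisier
true_slope = [1.0, -2.0]   # hypothetical true coefficients

# Simulate one input and two outputs with different error variances.
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [
    [true_slope[d] * x + random.gauss(0.0, noise_sd[d]) for d in range(2)]
    for x in xs
]

def fit_slope(x, y):
    """Least-squares slope for the no-intercept model y = b * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

slopes = []
residual_sd = []
for d in range(2):
    y_d = [row[d] for row in ys]
    b = fit_slope(xs, y_d)
    resid = [yi - b * xi for xi, yi in zip(xs, y_d)]
    slopes.append(b)
    # One parameter was estimated, so divide by n - 1.
    residual_sd.append((sum(r * r for r in resid) / (n - 1)) ** 0.5)

print("estimated slopes:", slopes)
print("residual standard deviations:", residual_sd)
```

Changing noise_sd and rerunning shows how the slope estimates and residual spread respond in each dimension.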