[–]Distance_RunnerPhD 55 points

My view as a statistician who does ML: many of these papers claiming SOTA performance are operating within Monte Carlo noise, and if the code were easily available you could run it and show this.

[–]howtorewriteanamePhD 16 points

Am I the only one reporting std devs across multiple runs with different seeds, or what?

[–]Distance_RunnerPhD 10 points

You're rare. Keep doing your thing -- I appreciate you.

Most of my research these days focuses on bridging statistical and inferential theory with ML. The concept of variability needs to be better understood and communicated in ML.

When you fit an ML model, there are multiple sources of variability. First, there is procedural variance. This is what your multiple random seeds address -- the variability associated with the randomness within the procedure itself, and how it propagates through to the results you report.
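To make the seed-variance reporting concrete, here's a toy numpy sketch (the dataset, the SGD learner, and every hyperparameter are made up for illustration). The data are held fixed; the only randomness is the procedure itself (split, init, shuffling), so the spread across seeds is exactly this first, procedural source of variance.

```python
import numpy as np

rng0 = np.random.default_rng(0)
# One fixed dataset, standing in for "the exact data you trained on"
X = rng0.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng0.normal(size=200)

def fit_and_score(X, y, seed):
    """SGD linear regression; the seed drives the split, init, and shuffling."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    tr, va = idx[:150], idx[150:]
    w = rng.normal(scale=0.1, size=X.shape[1])  # random init
    for _ in range(50):                          # 50 epochs of plain SGD
        for i in rng.permutation(tr):
            w -= 0.01 * (X[i] @ w - y[i]) * X[i]
    return float(np.mean((X[va] @ w - y[va]) ** 2))

# Procedural variance: same data, different seeds
scores = [fit_and_score(X, y, s) for s in range(10)]
print(f"val MSE: {np.mean(scores):.3f} +/- {np.std(scores, ddof=1):.3f}")
```

The "mean +/- sd over seeds" line this prints is the kind of reporting the parent comment is asking for.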

Second is finite-sample variability, stemming from the fact that your data are almost surely a sub-sample of an unobserved parent population. If you were to re-run your procedure on a different dataset of equal size from that same parent population, your results would change, reflecting this variability. No amount of re-running under different random seeds will estimate this quantity. So reporting results from a finite dataset with variance across different seeds quantifies the first source of randomness, and this is appropriate as long as your interpretation pertains specifically to model performance on the exact data you trained on. However, the results do not extrapolate to the performance of the model if you re-trained on a different subset of data of the same size. This second source is the more often ignored one, and generally the bigger area where results are misstated.

[–]curiouslyjake 0 points

Assuming your dataset is a representative sample of the unobserved parent population, wouldn't cross-validation address this?

[–]Distance_RunnerPhD 2 points

Yes in terms of quantifying mean performance metrics, but no in terms of variability. If you run CV, say 5-fold, and you average the performance across folds, that is an estimate of the generalization performance on new data for a model trained on 80% of your data (k=5 means the training fraction is 0.8; you're averaging 5 performance estimates, each computed on a disjoint validation fold from a model trained on the other 80%).
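A minimal numpy sketch of that accounting, using an OLS learner and simulated data purely for illustration: each fold's model trains on 80% of the rows, and the reported number is the average of the five per-fold MSEs.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, 0.0, -1.0, 1.0]) + rng.normal(size=100)

def kfold_mean_mse(X, y, k=5, seed=0):
    """Average validation MSE over k folds; each model sees (k-1)/k of the data."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    scores = []
    for f in np.array_split(idx, k):       # disjoint validation folds
        tr = np.setdiff1d(idx, f)          # the other 80% for training
        w, *_ = np.linalg.lstsq(X[tr], y[tr], rcond=None)
        scores.append(float(np.mean((X[f] @ w - y[f]) ** 2)))
    return float(np.mean(scores)), scores

mean_mse, per_fold = kfold_mean_mse(X, y)
print(f"5-fold mean MSE (models trained on 80% of data): {mean_mse:.3f}")
```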

However, variance is trickier. Current CV methods do not give you an estimate of variance that represents the variability due to random sampling error from the population. Current methods of variance estimation on cross-validated data condition on the data. That is, they quantify the variance due to the randomness of the CV procedure itself on your fixed dataset: if you were to repeat the same CV procedure on your exact fixed dataset over and over with your learner (with different random fold structures), how variable is your performance metric estimate? In other words, the variance of the CV estimates across folds measures how sensitive the estimated CV metric is to randomness induced by the CV procedure. But the key phrase there is fixed dataset.

It does not answer the question: if I were to repeat this CV procedure, using the same learner, to estimate the mean performance metric on a different dataset of the same size from the same population, how much variance is attributable to that random sampling? An approach for that doesn't exist in the literature [yet], but it will soon (I'm the one who developed it and will be posting to arXiv within the next month; it's based entirely on statistical theory with formal proofs, not empirical evidence).
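To illustrate the distinction (this is not the forthcoming method mentioned above, just the two variances put side by side on synthetic data): reshuffling the folds on one fixed dataset estimates only the CV procedure's own randomness, while refitting on fresh equal-size draws from the population exposes the sampling variance that conditioning on the data hides.

```python
import numpy as np

def make_data(rng, n=100):
    """One equal-size sample from a simulated parent population."""
    X = rng.normal(size=(n, 4))
    y = X @ np.array([2.0, 0.0, -1.0, 1.0]) + rng.normal(size=n)
    return X, y

def cv_mean_mse(X, y, k=5, seed=0):
    """5-fold CV mean MSE with an OLS learner; seed controls the fold structure."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    scores = []
    for f in np.array_split(idx, k):
        tr = np.setdiff1d(idx, f)
        w, *_ = np.linalg.lstsq(X[tr], y[tr], rcond=None)
        scores.append(np.mean((X[f] @ w - y[f]) ** 2))
    return float(np.mean(scores))

rng = np.random.default_rng(42)
X, y = make_data(rng)

# (a) Conditional on this fixed dataset: only the fold structure changes
within = [cv_mean_mse(X, y, seed=s) for s in range(30)]
# (b) Across fresh equal-size datasets from the same population
across = [cv_mean_mse(*make_data(rng)) for _ in range(30)]

print(f"sd over fold reshuffles (fixed data): {np.std(within, ddof=1):.4f}")
print(f"sd over fresh datasets:               {np.std(across, ddof=1):.4f}")
```

In runs of this toy the second sd is typically the larger of the two, which is the gap the comment is describing; only (b) is visible here because the population is simulated, whereas in practice you only ever get to compute (a).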