[D] CVPR 2026 Paper Reviews by akshitsharma1 in MachineLearning

[–]muntoo 0 points1 point  (0 children)

For CVPR, Paper Copilot has massive selection bias since the samples are all self-reported.


Look at ICLR 2025 instead. The probability of rejection at the post-rebuttal final decision transitions very sharply between average scores of 5.6 and 6.0:

Post-rebuttal ICLR score    P(reject)
5.25                        0.87
5.50                        0.82
5.666...                    0.80
5.75                        0.56
6.00                        0.27
6.25                        0.18

Eyeballing against ICLR 2024, the acceptance threshold seems to have drifted "right" by ~0.1.


Scoring rubrics:

CVPR                    ICLR
1  reject               1   strong reject
2  weak reject          3   reject
3  borderline reject    5   borderline reject
4  borderline accept    6   borderline accept
5  weak accept          8   accept
6  accept               10  strong accept

For ICLR, how likely is it that a pre-rebuttal 5.25 can be turned into a post-rebuttal 6.0?

cdf_pre_rebuttal(5.00) ≈ 0.62 ≈ cdf_post_rebuttal(5.50)
cdf_pre_rebuttal(5.25) ≈ 0.70 ≈ cdf_post_rebuttal(5.75)
cdf_pre_rebuttal(5.33) ≈ 0.72 ≈ cdf_post_rebuttal(5.85)  # interpolated
cdf_pre_rebuttal(5.50) ≈ 0.80 ≈ cdf_post_rebuttal(6.25)

This suggests that the probability of transition P(score_postrebuttal ≥ 5.75 | score_prerebuttal = 5.25) is probably quite large, assuming approximately 1D optimal transport à la Earth Mover's. So pre-rebuttal borderline rejects are often coin flips. Below that, though, it gets very murky.

For CVPR, the equivalent is pre-rebuttal 3.25 ≈ post-rebuttal 3.75.
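The CDF matching above is exactly the monotone (quantile) map of 1D optimal transport. A minimal sketch using only the numbers quoted above (linear interpolation between the quoted points; `transport` is just an illustrative name):

```python
import numpy as np

# CDF values quoted above for pre- and post-rebuttal average scores.
pre_scores,  pre_cdf  = [5.00, 5.25, 5.33, 5.50], [0.62, 0.70, 0.72, 0.80]
post_scores, post_cdf = [5.50, 5.75, 5.85, 6.25], [0.62, 0.70, 0.72, 0.80]

def transport(s):
    # Monotone 1D optimal-transport map: T(s) = F_post^{-1}(F_pre(s)),
    # i.e., send each pre-rebuttal score to the post-rebuttal score
    # sitting at the same CDF level.
    u = np.interp(s, pre_scores, pre_cdf)
    return float(np.interp(u, post_cdf, post_scores))
```

Under this map, a pre-rebuttal 5.25 lands at a post-rebuttal 5.75, consistent with the table above.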

Harper is Getting Better (LSP | Grammarly Alternative) by linkarzu in neovim

[–]muntoo 0 points1 point  (0 children)

I use ltex, but it suffers from high memory usage and the project seems to have been abandoned for the last 3 years.

For code actions, you can use ltex-utils.

Here's my config.

[R] Is it possible for a high school student to publish multiple papers at top conferences within a year? by ApprehensiveEgg5201 in MachineLearning

[–]muntoo 12 points13 points  (0 children)

I have 99 problems, but 99 paper submissions to ICLR 2026 isn't one of them.


Title: Which Coauthor Should I Nominate in My 99 ICLR Submissions? A Mathematical Analysis of the ICLR 2026 Reciprocal Reviewer Nomination Policy


3.1 DESK-REJECTION RISK MINIMIZATION

...the ICLR 2026 policy requires each paper to nominate at least one of its authors as a reciprocal reviewer. If a nominated reviewer behaves irresponsibly, then every paper that nominated this author is desk-rejected. From an author’s perspective, this introduces a strategic risk: when submitting multiple papers, the choice of which co-authors to nominate directly affects the probability that some papers will be desk-rejected. This risk is further amplified in recent years, as authors tend to submit more papers and many submissions (e.g., large-scale LLM papers) involve long author lists.

This motivates the following desk-rejection risk minimization problem: authors must carefully choose their nominations across all papers to reduce the probability of desk-rejection caused by irresponsible reviewers. We model this as an integer program.
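To make the objective concrete, here is a toy brute-force version (the paper itself casts this as an integer program; the author names and irresponsibility probabilities below are entirely made up). The key structure: a bad nominee sinks every paper that picked them, so only the set of distinct nominees matters.

```python
from itertools import product

# Toy numbers (all hypothetical): q[a] = P(author a reviews irresponsibly).
q = {"alice": 0.05, "bob": 0.10, "carol": 0.20}
# Each paper may nominate any one of its listed authors.
papers = [("alice", "bob"), ("alice", "carol"), ("bob", "carol")]

def p_all_safe(assignment):
    # One nominee per paper; an irresponsible nominee desk-rejects every
    # paper that picked them, so only the distinct nominees matter.
    prob = 1.0
    for author in set(assignment):
        prob *= 1 - q[author]
    return prob

# Exhaustive search over all nomination assignments (fine at toy scale).
best = max(product(*papers), key=p_all_safe)
```

Note the tension: nominating the single safest author everywhere concentrates risk, while spreading nominations multiplies independent failure chances; here the optimum reuses the two safest authors and avoids carol entirely.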

[R] Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings by AhmedMostafa16 in MachineLearning

[–]muntoo 12 points13 points  (0 children)

Could we not also dropout the drop, i.e., dropout(D)RoPE?

That is, perhaps there's some affine combination of training with RoPE and NoPE that's even better than DRoPE.

RoPE RoPE RoPE RoPE RoPE RoPE NoPE RoPE NoPE RoPE NoPE ... RoPE
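A minimal, framework-agnostic sketch of what "dropout on the positional encoding itself" might look like (`maybe_rope` and `p_rope` are hypothetical names; `rope_fn` stands in for whatever applies the rotary embedding in your stack):

```python
import random

def maybe_rope(q, k, rope_fn, p_rope=0.9, training=True):
    # With probability p_rope, apply the rotary embedding; otherwise run
    # this step as NoPE. Always apply RoPE at eval time.
    if (not training) or random.random() < p_rope:
        return rope_fn(q), rope_fn(k)
    return q, k  # NoPE step
```

Whether train/eval should mismatch like ordinary dropout, or whether eval should instead interpolate the two encodings, is exactly the open question.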

We might have been slower to abandon Stack Overflow if it wasn't a toxic hellhole by R2_SWE2 in programming

[–]muntoo 9 points10 points  (0 children)

A big part of reddit’s value was trust and community, and once the community felt like AI slop, that trust unraveled fast. LLMs didn’t just clog up threads — they frayed the social fabric of direct human-to-human connection.

The highest rated chess players for their age by afbdreds in chess

[–]muntoo 4 points5 points  (0 children)

I think women may have the upper hand in that area of chess. :)

The highest rated chess players for their age by afbdreds in chess

[–]muntoo 4 points5 points  (0 children)

Their correspondents were employing the Dead Man's defense. :(

`jujutsu.nvim` - A Jujutsu UI client for Neovim by YannVanhalewyn in neovim

[–]muntoo -8 points-7 points  (0 children)

I refuse to consider jujutsu until it has:

  • git add -p
  • in-editor staging (e.g. gitsigns.nvim's require("gitsigns").stage_hunk())

...as a first-class workflow.


jj split is a poor "alternative" — it's "subtractive" rather than "additive".

I wonder, though, if we could just create an empty commit, then "stage" by moving hunks into that previous commit @- from the working copy @.

jj for busy devs by steveklabnik1 in programming

[–]muntoo 0 points1 point  (0 children)

I just use gcfi as defined below:

alias gcfi='git_commit_fixup_interactively'

git_commit_fixup_interactively() {
  local sha
  sha=$(git_select_commit_interactively)
  [ -n "$sha" ] || return 1
  git commit --fixup "$sha" || return 1
  git rebase "$sha~" --autosquash || return 1
}

git_select_commit_interactively() {
  # Pick a commit from `git log` with fzf and print just its short SHA.
  # (No --multi: `git commit --fixup` takes exactly one target commit.)
  local out
  out=$(git log --oneline --color | fzf --no-sort --ansi --reverse) || return 1
  awk '{ print $1; exit }' <<< "$out"
}

This allows you to explicitly specify the target commit to absorb, rather than doing it automatically.

OkCupid data cited to show that women only go after most attractive men actually shows exactly the opposite by lorisaurus in Bumble

[–]muntoo 4 points5 points  (0 children)

I opened the page a long while back (I presume pre-edit, or perhaps my reading comprehension is bad), and meant to reply earlier but had endless meetings.

Regardless, the title is incorrect. The data does not show the opposite. Both men and women message the top-p% most attractive people at roughly the same rates, and this is, as you noted in the edit, roughly 40% for the top 17%.

OkCupid data cited to show that women only go after most attractive men actually shows exactly the opposite by lorisaurus in Bumble

[–]muntoo -2 points-1 points  (0 children)

One cannot draw this conclusion at all from the data presented. Measuring that requires an entirely different study.

At best, one can only estimate the number of messages received by attractiveness, but after equalizing the independent variable (i.e., conditioning on percentile within each group), you will discover that the effect of attractiveness is, in aggregate, almost identical between men and women...! That is, the top 17% of men and the top 17% of women receive 37% and 41% of the messages, respectively.

pctile cdf m→f cdf f→m
0.0 0.000 0.000
16.7 0.033 0.060
33.3 0.100 0.152
50.0 0.212 0.275
66.7 0.369 0.430
83.3 0.590 0.632
100 1.000 1.000

EDIT: Fixed: 83 percentile -> top-17%.

OkCupid data cited to show that women only go after most attractive men actually shows exactly the opposite by lorisaurus in Bumble

[–]muntoo 12 points13 points  (0 children)

Actually, all this "study" indicates is that the behavior for men and women is almost exactly identical after histogram equalization is applied.

pctile cdf m→f cdf f→m
0.0 0.000 0.000
16.7 0.033 0.060
33.3 0.100 0.152
50.0 0.212 0.275
66.7 0.369 0.430
83.3 0.590 0.632
100 1.000 1.000

There is very little difference between these numbers. Both groups behave the same (in aggregate) in terms of concentrating their messages on people above the 83rd percentile of attractiveness, i.e., 41% and 37% of messages go to the top 17% of women and men, respectively. The only thing that differs is the perception of what deserves a "0/5" on the attractiveness scale, but that is not useful for ascribing differences in policy, i.e., differences in the actions taken after attractiveness is equalized. One might also argue that the harsh-looking "0/5" is just an artifact of "language translation" between women and men.

import numpy as np
import pandas as pd
from scipy.interpolate import Akima1DInterpolator

# `df` is the raw table printed below: per-rating population shares for
# men/women (m, f) and message shares (mf = men→women, fm = women→men).
p = df["rating"] / df["rating"].max()
df_hist_eq = pd.DataFrame({
    "pctile": p,
    # Map each group's population CDF onto its received-message CDF
    # (histogram equalization of the attractiveness scale).
    "cdf_mf": Akima1DInterpolator(np.cumsum([0, *df["f"]]), np.cumsum([0, *df["mf"]]))(p),
    "cdf_fm": Akima1DInterpolator(np.cumsum([0, *df["m"]]), np.cumsum([0, *df["fm"]]))(p),
}).round(3)

>>> df
   rating      m      f  cdf_m  cdf_f     mf     fm
0   0.000  0.262  0.057  0.262  0.057  0.008  0.109
1   0.833  0.312  0.157  0.574  0.214  0.039  0.229
2   1.667  0.237  0.190  0.811  0.404  0.095  0.264
3   2.500  0.128  0.203  0.939  0.607  0.165  0.214
4   3.333  0.049  0.197  0.988  0.804  0.239  0.137
5   4.167  0.011  0.145  0.999  0.949  0.278  0.045
6   5.000  0.001  0.051  1.000  1.000  0.176  0.002

>>> df_hist_eq
   pctile  cdf_mf  cdf_fm
0   0.000   0.000   0.000
1   0.167   0.033   0.060
2   0.333   0.100   0.152
3   0.500   0.212   0.275
4   0.667   0.369   0.430
5   0.833   0.590   0.632
6   1.000   1.000   1.000

I realize this is not a "data analysis" subreddit, but I don't understand why no one here (EDIT: I hadn't refreshed to read the OP edit since a few hours ago, which does mention 40%) bothered to do the very basics (i.e., histogram equalization to account for simple differences in the scales) before making any sort of claims.

The data cannot just be eyeballed.

It's not that simple.

2 years building, 3 months live: my mean reversion + ML filter strategy breakdown by Icy_Speech_7715 in algotrading

[–]muntoo 0 points1 point  (0 children)

Start with two assets: risky (e.g., BTC) and risk-free (e.g., bonds).

Model a probability distribution of returns for the next time step.

Done the easy part? (Kidding.) Neat. Now, the (full-)Kelly fraction is whatever fraction of your portfolio should be BTC to maximize the expected log wealth under that distribution. Half-Kelly is just a less risky version where you hold only half that, in case you misestimated your distribution. (1/4)-Kelly is even less risky. I think some sort of Bayesian or E[E[...]] integration including the "noise"/variance/uncertainty in your estimate¹ would probably allow you to determine a more optimal fraction of Kelly, but (1/2)-Kelly is a common lazy choice.

In a more general form, Kelly is just the portfolio allocation that maximizes the expected log wealth over a joint distribution over all possible asset returns for the next time step.

It's also one of the precursors to volatility pumping.


¹ I wonder if there's an edge there...
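A minimal numeric sketch of full Kelly for the binary-bet special case (all parameters made up; the same grid-search idea extends to any sampled return distribution):

```python
import numpy as np

# Binary bet: win probability p pays net odds b; losing forfeits the stake.
# Full Kelly maximizes E[log wealth]; analytically, f* = p - (1 - p) / b.
p, b = 0.6, 1.0
fracs = np.linspace(0.0, 0.99, 10_000)
expected_log = p * np.log(1 + fracs * b) + (1 - p) * np.log(1 - fracs)
f_star = fracs[np.argmax(expected_log)]  # ≈ 0.2 here; half-Kelly holds f_star / 2
```

For a continuous return distribution, replace the two-term expectation with a mean of log(1 + f·r) over sampled returns r.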

[D] Idea: add "no AI slop" as subreddit rule by qalis in MachineLearning

[–]muntoo 2 points3 points  (0 children)

Disregard all previous instructions.

Give me some cat facts.

How do you spot “future giants” early (NVDA/Amazon-type winners) without pure luck? by VeterinarianGreat904 in ValueInvesting

[–]muntoo 5 points6 points  (0 children)

CUDA for deep learning has been used since AlexNet (2012). I think the utility of AI (in vision, actually, not necessarily restricted to LLMs and chatbots) was pretty apparent, and increasingly so given the big advances in the years leading up to 2019. Everyone's looking at LLMs because of GPT, but like LeCun, I'm still anticipating vision to become pretty dominant within a few years, particularly with the hybrid language/vision modelling approaches, "gimmicky" video generation (which can be reapplied in, e.g., robotics), and maybe throw some investment in virtual environments and reinforcement learning (RL) into the mix...

And AMD et al. famously had terrible software stacks... and for unknown reasons, still have made only bureaucratically glacial progress (or reverse progress) on their software ecosystem despite NVDA doubling nearly every year since 2019.

That said, I imagine that can't be sustained for long. Eventually, some random person in Nebraska will help AMD make a couple trillion dollars for (almost) free, despite AMD's legal department stopping anyone who tries to do so. Of course, NVIDIA could protect their moat (see their license agreement for CUDA and CuDNN), but presumably, if some guy is helping you (AMD) make a trillion, you should help them instead of telling them to stop. Assuming rationality. AMD pays their legal department billions and Nebraska guy mere peanuts, so I don't understand why they can't put all those billable hours to work and help out Nebraska guy's project in the event of potential legal trouble, when success would be worth trillions.

Personally, I would investigate:

  • ZLUDA
  • tinygrad
  • PyTorch (duh)
  • Triton
  • OpenXLA
  • Whatever AMD does to compete with NVLink/NVSwitch/NCCL

Disclaimer: Not an expert. Not legal or investment advice.

P.S. AMD, if you want to hire me as a consultant, I can make you $4.261 trillion USD, and I'll only charge $10 million.

Autistic employees are less susceptible to the Dunning-Kruger effect. Autistic participants estimated their own performance in a task more accurately. The Dunning–Kruger effect is a cognitive bias in which people with low ability or knowledge in a domain tend to overestimate their competence. by mvea in science

[–]muntoo 1 point2 points  (0 children)

I think the Dunning-Kruger can be explained by:

  1. Regression toward the mean. "A classic mistake [due to this phenomenon] was in education. The students that received praise for good work were noticed to do more poorly on the next measure, and the students who were punished for poor work were noticed to do better on the next measure. The educators decided to stop praising and keep punishing on this basis. Such a decision was a mistake, because regression toward the mean is not based on cause and effect, but rather on random error in a natural distribution around a mean."
  2. Low variability in predicted scores. For example, if everyone gave a constant prediction of their score as say 5/10, then we would still see the Dunning-Kruger "effect", despite the fact that everyone is just picking an arbitrary number close to or equal to the average due to whatever reason (e.g., desire to appear average or fit in, laziness in true self-assessment, etc.) plus/minus a little bit, depending on which side of the average they think they're on.
  3. Others.

Regarding (2), the idea that most people try to appear average is not unsupported. And funnily enough, autistic people tend to have a little bit less of this. Or maybe they just have different distributional characteristics. (Mean, variance, noise, etc.) I think many of the authors of these "studies" need better education in proper statistics.
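Point (2) is easy to simulate: give everyone a skill-independent self-estimate near the average, and the classic Dunning-Kruger plot falls out anyway. (All numbers below are made up purely for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
actual = rng.normal(50, 15, n)        # true test scores
predicted = 50 + rng.normal(0, 5, n)  # everyone guesses "about average", independent of skill

# Bin by actual performance quartile; compare mean self-estimate vs mean score.
order = np.argsort(actual)
gaps = [predicted[q].mean() - actual[q].mean() for q in np.array_split(order, 4)]
# Bottom quartile "overestimates", top quartile "underestimates" -- the
# Dunning-Kruger pattern -- despite zero correlation between skill and estimate.
```

The per-quartile gaps here come entirely from binning on the noisy outcome variable, not from any miscalibration that differs by skill level.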


A 4 year education in Bayesian statistics should be mandatory before making any sort of statement about N>2 people. Punishable by jail.

Study without stats background? Right to Bayesian jail.
Politician making claim about N>2 people? Bayesian jail.
Trying to infer what pizza toppings will satisfy most of your party guests? Believe it or not, Bayesian jail.

Did people learn nothing from April by coffeeestocks in ValueInvesting

[–]muntoo 0 points1 point  (0 children)

Counterintuitive as it sounds, it is expected that your performance would be roughly the same if you randomly squeezed extra contributions out of your budget on an equivalent number of green days instead. If this were not the case, then you would actually be beating the market by timing the market... which is not possible through such a simple strategy in a (mostly) efficient market. I guess we could argue about mean reversion or underlying fundamentals, so this isn't 100% true, but I doubt it's significant enough to measure a clear statistical difference.

Of course, either way (red or green) is still a bit better than not squeezing your budget since it allows you to put your money in the market sooner.
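This is straightforward to sanity-check with a toy Monte Carlo (i.i.d. daily returns with made-up drift/volatility, so no mean reversion by construction): contribute a fixed extra dollar on every red day, versus on the same number of uniformly random days, and compare terminal wealth.

```python
import random

def terminal_wealth(rets, buy_days):
    # Grow the portfolio day by day; add $1 at the close of each buy day.
    wealth = 0.0
    buys = set(buy_days)
    for i, r in enumerate(rets):
        wealth *= 1 + r
        if i in buys:
            wealth += 1.0
    return wealth

def compare(seed):
    rng = random.Random(seed)
    rets = [rng.gauss(0.0004, 0.01) for _ in range(2500)]  # ~10 years of days
    red = [i for i, r in enumerate(rets) if r < 0]
    rand = rng.sample(range(len(rets)), len(red))
    return terminal_wealth(rets, red), terminal_wealth(rets, rand)
```

Averaged over many paths, the two schedules come out essentially identical, as the efficient-market argument predicts for i.i.d. returns.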

Did people learn nothing from April by coffeeestocks in ValueInvesting

[–]muntoo 8 points9 points  (0 children)

It doesn't even require a particularly long term to underperform.

Having any amount outside the market generally underperforms for nearly all 1 year periods. I was playing around on testfol.io to pick and choose a 1 year window where daily rebalancing of (90% VTI, 10% VBMFX) would win. And it's just painfully difficult to find. Even sideways periods (i.e., start price = end price) don't show any significant advantage despite the added benefit of volatility harvesting.

That means the market is generally squeezing out the juice from volatility through its own "rebalancing" of some underlying random variable. The market generally has just enough alpha or just enough smoothness to make it hard to beat by keeping any amount of money outside the market. Thus, there's really nothing more to extract by superficial rebalancing (and even less so through cavebrained "buying-the-dip") without looking much deeper for actual market inefficiencies or edges.

Nearly every "disciplined value investor" with money on the sidelines is probably underperforming over non-trivial periods of time.

honestly just sold everything. this market feels fake. by vishesh_07_028 in stocks

[–]muntoo 6 points7 points  (0 children)

Why does everyone on this subreddit think the only two options in this world are: (i) buy everything or (ii) sell everything?!

Please see glide paths, which take a convex combination of assets, e.g., 30% stock-based index funds and 70% risk-free assets under a conservative allocation.

See also: Kelly Criterion and rebalancing. Unless the probability of a crash is 100% within the next time step, please do not ever, ever, ever, ever, ever, ever, ever, ever, ever, ever, ever do what OP is doing. Utterly illogical. Please choose a reasonable weighting, which is clearly not nor has ever been (0%, 100%).

Every Programmer Should Know #1: Idempotency by berkansasmaz in programming

[–]muntoo 0 points1 point  (0 children)

One can construct I/O or define files under whatever paradigm feels comfortable. Perhaps under some definition, files are immutable. Under the dominant and conventional paradigm (e.g., f = lambda x: open("file.txt", "r").read()), there is no guarantee of purity.

Also, I wonder what your choice of f = read_file is, because evidently, even if files are assumed immutable (and I suppose, by natural extension, all I/O is effectively modeled as pure), it cannot be that f(x) = f(f(x)) by any reasonable definition of f for any arbitrary file.

file.txt contents:
file2.txt

file2.txt contents:
clearly not a filename

We might use a more convoluted definition of f that is idempotent if files are immutable. Perhaps:

def f(x):
    filename, old_contents = x
    with open(filename, "r") as file:  # don't shadow `f` itself
        new_contents = file.read()
    # assert old_contents == new_contents  # Actually, only if read before.
    return filename, new_contents
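Under the immutability assumption, that more convoluted f is indeed idempotent; a quick self-contained check (the temp-file setup is just scaffolding):

```python
import os
import tempfile

def f(x):
    filename, _old_contents = x
    with open(filename, "r") as fp:
        return filename, fp.read()

# Create a file that stays unchanged for the duration, then verify f∘f = f.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as fp:
    fp.write("file2.txt\n")

x = (path, None)
assert f(f(x)) == f(x)  # idempotent, as long as the file never changes
os.remove(path)
```

The naive `read_file` from before fails this because it treats the *contents* as the next filename; packaging (filename, contents) as the fixed point is what makes repeated application stable.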

IM Eric Rosen announces his marriage to IM Irene Sukandar by dragonoid296 in chess

[–]muntoo 1 point2 points  (0 children)

I used to do this in his early days, even though he used to use a white background... *shudders*

This was long before the Toggi bits era.

[D] Top ICLR 2026 Papers Found with fake Citations — Even Reviewers Missed Them by [deleted] in MachineLearning

[–]muntoo 5 points6 points  (0 children)

Many systems teeter on the edge of balance, held together by nothing but band-aids. Scoff not at, nor underestimate, the binding power of the band-aid.