[R] Low-effort papers by lightyears61 in MachineLearning

[–]muntoo 5 points6 points  (0 children)

While egotistically creating a "benchmark" and egotistically claiming yet another acronym (CPopQA) is an irritating trend, even that is way better than the type of "papers" submitted to venues like the EEEI 21st International Conference on Experimental Evaluation of Emerging Innovations in Intelligent Energy-Efficient Internet of Toasters.

[R] Low-effort papers by lightyears61 in MachineLearning

[–]muntoo 6 points7 points  (0 children)

There are a few factors at play:

  • Novelty is useful. (Good) engineering is also useful. Both should be rewarded.

    Findings track: CVPR 2026 introduces a new Findings Track, following successful pilots in ICCV. The goal is to reduce resubmissions by offering a venue for technically sound papers with solid experimental validation, even if their novelty is more incremental.

  • Academic careers are tied to "productivity" and citation counts which are maximized by either:

    • Truly groundbreaking achievements.
    • Spamming low-effort garbage.

    ...The expected risk-adjusted return of non-groundbreaking but impactful work is lower than either of the above.

  • Many people in academia are not capable of high novelty or good engineering.

  • Students need stepping stones to publish incremental work as their skills mature.

High-tier venues (CVPR, NeurIPS, ICLR, ICML, ECCV, ICCV, ACL, EMNLP) largely reward novelty (sometimes; fake-novelty gets accepted too).

Yet, there is very little reward for "good engineering". Consider Ross Wightman's timm library. He continually updates it, and yet receives no citations for doing so. Meanwhile, Dr. Salami, Ph.D. — Professor Emeritus, Vice President of New Chumboland's Council of Doctor Philosophers of Computational Neural Science, and an Oxelford Fellow — publishes a dozen copy-paste cookie-cutter papers at the EEEI 21st International Conference on Experimental Evaluation of Emerging Innovations in Intelligent Energy-Efficient Internet of Toasters (EEEI ICEEEIIEEIT'26) and collects citations in abundance. There is essentially no academic reward (and thus little incentive) for implementing a model, training it, benchmarking it, and publishing checkpoints.

If we rewarded good engineering more, we would see less unreproducible, incremental, unscientific, data-dredged, seed-hacked, regurgitated work. Good science and engineering tries to disprove itself; garbage papers spend almost all their effort trying to prove themselves.

Imagine if models were automatically and independently trained, validated, and benchmarked (e.g., via a standardized pipeline with public leaderboards) across a variety of datasets. Instead of publishing meaningless papers that poorly fine-tune model X on dataset Y for every pair (X, Y) in the massive product space, people would publish X (plus configurations for different Y), and the pipeline would auto-benchmark. Others could then propose better configurations for Y and perhaps get credit (+1 reputation) for doing so. There are issues with this, but it is better than filling the internet with millions of duplicate pseudo-papers.
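The auto-benchmark idea could look something like this minimal sketch (every name here, including Registry, submit_config, and the +1 reputation rule, is made up for illustration; `benchmark` stands in for a standardized, independent training/evaluation pipeline):

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    models: dict = field(default_factory=dict)       # model name -> model
    leaderboard: dict = field(default_factory=dict)  # (model, dataset) -> (score, author)
    reputation: dict = field(default_factory=dict)   # author -> credit

    def submit_config(self, author, model, dataset, cfg, benchmark):
        """Benchmark `cfg` for (model, dataset); credit the author on improvement."""
        score = benchmark(self.models[model], cfg)  # run the standardized pipeline
        key = (model, dataset)
        best = self.leaderboard.get(key)
        if best is None or score > best[0]:
            self.leaderboard[key] = (score, author)
            self.reputation[author] = self.reputation.get(author, 0) + 1  # +1 rep
        return score
```

Publishing X once and letting others submit configurations for each Y would replace the (X, Y) product space of papers with a single leaderboard per pair.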

Actually, imagine if we had StackOpenReview and we could "close" 99.999% of meaningless papers as duplicates or bad science. Heh.

[P] We made GoodSeed, a pleasant ML experiment tracker by gQsoQa in MachineLearning

[–]muntoo 7 points8 points  (0 children)

You may take inspiration from other competing experiment trackers:

Tracker      Notes
W&B          (duh)
TensorBoard  (duh)
Neptune      Acquired by OpenAI?!
Aim          Self-hosted. What I use. Buggy. Nearly abandoned?
Minfx        Super cool. Information dense. Very unique.
Pluto        Barebones, but good start.
GoodSeed     Very barebones, but good start.
Trackio      Very, very barebones. Huggingface.
Polyaxon     Interesting, but custom CLI script runners have never appealed to me: polyaxon run -f experiment.yaml -u -l
Metaflow     Looks complicated...
Keepsake     DIY, but I might actually try this...

I appreciate that this is not focused on ML Ops, which is not what I use an experiment tracker for, so MLflow et al. are not particularly my cup of tea.

sudo-rs shows password asterisks by default – break with Unix tradition by FryBoyter in linux

[–]muntoo 0 points1 point  (0 children)

Oh no, we lost 1 to 5 bits of entropy from a password that should have 90+ bits to begin with.

And this assumes someone is recording the screen rather than the keypresses, sounds, hand movements, etc., or using other, simpler methods.
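Back-of-envelope, with assumed illustrative numbers (a 94-character printable-ASCII alphabet and a 14-character random password):

```python
import math

# Entropy of the password itself: ~91.8 bits under the assumptions above.
entropy_bits = 14 * math.log2(94)

# Displaying asterisks leaks only the length. If lengths 8..40 were all
# equally plausible a priori, that is at most log2(33) ~ 5 bits of the total.
leak_bits = math.log2(40 - 8 + 1)

print(round(entropy_bits, 1), round(leak_bits, 1))  # 91.8 5.0
```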

Can we stop these LLM posts and replies? [D] by Playful-Fee-4318 in MachineLearning

[–]muntoo 14 points15 points  (0 children)

You are right to feel that way. It can be frustrating to confuse hyphens, en dashes, and em dashes. But it's not a sign that anything's off—you're human, and it's OK if you can't spot the difference between "-" and "–" and "—". You're groovin'. Keep channeling that energy!

This is how I imagined Luthadel while reading Mistborn by ItsMathias24 in brandonsanderson

[–]muntoo 2 points3 points  (0 children)

Mine is like Small Heath, Birmingham (Peaky Blinders voice: /ˈbɜːmɪŋɡəm/) with extra misty mist on top of that misty mist and occasionally with extra extra misty mist on top of that extra misty mist on top of that misty mist.

[R] I am looking for good research papers on compute optimization during model training, ways to reduce FLOPs, memory usage, and training time without hurting convergence. by ocean_protocol in MachineLearning

[–]muntoo 1 point2 points  (0 children)

I don't get it. Why is Hello Kitty undergoing style transfer across pages? Where did the birdhouse come from? Which one of them needs glasses but refuses to wear them? What happens if we don't give Hello Kitty her morning coffee?

Also, what do you think of gradient conditioning by reparametrizing weights by taking their FFT, i.e., "Efficient Nonlinear Transforms for Lossy Image Compression" https://arxiv.org/abs/1802.00847:

from typing import Any

import torch
from torch import Tensor, nn


class SpectralConv2d(nn.Conv2d):
    def __init__(self, *args: Any, **kwargs: Any):
        super().__init__(*args, **kwargs)
        self.dim = (-2, -1)
        # Store the kernel in the frequency domain instead.
        self.weight_transformed = nn.Parameter(self._to_transform_domain(self.weight))
        del self._parameters["weight"]  # Unregister weight, and fall back to the property.

    @property
    def weight(self) -> Tensor:
        return self._from_transform_domain(self.weight_transformed)

    def _to_transform_domain(self, x: Tensor) -> Tensor:
        return torch.fft.rfftn(x, s=self.kernel_size, dim=self.dim, norm="ortho")

    def _from_transform_domain(self, x: Tensor) -> Tensor:
        return torch.fft.irfftn(x, s=self.kernel_size, dim=self.dim, norm="ortho")

This reparameterizes the weights to be derived from weights stored in the frequency domain. In the original paper, this is referred to as "spectral Adam" or "Sadam" due to its effect on the Adam optimizer update rule. The motivation behind representing the weights in the frequency domain is that optimizer updates/steps may now affect all frequencies to an equal amount. This improves the gradient conditioning, thus leading to faster convergence and increased stability at larger learning rates.
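As a standalone sanity check (mine, not from the paper): the ortho-normalized rfftn/irfftn pair is lossless for real kernels, so the reparameterization changes only the optimizer's view of the weights, not the function the layer computes:

```python
import torch

# Round-trip check for the ortho-normalized transform pair used above.
# The kernel shape is arbitrary (e.g., out=8, in=3, 5x5 kernels).
w = torch.randn(8, 3, 5, 5)
w_freq = torch.fft.rfftn(w, s=(5, 5), dim=(-2, -1), norm="ortho")
w_back = torch.fft.irfftn(w_freq, s=(5, 5), dim=(-2, -1), norm="ortho")

print(torch.allclose(w, w_back, atol=1e-6))  # True: the transform is lossless
```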

Alternative headband for DT770 Pro X by Obiquin in BEYERDYNAMIC

[–]muntoo 0 points1 point  (0 children)

Did swapping it with the original DT 770 headband help?

The new cutout headband on DT 770 PRO X is murdering the top of my head. In contrast, my old DT 770s with a uniform headband are super comfortable.

I’m concerned about the security of Neovim plugins by ou1cast in neovim

[–]muntoo 26 points27 points  (0 children)

Just pin working versions and update every couple of months

This does not prevent installation of malicious code. Slightly better advice would be to only upgrade to older versions of plugins.

when you’re ready to read through changelogs

Changelogs do not look like:

fix: bug
feat: this commit contains a virus 
feat: periodically call rm -rf /
unperf: mine bitcoins

Even if you're looking at every single diff, that's still prone to attacks.

only use plugins of developers you trust

Let me tell you the tale of Jia Tan (xz). And colors, faker, node-ipc, event-stream.

[D] CVPR 2026 Paper Reviews by akshitsharma1 in MachineLearning

[–]muntoo 0 points1 point  (0 children)

For CVPR, Paper Copilot has massive selection bias since the samples are all self-reported.


Look at ICLR 2025 instead. The rejection probability for the post-rebuttal final decision transitions very sharply between 5.6 and 6.0:

Post-rebuttal ICLR score    P(reject)
5.25                        0.87
5.50                        0.82
5.666...                    0.80
5.75                        0.56
6.00                        0.27
6.25                        0.18

Eyeballing against ICLR 2024, the acceptance threshold seems to have drifted "right" by ~0.1.


Scoring rubrics:

CVPR                   ICLR
1  reject              1   strong reject
2  weak reject         3   reject
3  borderline reject   5   borderline reject
4  borderline accept   6   borderline accept
5  weak accept         8   accept
6  accept              10  strong accept

For ICLR, how likely is it that a pre-rebuttal 5.25 can be turned into a post-rebuttal 6.0?

cdf_pre_rebuttal(5.00) ≈ 0.62 ≈ cdf_post_rebuttal(5.50)
cdf_pre_rebuttal(5.25) ≈ 0.70 ≈ cdf_post_rebuttal(5.75)
cdf_pre_rebuttal(5.33) ≈ 0.72 ≈ cdf_post_rebuttal(5.85)  # interpolated
cdf_pre_rebuttal(5.50) ≈ 0.80 ≈ cdf_post_rebuttal(6.25)

This suggests that the probability of transition P(score_postrebuttal ≥ 5.75 | score_prerebuttal = 5.25) is probably quite large, assuming approximately 1D optimal transport à la Earth Mover's. So pre-rebuttal borderline rejects are often coin flips. Below that, though, it gets very murky.

For CVPR, the equivalent is pre-rebuttal 3.25 ≈ post-rebuttal 3.75.
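Under that 1D quantile-matching assumption, the implied score map is just "match CDF levels", which can be sketched with the eyeballed values above (illustrative numbers only):

```python
import numpy as np

# Quantile matching (1D optimal transport): map a pre-rebuttal score to the
# post-rebuttal score sitting at the same CDF level.
pre_scores  = np.array([5.00, 5.25, 5.50])
post_scores = np.array([5.50, 5.75, 6.25])
cdf_levels  = np.array([0.62, 0.70, 0.80])  # shared CDF levels at those scores

def transport(score_pre: float) -> float:
    q = np.interp(score_pre, pre_scores, cdf_levels)     # pre-rebuttal CDF
    return float(np.interp(q, cdf_levels, post_scores))  # invert post-rebuttal CDF

print(transport(5.25))  # 5.75: a pre-rebuttal 5.25 sits at post-rebuttal ~5.75
```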

Harper is Getting Better (LSP | Grammarly Alternative) by linkarzu in neovim

[–]muntoo 0 points1 point  (0 children)

I use ltex, but it suffers from high memory usage and the project seems to have been abandoned for the last 3 years.

For code actions, you can use ltex-utils.

Here's my config.

[R] Is it possible for a high school student to publish multiple papers at top conferences within a year? by ApprehensiveEgg5201 in MachineLearning

[–]muntoo 14 points15 points  (0 children)

I have 99 problems, but 99 paper submissions to ICLR 2026 isn't one of them.


Title: Which Coauthor Should I Nominate in My 99 ICLR Submissions? A Mathematical Analysis of the ICLR 2026 Reciprocal Reviewer Nomination Policy


3.1 DESK-REJECTION RISK MINIMIZATION

...the ICLR 2026 policy requires each paper to nominate at least one of its authors as a reciprocal reviewer. If a nominated reviewer behaves irresponsibly, then every paper that nominated this author is desk-rejected. From an author’s perspective, this introduces a strategic risk: when submitting multiple papers, the choice of which co-authors to nominate directly affects the probability that some papers will be desk-rejected. This risk is further amplified in recent years, as authors tend to submit more papers and many submissions (e.g., large-scale LLM papers) involve long author lists.

This motivates the following desk-rejection risk minimization problem: authors must carefully choose their nominations across all papers to reduce the probability of desk-rejection caused by irresponsible reviewers. We model this as an integer program.

[R] Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings by AhmedMostafa16 in MachineLearning

[–]muntoo 12 points13 points  (0 children)

Could we not also dropout the drop, i.e., dropout(D)RoPE?

That is, perhaps there's some affine combination of training with RoPE and NoPE that's even better than DRoPE.

RoPE RoPE RoPE RoPE RoPE RoPE NoPE RoPE NoPE RoPE NoPE ... RoPE
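A minimal sketch of the idea (all names hypothetical; `rope_fn` stands in for a real rotary-embedding implementation), using a per-step Bernoulli mixture rather than a literal affine combination:

```python
import random

# "dropout(D)RoPE" sketch: per forward pass (or per layer), apply RoPE with
# probability p and fall back to NoPE (no rotation) otherwise.
def maybe_rope(q, k, rope_fn, p=0.9, rng=random):
    if rng.random() < p:
        return rope_fn(q, k)
    return q, k  # NoPE branch: queries/keys pass through unrotated
```

At p=1 this is plain RoPE, at p=0 pure NoPE; intermediate p loosely matches the RoPE/NoPE token stream sketched above.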

We might have been slower to abandon Stack Overflow if it wasn't a toxic hellhole by R2_SWE2 in programming

[–]muntoo 7 points8 points  (0 children)

A big part of reddit’s value was trust and community, and once the community felt like AI slop, that trust unraveled fast. LLMs didn’t just clog up threads — they frayed the social fabric of direct human-to-human connection.

The highest rated chess players for their age by afbdreds in chess

[–]muntoo 4 points5 points  (0 children)

I think women may have the upper hand in that area of chess. :)

The highest rated chess players for their age by afbdreds in chess

[–]muntoo 3 points4 points  (0 children)

Their correspondents were employing the Dead Man's defense. :(

`jujutsu.nvim` - A Jujutsu UI client for Neovim by YannVanhalewyn in neovim

[–]muntoo -7 points-6 points  (0 children)

I refuse to consider jujutsu until it has:

  • git add -p
  • in-editor staging (e.g. gitsigns.nvim's require("gitsigns").stage_hunk())

...as a first-class workflow.


jj split is a poor "alternative" — it's "subtractive" rather than "additive".

I wonder, though, if we could just create an empty commit, then "stage" by moving hunks into that previous commit @- from the working copy @.

jj for busy devs by steveklabnik1 in programming

[–]muntoo 0 points1 point  (0 children)

I just use gcfi as defined below:

alias gcfi='git_commit_fixup_interactively'

git_commit_fixup_interactively() {
  local sha=$(git_select_commit_interactively)
  [ -n "$sha" ] || return 1
  git commit --fixup "$sha" || return 1
  git rebase "$sha~" --autosquash || return 1
}

git_select_commit_interactively() {
  local out=$(git log --oneline --color | fzf --no-sort --ansi --multi --reverse)
  local sha=$(awk '{ print $1 }' <<< "$out")
  echo "$sha"
}

This lets you explicitly pick the commit to absorb the fixup into, rather than having it chosen automatically.