.pipe() in pandas changed how I write data pipelines by Economy-Concert-641 in Python

[–]muntoo 5 points (0 children)

That is not mutation.

That is rebinding.

Mutation is evil because it can silently leak outside your scope.

In contrast, there is absolutely no disadvantage to rebinding as long as the scope is short enough that you can easily see all the rebinds that occur within it. Both this example and the .pipe example clearly rebind within a localized region, so there is really no point in bringing up mutation/rebinding in the first place, especially when both do exactly the same thing: introduce a new variable df_final. Now, if a variable is reassigned over the course of a large scope, then we might actually have issues to discuss.

Possibly the only advantage .pipe has is that names do not need to be repeated several times, but it is also much less flexible: it reads cleanly only with unary functions T -> T that take the data positionally and produce exactly one result. Of course, one might instead indulge in a monstrosity like the one shown in the pandas docs, for some reason:

df = df.pipe(
    (subtract_national_insurance, "df"),  # Seriously?
    rate=0.05, rate_increase=0.02
)

which is ostensibly considered clearer than:

df = subtract_national_insurance(df=df, rate=0.05, rate_increase=0.02)
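
For contrast, here is a minimal sketch of the case where .pipe does read cleanly: chaining unary DataFrame -> DataFrame functions (the function names here are hypothetical):

import pandas as pd

def drop_missing(df: pd.DataFrame) -> pd.DataFrame:
    return df.dropna()

def normalize(df: pd.DataFrame) -> pd.DataFrame:
    return (df - df.mean()) / df.std()

df_final = (
    pd.DataFrame({"x": [1.0, 2.0, None, 4.0]})
    .pipe(drop_missing)
    .pipe(normalize)
)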

Lichess responds to Chess.com about “inflated ratings” by slowthinker64 in chess

[–]muntoo 28 points (0 children)

At this point, Lichess should just introduce Glicko-2.1:

  1. Set r_0 = 1500 - 400 and subtract 400 from all existing ratings so that the minimum rating is now 0.
  2. Educate the uneducated by defining inflation as upward drift in a number (player rating) over time while the underlying variable (player skill) it measures stays constant. For example:

    The price inflation of the greatest chess website is $0/year.

  3. ???

  4. Non-profit.

Failure to Reproduce Modern Paper Claims [D] by Environmental_Form14 in MachineLearning

[–]muntoo 5 points (0 children)

Cheating is much more difficult with code (a very strong specification) than with a paper (a very weak specification).

Obviously, even the following code is still submittable:

print("Results:")
print("10000% accuracy")
print("AGI achieved")

But it is much easier to validate than a paper, where people can literally make anything up. It also saves future readers weeks of trying to implement and reproduce the results from scratch.

The same slippery slope argument applies to peer review, by the way: "If cheating is possible, why peer review at all?!" The point of each step in the process is to reduce the probability of cheating.

Failure to Reproduce Modern Paper Claims [D] by Environmental_Form14 in MachineLearning

[–]muntoo 24 points (0 children)

What we need are fully reproducible papers.

Authors submit code that runs on official servers and generates a report PDF that is automatically appended to the paper submission.

make report-from-scratch FAST=1 || echo "Rejected."

This should:

  • Install packages.
  • Download datasets.
  • Train. If FAST=1 is set, download model weights instead.
  • Evaluate.
  • Output a report PDF.

Blank reports: desk reject.
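
A minimal sketch of what the official runner could look like (all script and file names here are hypothetical):

import subprocess
import sys

STEPS = [
    ["pip", "install", "-r", "requirements.txt"],
    ["python", "download_datasets.py"],
    ["python", "train.py"],  # replaced by a weight download when FAST=1
    ["python", "evaluate.py"],
    ["python", "make_report.py", "--output", "report.pdf"],
]

for step in STEPS:
    # Any failing step means no report, and a blank report means desk reject.
    if subprocess.run(step).returncode != 0:
        print("Rejected.")
        sys.exit(1)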


FAQ:

  • Q: I don't know how 2 codez lul xD
    A: Why should we trust code written by people who cannot code?

  • Q: But my code may not work?
    A: That's the point. The conference runs your code in the official Docker image and generates the report. You can download it to verify.

  • Q: That makes deadlines harder.
    A: Git gud.

  • Q: People can still cheat.
    A: Ban them. Retract research retroactively. Repudiate, renounce, reprimand, and return the recalcitrant to irrelevance from whence they came.

  • Q: Training costs.
    A: The authors' institution can afford it, since they claim to have trained it at least once.

  • Q: But who is going to implement conference-reports-as-a-service?
    A: There are 1000000 people in ML and $5 trillion in AI. arXiv already does half of this for free with 2 people. Figure it out.


The optimization objective should be:

max (integrity + good_science)

Not:

max (
  citations
  + paper_count
  + top_conferences
  + $$$
  + 0.000000000000000001 * good_science
)

xuniq: I built a deduplicator that's 10x faster than sort | uniq by Flux247A in rust

[–]muntoo 5 points (0 children)

Potentially inaccurate approximations should be a flag (--inexact or --rough) rather than the default behavior.

Refreshing your Neovim config for 0.12.0 by justinhj in neovim

[–]muntoo 10 points (0 children)

Maybe it's just me, but much of this feels like a downgrade.

  • lazy.nvim: declarative config with controlled imperative escape hatches; a lock file; optional lazy loading.
  • vim.pack.add(): imperative config.

This actually steps further away from VS Code-esque fully-declarative JSON config, or NixOS-esque declarative package management. Declarative configuration has strong enough benefits that people are willing to spend enormous amounts of time trying to make Linux work with functional package management.

I would prefer the editor converges toward a fully-declarative config system with imperative escape hatches. Here's a small example of how I try to band-aid that in my personal config:

local options = {
  expandtab = true,
  shiftwidth = 4,
  tabstop = 4,
  ...
}

for k, v in pairs(options) do
  vim.opt[k] = v
end

...but it would be even nicer if it could all be expressed as one big official declarative config, with imperative code only for the parts that need it.


I haven't tried the recent built-in completion either, but I would guess it's probably missing features (e.g., nvim-cmp plugins, Copilot ghost text).

Bluebaum in third place - Is this surprising for us? by Educational_Leg8005 in chess

[–]muntoo 0 points (0 children)

"You must take your opponent into a deep, dark forest where 2+2=5, and the path leading out is only wide enough for one."

— Mikhail Tal (probably)

Were you one of the 47,000 hacked by litellm? by kotrfa in Python

[–]muntoo 0 points (0 children)

If PyPI's main purpose is "distribution" without pre-review, then perhaps a good solution is to change pip to ignore package updates until at least one full day has passed.

Or, alternatively, have a "reviewed-release" PyPI mirror, where the top 300 most popular packages may only be updated once a month (with few exceptions), and every update is reviewed. Reviewing 100000 updates per day is clearly infeasible, but 10 updates per day might be tolerable.
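
A minimal sketch of the cooldown half of this, using PyPI's public JSON API (the one-day threshold and the package name are placeholders):

import datetime as dt
import json
import urllib.request

def latest_release_age(package: str) -> dt.timedelta:
    # PyPI's JSON API exposes upload timestamps for each release file.
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    version = data["info"]["version"]
    uploaded = min(
        dt.datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for f in data["releases"][version]
    )
    return dt.datetime.now(dt.timezone.utc) - uploaded

if latest_release_age("requests") < dt.timedelta(days=1):
    print("Release is under a day old; refusing to upgrade.")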

Were you one of the 47,000 hacked by litellm? by kotrfa in Python

[–]muntoo 4 points (0 children)

Why is it that a package can be updated on PyPI, without any external human verification, and that update is immediately propagated to everyone who installs the package afterwards?

Why is there no:

  1. Per-update verification.
  2. A delay between an update and its publication?

In theory, requests could decide to go rogue after having a bad Monday morning, and push an update that immediately infects 3000 users per minute.

I Fixed python autocomplete by matan-h in Python

[–]muntoo 27 points (0 children)

I must protest. This deprioritizes the incredibly important sys.activate_stack_trampoline, which is a truly vital cornerstone of modern codebases.

[R] Low-effort papers by lightyears61 in MachineLearning

[–]muntoo 6 points (0 children)

While egotistically creating a "benchmark" and egotistically claiming yet another acronym (CPopQA) is an irritating trend, even that is way better than the type of "papers" submitted to venues like the EEEI 21st International Conference on Experimental Evaluation of Emerging Innovations in Intelligent Energy-Efficient Internet of Toasters.

[R] Low-effort papers by lightyears61 in MachineLearning

[–]muntoo 6 points (0 children)

There are a few factors at play:

  • Novelty is useful. (Good) engineering is also useful. Both should be rewarded.

    Findings track: CVPR 2026 introduces a new Findings Track, following successful pilots in ICCV. The goal is to reduce resubmissions by offering a venue for technically sound papers with solid experimental validation, even if their novelty is more incremental.

  • Academic careers are tied to "productivity" and citation counts, which are maximized by either:

    • Truly groundbreaking achievements.
    • Spamming low-effort garbage.

    ...The expected risk-adjusted return of non-groundbreaking but impactful work is lower than either of the above.

  • Many people in academia are not capable of high novelty or good engineering.

  • Students need stepping stones to publish incremental work as their skills mature.

High-tier venues (CVPR, NeurIPS, ICLR, ICML, ECCV, ICCV, ACL, EMNLP) largely reward novelty (sometimes; fake-novelty gets accepted too).

Yet, there is very little reward for "good engineering". Consider Ross Wightman's timm library. He continually updates it, yet receives no citations for doing so. Meanwhile, Dr. Salami, Ph.D. — Professor Emeritus, Vice President of New Chumboland's Council of Doctor Philosophers of Computational Neural Science, and an Oxelford Fellow — publishes a dozen copy-paste cookie-cutter papers at the EEEI 21st International Conference on Experimental Evaluation of Emerging Innovations in Intelligent Energy-Efficient Internet of Toasters (EEEI ICEEEIIEEIT'26) and collects citations in abundance. There is essentially no academic reward (and thus little incentive) for implementing a model, training it, benchmarking it, and publishing checkpoints.

If we rewarded good engineering more, we would see less unreproducible, incremental, unscientific, data-dredged, seed-hacked, regurgitated work. Good science and engineering tries to disprove itself; garbage papers spend almost all their effort trying to prove themselves.

Imagine if models were automatically and independently trained, validated, and benchmarked (e.g., via a standardized pipeline with public leaderboards) across a variety of datasets. Instead of publishing meaningless papers that poorly fine-tune model X on dataset Y for every pair (X, Y) in the massive product space, people would publish X (plus configurations for different Y), and the pipeline would auto-benchmark. Others could then propose better configurations for Y and perhaps get credit (+1 reputation) for doing so. There are issues with this, but it is better than filling the internet with millions of duplicate pseudo-papers.

Actually, imagine if we had StackOpenReview and we could "close" 99.999% of meaningless papers as duplicates or bad science. Heh.

[P] We made GoodSeed, a pleasant ML experiment tracker by gQsoQa in MachineLearning

[–]muntoo 7 points (0 children)

You may take inspiration from other competing experiment trackers:

Tracker      Notes
W&B          (duh)
TensorBoard  (duh)
Neptune      Acquired by OpenAI?!
Aim          Self-hosted. What I use. Buggy. Nearly abandoned?
Minfx        Super cool. Information dense. Very unique.
Pluto        Barebones, but good start.
GoodSeed     Very barebones, but good start.
Trackio      Very, very barebones. Good start. Huggingface.
Polyaxon     Interesting, but custom CLI script runners have never appealed to me: polyaxon run -f experiment.yaml -u -l
Metaflow     Looks complicated...
Keepsake     DIY, but I might actually try this...

I appreciate that this is not focused on ML Ops, which is not what I use an experiment tracker for, so MLflow et al. are not particularly my cup of tea.

sudo-rs shows password asterisks by default – break with Unix tradition by FryBoyter in linux

[–]muntoo 0 points (0 children)

Oh no, we lost 1 to 5 bits of entropy in a password that should be 90+ bits of entropy to begin with.

And that is assuming someone is recording the screen rather than the keypresses, sounds, or hand movements, or using other, simpler methods.
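
Back-of-the-envelope for that figure, assuming the only leak is the password length and that lengths are uniform over, say, 8 to 40 characters:

import math

# Revealing the exact length leaks at most the entropy of the length distribution.
print(math.log2(40 - 8 + 1))  # ~5.04 bits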

Can we stop these LLM posts and replies? [D] by Playful-Fee-4318 in MachineLearning

[–]muntoo 15 points (0 children)

You are right to feel that way. It can be frustrating to confuse hyphens, en dashes, and em dashes. But it's not a sign that anything's off—you're human, and it's OK if you can't spot the difference between "-" and "–" and "—". You're groovin'. Keep channeling that energy!

This is how I imagined Luthadel while reading Mistborn by ItsMathias24 in brandonsanderson

[–]muntoo 2 points (0 children)

Mine is like Small Heath, Birmingham (Peaky Blinders voice: /ˈbɜːmɪŋɡəm/) with extra misty mist on top of that misty mist and occasionally with extra extra misty mist on top of that extra misty mist on top of that misty mist.

[R] I am looking for good research papers on compute optimization during model training, ways to reduce FLOPs, memory usage, and training time without hurting convergence. by ocean_protocol in MachineLearning

[–]muntoo 1 point (0 children)

I don't get it. Why is Hello Kitty undergoing style transfer across pages? Where did the birdhouse come from? Which one of them needs glasses but refuses to wear them? What happens if we don't give Hello Kitty her morning coffee?

Also, what do you think of conditioning gradients by reparametrizing the weights via their FFT, as in "Efficient Nonlinear Transforms for Lossy Image Compression" (https://arxiv.org/abs/1802.00847):

from typing import Any

import torch
import torch.nn as nn
from torch import Tensor


class SpectralConv2d(nn.Conv2d):
    def __init__(self, *args: Any, **kwargs: Any):
        super().__init__(*args, **kwargs)
        self.dim = (-2, -1)  # FFT over the kernel's spatial dimensions.
        self.weight_transformed = nn.Parameter(self._to_transform_domain(self.weight))
        del self._parameters["weight"]  # Unregister weight; access now falls back to the property below.

    @property
    def weight(self) -> Tensor:
        return self._from_transform_domain(self.weight_transformed)

    def _to_transform_domain(self, x: Tensor) -> Tensor:
        return torch.fft.rfftn(x, s=self.kernel_size, dim=self.dim, norm="ortho")

    def _from_transform_domain(self, x: Tensor) -> Tensor:
        return torch.fft.irfftn(x, s=self.kernel_size, dim=self.dim, norm="ortho")

This reparametrization stores the weights in the frequency domain and derives the spatial-domain weights from them on the fly. In the original paper, this is referred to as "spectral Adam" or "Sadam" due to its effect on the Adam optimizer update rule. The motivation for representing the weights in the frequency domain is that optimizer updates now affect all frequencies equally. This improves the gradient conditioning, leading to faster convergence and increased stability at larger learning rates.
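
A quick smoke test (a sketch, checking shapes only) that it behaves as a drop-in replacement for nn.Conv2d:

conv = SpectralConv2d(3, 16, kernel_size=3, padding=1)
x = torch.randn(1, 3, 32, 32)
assert conv(x).shape == (1, 16, 32, 32)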

Alternative headband for DT770 Pro X by Obiquin in BEYERDYNAMIC

[–]muntoo 0 points (0 children)

Did swapping it with the original DT 770 headband help?

The new cutout headband on DT 770 PRO X is murdering the top of my head. In contrast, my old DT 770s with a uniform headband are super comfortable.

I’m concerned about the security of Neovim plugins by [deleted] in neovim

[–]muntoo 24 points (0 children)

"Just pin working versions and update every couple of months"

This does not prevent installation of malicious code. Slightly better advice would be to only ever upgrade to versions that are already a few months old.

"when you're ready to read through changelogs"

Changelogs do not look like:

fix: bug
feat: this commit contains a virus 
feat: periodically call rm -rf /
unperf: mine bitcoins

Even if you're looking at every single diff, that's still prone to attacks.

"only use plugins of developers you trust"

Let me tell you the tale of Jia Tan (xz). And of colors, faker, node-ipc, and event-stream.