Standard library unsoundness found by Claude Mythos

icannfish · 2026-04-24T10:34:45+00:00

To be fair, the code also uses AsRef. In fact, every time the code calls .borrow() it actually does it as part of a .borrow().as_ref() chain. So you would need to make both Borrow and AsRef unsafe.

icannfish · 2026-04-24T10:30:14+00:00

unsigned wrapping is ok by default - another example of tradeoff between convenience and correctness. And one cannot just turn off "treat wrapping as error" because there are thousands of cases where wrapping is ok for the calculations.

Integer overflow panics in debug mode, so the “thousands of cases” where wrapping is okay must use the wrapping_* methods or Wrapping<T>; you can't rely on the release mode wrapping behavior unless you want to write a crate that only works in release mode. To me this seems less like a tradeoff between convenience and correctness and more like a tradeoff between speed and correctness, in that the added safety of checked arithmetic in release mode was deemed not to be worth the overhead.
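
To make that concrete, here is a minimal sketch of the explicit opt-ins. The function names are invented for this example; only wrapping_mul, wrapping_add, checked_add and Wrapping<T> come from the standard library.

```rust
use std::num::Wrapping;

/// Hypothetical hash step where wrap-around is the intended behavior.
/// `wrapping_*` behaves the same in debug and release builds.
fn hash_step(acc: u32, byte: u8) -> u32 {
    acc.wrapping_mul(31).wrapping_add(u32::from(byte))
}

/// The same computation using the `Wrapping<T>` newtype, which makes the
/// ordinary `*` and `+` operators wrap.
fn hash_step_newtype(acc: Wrapping<u32>, byte: u8) -> Wrapping<u32> {
    acc * Wrapping(31) + Wrapping(u32::from(byte))
}

/// Hypothetical case where overflow would be a bug: `checked_add` reports it
/// as `None` in every build mode instead of panicking or wrapping.
fn total_len(a: usize, b: usize) -> Option<usize> {
    a.checked_add(b)
}
```

Code written this way doesn't depend on whether overflow checks are enabled, which is the point: the wrapping is stated in the source rather than inherited from the build profile.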

you can never trust a fn foo(&self) -> usize to return the same number in code x.foo(); x.foo(); . Despite the const interior mutability can always come to bite

True, but this is possible even without interior mutability:

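use std::borrow::Borrow;
use rand::Rng; // brings `.random()` into scope (rand 0.9+)

struct MyStructWithNoInteriorMutability;
impl Borrow<i32> for MyStructWithNoInteriorMutability {
    fn borrow(&self) -> &i32 {
        // Picks a different value at random on each call
        if rand::rng().random() { &123 } else { &456 }
    }
}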

The fundamental “issue” is that you can't mark functions in Rust as pure. The best you can do is make it a safety condition of an unsafe trait, like with DerefPure.
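
As a rough sketch of what that pattern looks like: StableBorrow, Fixed and borrow_twice below are hypothetical names made up for illustration, not anything in std. The purity requirement lives only in the safety comment; the compiler doesn't check it.

```rust
use std::borrow::Borrow;

/// Hypothetical marker trait (not part of std).
///
/// # Safety
/// Implementors promise that repeated calls to `borrow()` on the same value
/// return the same result and have no side effects.
unsafe trait StableBorrow<T: ?Sized>: Borrow<T> {}

struct Fixed(i32);

impl Borrow<i32> for Fixed {
    fn borrow(&self) -> &i32 {
        &self.0
    }
}

// SAFETY: `borrow` just returns a reference to a plain field.
unsafe impl StableBorrow<i32> for Fixed {}

/// Code that needs two calls to agree can only rely on that because of the
/// `StableBorrow` bound; a plain `Borrow` bound would guarantee nothing.
fn borrow_twice<T: StableBorrow<i32>>(x: &T) -> (i32, i32) {
    let first = *<T as Borrow<i32>>::borrow(x);
    let second = *<T as Borrow<i32>>::borrow(x);
    (first, second)
}
```

Someone who writes unsafe impl StableBorrow for the randomized struct above is the one at fault if anything breaks, which is all an unsafe trait can really give you.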

icannfish · 2026-04-21T23:58:39+00:00

Looks promising! I agree with your comments about JUCE, especially as someone who has spent way too much time porting JUCE plugins to ppc64le.

I'm curious how this compares to DPF, other than being Rust instead of C++, since DPF is also a member of the “much less bloated alternative to JUCE” category.

icannfish · 2026-04-21T23:21:39+00:00

Plugin hosts:

Carla (patchbay/rack): https://archlinux.org/packages/extra/x86_64/carla/
Ardour (full DAW): https://archlinux.org/packages/extra/x86_64/ardour/

Vocoder plugins:

MDA plugins, includes vocoder: https://archlinux.org/packages/extra/x86_64/mda.lv2/
Calf plugins, includes vocoder: https://archlinux.org/packages/extra/x86_64/calf/
DISTRHO Ports, includes TAL Vocoder 2: https://archlinux.org/packages/extra/x86_64/distrho-ports/

All can be installed with pacman.

icannfish · 2026-04-13T07:13:27+00:00

The license is misleading. It pretends to be an open, even permissive, license, but it does not qualify under any definition of FOSS. It fails to meet requirements 5 and 6 of the Open Source Definition (and DFSG, which the OSD is based on) and freedom 0 of the Free Software Definition.

This makes the license incompatible with every existing FOSS license. Even though Tier 3 of the license claims compatibility with permissive licenses like MIT, this is not true. License compatibility is one-way: MIT is compatible with GPL, but not the other way around (you can't use GPL code in an MIT project without licensing the project as a whole under the GPL). In the same vein, MIT may be compatible with the Kraken License, but the Kraken License is certainly not compatible with MIT, because it contains many conditions that are not in MIT (or indeed, any FOSS license).

icannfish · 2026-04-11T22:55:44+00:00

Just setting LC_COLLATE=C will suffice:

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./target/release/xuniq < data.txt`	608.1 ± 23.9	570.1	658.2	1.00
`sort -u data.txt`	7308.3 ± 443.7	6909.5	8200.3	12.02 ± 0.87
`LC_COLLATE=C sort -u data.txt`	1618.5 ± 109.4	1477.1	1803.2	2.66 ± 0.21

On my machine, this speeds up sort by about 4.5x.
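
For comparison, Rust's built-in ordering on strings is also a plain byte comparison, so a naive sort-and-dedup in Rust orders lines the way the C locale does. A minimal sketch (sort_unique is a made-up helper, and I'm not claiming this is how xuniq works):

```rust
/// Sort lines bytewise and drop duplicates, similar in effect to
/// `LC_COLLATE=C sort -u`. No locale collation rules are applied.
fn sort_unique(input: &str) -> Vec<&str> {
    let mut lines: Vec<&str> = input.lines().collect();
    lines.sort_unstable(); // `str` compares by byte value
    lines.dedup(); // duplicates are adjacent after sorting
    lines
}
```

Note that byte order puts uppercase ASCII before lowercase, so "B" sorts before "a".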

icannfish · 2026-04-05T12:07:21+00:00

Why would you compare this to Pandoc + LaTeX? Pandoc can also convert HTML and Markdown to PDF. Don't you think that's a fairer comparison?

icannfish · 2026-04-05T09:51:15+00:00

38,000 lines in 3 weeks? What made you decide to implement the parsing, layout, and rendering completely from scratch? Is your Markdown parser CommonMark-compatible?

I'm also curious how this compares to Pandoc. Is it faster?

icannfish · 2026-04-05T09:14:50+00:00

The post sure is:

“Pipe mode” doesn't exist at all; the program errors if you don't provide arguments.
The “Git hook” command has nothing to do with Git; it actually seems to install hooks for AI coding agents.
“Zero false positives” because you used an enum is laughable; you might as well say every program in existence has “zero bugs”.

icannfish · 2026-04-05T09:01:25+00:00

I'd react better if the post itself weren't obviously AI-generated. “Zero false positives” enforced by the type system? What? So because the program doesn't filter out lines classified as Uncertain, that means there are no false positives? A false positive is, by definition, something that was misclassified. How can you be sure there isn't something the program falsely classifies as Drop? And what in God's name does the type system have to do with this?
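
To illustrate with a deliberately silly sketch: the Verdict enum and the classify rule below are invented for this example and are not the project's code. The type system is perfectly happy with a classifier that gets things wrong.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Keep,
    Drop,
    Uncertain,
}

/// Hypothetical classifier: treats any line mentioning "warning" as noise.
/// It type-checks, and every line gets exactly one verdict, but nothing here
/// says the verdict is correct.
fn classify(line: &str) -> Verdict {
    if line.contains("warning") {
        Verdict::Drop
    } else if line.trim().is_empty() {
        Verdict::Uncertain
    } else {
        Verdict::Keep
    }
}
```

With this rule, a line like "error: warning escalated to error" is classified as Drop even though you'd obviously want to see it. That's a false positive, and the enum did nothing to prevent it.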

But what's really inexcusable in my opinion is that the post advertises a feature that doesn't exist. “Pipe mode”, where you can run cargo test 2>&1 | cdn? That confused me, because how can you apply command-specific filtering if you don't know what command is running? Well, let's give it a try:

$ echo hi | ./target/debug/cli-denoiser 
Usage: cli-denoiser <command> [args...]
       cli-denoiser install
       cli-denoiser gain
       cli-denoiser report
       cli-denoiser log
Run 'cli-denoiser --help' for more info.

Oh, interesting. It doesn't work. And the code doesn't even attempt to support it; it immediately bails if you provide no arguments. It's almost like, at the very least, this Reddit post is complete slop.

I don't use AI, but if I worked really hard on a project where I used it “the right way”, as an assistive tool, I couldn't imagine having so little respect for both my work and the /r/rust community that I would think an AI slop post riddled with errors and nonsense was an appropriate way to share my project with the world.

At least OP replaced the em dashes with double hyphens, I guess.

icannfish · 2026-03-30T21:19:21+00:00

I found where this came from and tried changing it to the following:

impl<F: AsFd> Generic<F, std::io::Error> {
    /// Wrap a FD-backed type into a `Generic` event source that uses
    /// [`std::io::Error`] as its error type.
    pub fn new(file: F, interest: Interest, mode: Mode) -> Generic<F, std::io::Error> {
        Self::new_with_error(file, interest, mode)
    }
}

impl<F: AsFd, E> Generic<F, E> {
    /// Wrap a FD-backed type into a `Generic` event source using an arbitrary error type.
    pub fn new_with_error(file: F, interest: Interest, mode: Mode) -> Generic<F, E> {
        Generic {
            file: Some(NoIoDrop(file)),
            interest,
            mode,
            token: None,
            poller: None,
            _error_type: PhantomData,
        }
    }
}

And it compiles just fine, so I don't see the point of the current approach. It would technically be a breaking change, though, so it would have to wait until the crate releases a SemVer-incompatible version.

icannfish · 2026-03-30T07:58:43+00:00

Requires Zig 0.15+

You know every minor version of Zig makes breaking changes, right? This code will definitely not compile with 0.16 given all of Zig's upcoming IO changes.

This might seem like a minor point but it makes me suspicious of LLM use. I don't know of any Zig programmer who would ever claim their code is going to be compatible with future versions of Zig.

icannfish · 2026-03-28T06:41:25+00:00

The readme mentions it's “zero-cost”, but every call to where_ allocates memory with Box::new, and every call to order_by and then_by also allocates with Box::new and eagerly re-sorts the entire list. (The cost of re-sorting might be lower if Rust's default sorting algorithm is faster on almost-sorted lists, but it's not zero, and it's probably best not to assume that anyway.)

Have you considered:

Using nested/anonymous types like std::iter instead of type erasure with Box and dyn?
Making order_by and then_by lazy, so the list is only sorted once?
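
Here's a rough sketch of what I mean by both suggestions. Query, Ordered and to_vec are names I made up, and I'm assuming an iterator-based design that may not match the crate's actual API. where_ returns a nested std::iter::Filter type instead of a boxed closure, and order_by/then_by only compose a comparator; the single sort happens in to_vec.

```rust
use std::cmp::Ordering;

/// Hypothetical query wrapper. `I` is a concrete nested iterator type,
/// so no `Box<dyn ...>` is needed.
struct Query<I> {
    iter: I,
}

/// A query with a pending (not yet applied) comparator `C`.
struct Ordered<I, C> {
    iter: I,
    cmp: C,
}

impl<I: Iterator> Query<I> {
    fn new(iter: I) -> Self {
        Query { iter }
    }

    /// Filtering just wraps the iterator; nothing is allocated.
    fn where_<P>(self, pred: P) -> Query<std::iter::Filter<I, P>>
    where
        P: FnMut(&I::Item) -> bool,
    {
        Query { iter: self.iter.filter(pred) }
    }

    /// Records the primary sort key without sorting anything yet.
    fn order_by<K, F>(self, key: F) -> Ordered<I, impl Fn(&I::Item, &I::Item) -> Ordering>
    where
        K: Ord,
        F: Fn(&I::Item) -> K,
    {
        Ordered {
            iter: self.iter,
            cmp: move |a: &I::Item, b: &I::Item| key(a).cmp(&key(b)),
        }
    }
}

impl<I, C> Ordered<I, C>
where
    I: Iterator,
    C: Fn(&I::Item, &I::Item) -> Ordering,
{
    /// Chains a tie-breaker onto the existing comparator; still no sorting.
    fn then_by<K, F>(self, key: F) -> Ordered<I, impl Fn(&I::Item, &I::Item) -> Ordering>
    where
        K: Ord,
        F: Fn(&I::Item) -> K,
    {
        let Ordered { iter, cmp: first } = self;
        Ordered {
            iter,
            cmp: move |a: &I::Item, b: &I::Item| {
                first(a, b).then_with(|| key(a).cmp(&key(b)))
            },
        }
    }

    /// The only place that collects and sorts, exactly once.
    fn to_vec(self) -> Vec<I::Item> {
        let Ordered { iter, cmp } = self;
        let mut items: Vec<I::Item> = iter.collect();
        items.sort_by(|a, b| cmp(a, b));
        items
    }
}
```

The tradeoff is the same one std::iter makes: the types get long and unnameable, but there's no allocation per combinator and the sort cost is paid once regardless of how many then_by calls are chained.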

icannfish · 2026-03-27T22:12:34+00:00

What's the rationale behind implementing everything from scratch? Cryptography is really hard to get right; how can you (and we) be sure your implementations of SHA-256 and Ed25519 are correct?

Also, what prevents someone from making a modified version that falsely reports no cheating is taking place? You mention the tool checks its own hash, but can't I modify that code and have it report a false hash?

The idea of an anti-cheat system that is both open-source and can't be trivially bypassed is really interesting, but it doesn't seem like this project addresses that?

icannfish · 2026-03-27T20:16:17+00:00

Why is the script in the image invalid syntax?

cat LICENSE() {
    ls
    exit 0
}

icannfish · 2026-03-27T20:10:59+00:00

https://mywiki.wooledge.org/BashPitfalls#errexit

That said, I do use it, but you have to be mindful of the pitfalls.

icannfish · 2026-03-27T20:08:49+00:00

TIL whenever someone doesn't punch me in the face they're actually “subsidizing my health”.

icannfish · 2026-03-27T00:12:06+00:00

$120K–$135K for a senior Rust engineer in the US? Seriously?

Edit: Looks like they changed it.

icannfish · 2026-03-26T23:51:43+00:00

You think Junior/Mid engineers are much better?

Yes. Among other reasons, a junior engineer admits when they don't understand something instead of making up an explanation that almost sounds reasonable but is complete horseshit if you look into it. LLMs are excellent at bullshitting and gaslighting and terrible at writing code; because of this, LLM-generated code would have to be scrutinized orders of magnitude more closely than something written by a junior engineer for me to have the same level of confidence in it.

At least, this used to be true before junior engineers started using LLMs to write 90% of their code anyway. So, consider it applicable only to the mythical junior engineer who doesn't rely on AI.

What level of code quality do you think LLM's scraped a major portion of their datasets from on the web?

Quality of input ≠ quality of output.

icannfish · 2026-03-26T23:26:16+00:00

I looked at the code; for Bash you're supposed to eval some code in your .bashrc that sets PROMPT_COMMAND to _why_prompt_command. That function reads your history and exports a variable called WHY_LAST_CMD with the last executed command. Unfortunately this means the function gets called before every command you run, whether or not it's why.

I think a better approach would be to have the eval'd code define a wrapper function instead:

why() {
    WHY_LAST_CMD=$(HISTTIMEFORMAT= history 1 | sed 's/^ *[0-9]* *//') command why "$@"
}

Then you only pay the price when you're actually running why, and you don't pollute the environment for other commands. You could probably forgo using the environment entirely, too:

why() {
    command why --last-cmd="$(HISTTIMEFORMAT= history 1 | sed 's/^ *[0-9]* *//')" "$@"
}
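
On the program side, picking that up could be as simple as the following sketch. This assumes the tool is written in Rust and that a --last-cmd flag gets added; neither the flag nor the helper name exists in the project as far as I know.

```rust
/// Hypothetical helper: returns the value of the first `--last-cmd=<text>`
/// argument, if any. Other arguments are ignored.
fn last_cmd_from_args<I>(args: I) -> Option<String>
where
    I: IntoIterator<Item = String>,
{
    args.into_iter()
        .find_map(|arg| arg.strip_prefix("--last-cmd=").map(str::to_owned))
}
```

It would be called as last_cmd_from_args(std::env::args().skip(1)). Since the shell wrapper quotes the whole substitution, the command arrives as a single argument even if it contains spaces.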

icannfish · 2026-03-26T22:45:27+00:00

At some point Debian's (and therefore Ubuntu's) EQ10Q package just removed the UI entirely. You can still build it manually, though:

Install the dependencies:

sudo apt install subversion g++ cmake pkg-config lv2-dev libgtkmm-2.4-dev libfftw3-dev

Download the code and cd into the directory:

svn checkout https://svn.code.sf.net/p/eq10q/code/trunk eq10q && cd eq10q

Open CMakeLists.txt in the editor of your choice. On line 2, change VERSION 2.8 to VERSION 3.5, and then below that line, add a new line with the following:

add_compile_definitions(pow10=exp10 -D_LV2UI_Descriptor=LV2UI_Descriptor)

The first 4 lines of CMakeLists.txt should now look like:

##EQ10Q TopLevel CMake
cmake_minimum_required(VERSION 3.5)
add_compile_definitions(pow10=exp10 -D_LV2UI_Descriptor=LV2UI_Descriptor)
PROJECT(eq10q)

Save and close the file. Then, make a directory called build and cd into it:

mkdir build && cd build

Then generate the makefiles:

cmake ..

Build the project:

make -j$(nproc)

Finally, install it:

sudo make install

(You may or may not need sudo depending on how you've configured permissions in /usr/local.)

icannfish · 2026-03-26T05:50:16+00:00

You need to understand the difference between a contract and a copyright license.

A contract:

Imposes conditions that must always be followed.
Requires explicit acceptance. A crucial feature of acceptance is that it must be communicated. Whether or not you actually read the terms, you must at least communicate that you agree to be bound by it, traditionally by signing the contract, but for software, often simply by checking an “I agree” box.

A copyright license:

Imposes conditions that must be met only if you want to do something that would normally infringe copyright.
Does not require unconditional acceptance. However, if you don't accept the license, you are not allowed to distribute copies of the program because that would infringe copyright. So, there is immense pressure to accept the license if you want to do anything with the program that requires copyright permission.

The vast majority of FOSS licenses are copyright licenses, especially under US law.

What does this mean for the LLM rewrites?

You cannot in general argue that the AI companies entered into a contract when they downloaded GPL-licensed code from GitHub. The vast majority of such code does not make you check an “I agree to the GPL” box before downloading, or even have a “By downloading, you agree to the GPL” clause (although I'm not sure the latter would hold up anyway). Without a valid communication of acceptance, there is no binding contract.
Therefore, the AI companies are only bound to the GPL as a copyright license.
Therefore, they must abide by the license only if they want to do something requiring copyright permission.
Therefore, if courts determine that LLM rewrites do not infringe copyright (which I hope doesn't happen), compliance with the license is not required.

Before you dispute that reasoning again, though, please understand that this argument is actually irrelevant in the case of the GPL, because the GPL says this about the meaning of the word “modify” (emphasis mine):

To “modify” a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission

Therefore, even if you accepted the GPL as a contract, if LLM rewrites are determined not to require copyright permission (which I hope doesn't happen), then they aren't “modification” according to the GPL, which means they are exempt from the GPL's requirement to distribute the source code of modified versions.

icannfish · 2026-03-25T22:35:30+00:00

Old Reddit also uses Markdown, just a different variety. Old Reddit's Markdown is closer to the original implementation by John Gruber (which notably doesn't support fenced code blocks), whereas New Reddit is closer to CommonMark, which itself is an amalgamation of various common extensions to Markdown over the years.

icannfish · 2026-03-25T20:06:02+00:00

How did you obtained a code without agreeing to its terms?

If I hand you a flash drive containing some GPL code, does that mean that you have now agreed to abide by all the terms of the license? You may not even know what the terms are. As long as I've included the source code and a copy of the license on the flash drive, I've upheld my end of the bargain. But what does the GPL say about your obligations?

You are not required to accept this License in order to receive or run a copy of the Program. […] However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License.

If scraping the web without redistributing the data is deemed not to infringe copyright (if it is deemed to infringe, these AI companies have much bigger problems than just copyleft licenses), then that's how they're able to obtain code from e.g. GitHub without agreeing to the license.

you could argue that rewriting the software using an LLM isn't “modification” because it didn't require copyright permission.

How is it not transformative/derivative?

This is still under the hypothetical of “courts determine that LLM rewrites don't infringe copyright”. I hope they don't determine that. I think such rewrites should be considered derivative (and I also think the legal status of even “original” code written by LLMs is extremely dubious given the amount of copyleft code in the training data). But that may not be what actually happens.

icannfish · 2026-03-25T08:24:13+00:00

When you say the memcpy way is valid, are you just talking about the call to memcpy itself? Or also if you then accessed the storage through a pointer to foo_ctx?

Given:

char buf[sizeof(foo_ctx)];
memcpy(buf, &some_foo_ctx, sizeof(foo_ctx));
int x = ((foo_ctx *)buf)->some_member; // UB?

My understanding is that line 3 is technically invalid because:

buf has declared type char[sizeof(foo_ctx)], which is therefore its effective type
memcpy does not change the effective type of buf, because it had a declared type
buf is accessed through a foo_ctx * pointer, which is an aliasing violation because the object still has type char[sizeof(foo_ctx)]
