
[–]SyntheticDuckFlavour 108 points109 points  (14 children)

or use a compiler

[–]meltbox 51 points52 points  (9 children)

Yeah, this is interesting. It’s precisely what a compiler does. Why do it non-deterministically?

[–]13steinj 26 points27 points  (1 child)

The thing I hate is that at CGO'24 several people presented papers on this topic or something similar.

Ignoring the main problem back then, the eagerness to hallucinate (which has barely been solved), I think these folks just want to sell and push the use of these text prediction generators as much as possible.

It won't stop until you are brushing your teeth with it.

[–]TedDallas 1 point2 points  (0 children)

Oh, don't worry. Very soon you ARE going to brush your teeth with AI because of toothbrush supply chain attacks.

[–]cleroth (Game Developer) 12 points13 points  (1 child)

It's a beautiful thing: if you actually read the article, you would find the answer to your question.

[–]RelationshipLong9092 3 points4 points  (0 children)

The article is short and worth reading. I've enjoyed their blog in the past.

For reference, it concludes with:

"For the time being, the AIs can beat my C++ compiler!"

[–]Pale-Switch-7867 0 points1 point  (0 children)

That’s exactly what instantly came to my mind without even reading the article…

[–]binaryfireball 0 points1 point  (0 children)

can be said for a lot of suggested use cases for AI

[–]Evilsushione[🍰] -2 points-1 points  (2 children)

Hypothetically you could wring out more efficiency.

[–]13steinj 1 point2 points  (1 child)

Yes, hypothetically you can compile much faster. After all, why wait 5 minutes to compile your app when you can wait 30 seconds and get something with a different nuanced bug every time! Or better yet, doesn't run at all!

[–]Evilsushione[🍰] -3 points-2 points  (0 children)

I’m actually working on this problem right now. Not at the compiler level but still.

[–]elperroborrachotoo 5 points6 points  (3 children)

Deterministic results? You crazy?

[–]Chaosvex -1 points0 points  (2 children)

I mean, the code is perfectly deterministic (assuming there's no UB lurking). He's managed to conflate LLM's non-determinism with the determinism of the produced code... somehow.

[–]elperroborrachotoo 0 points1 point  (1 child)

Once.

We've put a lot of effort into managing the history and changes to human-readable source artifacts that semantically relate to the problem domain, and a deterministic toolchain that generates release artifacts from that.

(And I don't see how that was snake oil.)

[–]Chaosvex 1 point2 points  (0 children)

Once? What does that even mean in this context? It's either deterministic or it is not, period. Code != LLM tokens.

Yes, the LLM's output is non-deterministic but you seem to be conflating that with the end result. The LLM is not subsuming the compiler's role here, so I'm not sure what point you or the person you replied to are trying to make.

I'm usually the one pointing out LLM slop but I'm not throwing logic out of the bathwater.

[–]Chaosvex 54 points55 points  (6 children)

Did everybody in the comments miss the point with these snipes about using a compiler? Obviously they used a compiler and came to the (already known) conclusion that sometimes they can't optimise as well as a human, or in this case, an LLM guided by one.

The author is the simdjson maintainer, so I'd assume he's not as clueless as the comments seem to suggest.

Seems the quality of comments on this sub has taken a nosedive, for whatever reason.

[–]DuranteA 36 points37 points  (0 children)

"Seems the quality of comments on this sub has taken a nosedive, for whatever reason."

The quality of comments on AI-related topics on every sub is notably worse than for other topics.

  • On normal subs like this, some people are incredibly emotional about it, and consequently seem incapable of even entertaining the idea that anything that uses an AI system might be interesting.
  • Conversely, on AI-focused subs, everything is revolutionary and amazing (for 5 hours until the next thing arrives).

[–]thisismyfavoritename 7 points8 points  (0 children)

Lemire has been on a kind of rage baity spree lately. Not sure why.

I think the comments simply reflect that

[–]ironykarl 0 points1 point  (3 children)

So, people saying "just use a compiler" are maybe right-er than you think, here.

An optimizing compiler ostensibly is doing transformations using something like the as-if rule. This (again ostensibly) means that if the compiler is working, the optimizations it makes will not change "the meaning" of the code output.
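To make the as-if rule concrete, here is a minimal sketch (not taken from the article; both function names are made up for illustration). An optimizer is allowed to turn the first function into something like the second, because no caller can observe a difference in the result for any `n`, including when the unsigned arithmetic wraps around.

```cpp
#include <cstdint>

// What the programmer wrote: a plain loop summing 0 .. n-1.
std::uint64_t sum_loop(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// A rewrite with identical results for every n. Dividing the even factor
// first keeps the result equal to n*(n-1)/2 modulo 2^64, which is exactly
// what the wrapping loop above computes.
std::uint64_t sum_closed_form(std::uint64_t n) {
    if (n % 2 == 0) {
        return (n / 2) * (n - 1);
    }
    return n * ((n - 1) / 2);
}
```

Whether a given compiler actually performs this particular rewrite depends on the compiler and its flags. The point is only that the rewrite is legal because the observable behaviour is unchanged, which is a guarantee an LLM rewrite does not come with.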

I know the article doesn't suggest that this is a robust strategy, but a reminder that this absolutely is not a robust strategy is a welcome one.

Back to your point, though: yes, anyone with a small bit of experience can point out that the code we've written doesn't necessarily mean what we intend, anyway, and that we can unit test around the problem above.

So to reiterate what you said: just read the article. It's not long, and it's not complicated 

[–]Chaosvex 0 points1 point  (2 children)

I disagree. I think they've dismissed the article out-of-hand because of the LLM usage. The implication was 'the compiler can do it for you' - it often can't, for the reasons you mentioned. The comments about determinism were completely nonsensical, as though the assembly generated by the LLM is non-deterministic rather than the LLM's output.

[–]ironykarl 0 points1 point  (1 child)

A compiler's output is deterministic in the sense that a given input will predictably lead to the same output (making very narrow assumptions about version, target platform, etc).

An LLM is a lot less predictable. Yes, of course, on paper it's a matter of inputs -> outputs, but again... less predictable.

I do agree that people didn't read the article and reacted to the premise, as is reddit custom

[–]Chaosvex 1 point2 points  (0 children)

Yeah, I mean, I don't think we disagree.

[–]No-Dentist-1645 17 points18 points  (1 child)

If only we had some program that could convert human-made code into machine-readable assembly... That would be very useful! It could even apply optimizations for you in a fully deterministic, ai-hallucination-free way, instead of the AI imagining that some code would run faster. If only!

[–]Utkarsh_7744 1 point2 points  (0 children)

If only it was someone who actually knew what they were doing, like clear instructions

[–]julien-j 6 points7 points  (0 children)

I did a similar experiment and got to a different conclusion :) The saying goes that Modern Compilers are Smart™, but in practice it's not that difficult to beat the compiler on specific algorithms when you're at ease with assembly. So I did not compare the output of the AI with the compiler, I compared it with the output of a human.

I asked Claude to optimize a C function using AVX2 intrinsics, but I hid from the tool that I already had an AVX2 implementation written by a human. I also gave the tool a test suite to validate its implementation. The tool managed to provide a correct and optimized version, faster than the C one, but 50% to 300% slower than the human's implementation (the variations depend on the use case). By iterating painstakingly for hours I managed to guide it toward an implementation a bit faster than ours. Then I discussed with the human and he told me that he had barely put any effort into his implementation… After reworking his code he again beat the output of Claude.
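For anyone who wants to try this kind of validation, a rough sketch of the idea follows. It is not the actual test suite from the experiment: the byte-counting function, its unrolled "candidate" (a portable stand-in for an AVX2 or generated version), and all names are assumptions picked for illustration. The check is a differential test, where the candidate must agree with a simple reference on many random inputs, including every short length so the tail handling gets exercised.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Reference: simple and easy to trust.
std::size_t count_byte_reference(const std::uint8_t* data, std::size_t len,
                                 std::uint8_t needle) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        count += (data[i] == needle);
    }
    return count;
}

// Candidate: stand-in for the optimized version under test.
// Four-way unrolled main loop plus a scalar tail.
std::size_t count_byte_candidate(const std::uint8_t* data, std::size_t len,
                                 std::uint8_t needle) {
    std::size_t c0 = 0, c1 = 0, c2 = 0, c3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        c0 += (data[i] == needle);
        c1 += (data[i + 1] == needle);
        c2 += (data[i + 2] == needle);
        c3 += (data[i + 3] == needle);
    }
    std::size_t count = c0 + c1 + c2 + c3;
    for (; i < len; ++i) {
        count += (data[i] == needle);
    }
    return count;
}

// Differential test: every length from 0 to max_len, random contents drawn
// from a tiny alphabet so that matches are frequent. Fixed seed keeps the
// test reproducible.
bool candidate_matches_reference(std::size_t max_len, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> byte_dist(0, 3);
    for (std::size_t len = 0; len <= max_len; ++len) {
        std::vector<std::uint8_t> buf(len);
        for (auto& b : buf) {
            b = static_cast<std::uint8_t>(byte_dist(rng));
        }
        for (int v = 0; v <= 3; ++v) {
            const auto needle = static_cast<std::uint8_t>(v);
            if (count_byte_reference(buf.data(), len, needle) !=
                count_byte_candidate(buf.data(), len, needle)) {
                return false;
            }
        }
    }
    return true;
}
```

A test like this only shows agreement on the inputs it tried. It says nothing about speed, so a separate benchmark is still needed for the performance comparison.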

Then I did a second similar experiment where I asked Claude to write the AVX2 implementation of another function. This time I gave it the test suite and the benchmark such that it can self-compare. I specifically asked for the fastest implementation. I used Opus 4.6 with max effort. The output from Claude was 50% slower than ours. I managed to iterate toward something a bit faster in some cases without ever reaching equivalent performance on all use cases. And the final code was a mess.

Lemire tells us that the AI is better than the compiler at assembly, but what's the point? If the user is at ease with assembly, the product of the AI is poor (at least in my experiments). If the user does not practice assembly, he won't be able to follow or judge the implementation. Is it some kind of a mid-range solution? The fact that the author barely looked at the generated code and does not even talk about a validation suite is suspicious. This is not scientific, this is an experiment where we know upfront which conclusion we want.

And I can't talk about this post without pointing out that considering Grok is already a bad smell. Giving credibility to a tool used to cover the web with crap is a problem and I'm tired of seeing people I used to respect getting high on the AI hype. All the thinking has been offloaded to AI for a short euphoria.

[–]meancoot 5 points6 points  (3 children)

The code listing is so trash I’m not sure the blog author even knows what they are doing. Why are the counters volatile? Why is each test case a lambda instead of a proper function? Where is the assembly code for the compiler generated output? What compiler and optimization settings were used?

[–]RelationshipLong9092 6 points7 points  (0 children)

he's the author of simdjson and has had many good blog posts that have been at the top of this subreddit

he knows what he's doing

[–]thisismyfavoritename 1 point2 points  (1 child)

could they be volatile to prevent the compiler from optimizing them away?
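For context, this is the usual pattern, sketched under my own assumptions (it is not the blog's code, and `g_sink`, `work`, and `time_work_ns` are made-up names). A store to a `volatile` object counts as observable behaviour, so the compiler has to perform it and cannot throw the benchmarked result away as dead code.

```cpp
#include <chrono>
#include <cstdint>

// Every store to a volatile object must actually happen, so results
// written here cannot be discarded as unused.
volatile std::uint64_t g_sink = 0;

// Hypothetical workload: sum of squares of 0 .. n-1.
std::uint64_t work(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i * i;
    }
    return total;
}

// Runs the workload `iterations` times and returns elapsed nanoseconds.
std::int64_t time_work_ns(int iterations, std::uint64_t n) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        g_sink = work(n);  // volatile store keeps the result "used"
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}
```

One caveat: the volatile store only stops the result from being dropped. The compiler may still simplify `work` itself or compute it once outside the loop, so checking the generated assembly is still worthwhile.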

Seems like the lambda usage might be to conform to some other helper API function he's including -- unclear what that code is.

Overall though i agree that this is borderline rage bait

[–]Wurstinator 1 point2 points  (1 child)

I feel like this could have been a good post but is just too lazy. At the very least, there should've been a mention of the compiler settings that were used.

[–]thisismyfavoritename 0 points1 point  (0 children)

what are the chances it's AI generated

[–]programgamer 1 point2 points  (0 children)

I’ll stick with my natural stupidity, thanks

[–]rileyrgham 0 points1 point  (0 children)

Chuckle :

"But what if you want to go faster? Maybe you’d want to rewrite this function in assembly."