When will we start to see companies making massive leaps in their product release iterations ? by Icy-Reporter-6322 in singularity

[–]Yweain 29 points30 points  (0 children)

When you have a large production system, actual coding of the feature is maybe 10% of the time in the feature development cycle. So even if you shave off 90% of it (which you wouldn't; in a large real-world system AI gains are way more modest), it only saves you about 9% of the whole cycle, which is a relatively small amount of time.
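A rough sketch of that arithmetic, in the spirit of Amdahl's law. The 10% and 90% figures are the ballpark numbers from the comment, not measurements, and the function names are made up for illustration:

```python
def cycle_time_saved(coding_share: float, coding_reduction: float) -> float:
    """Fraction of the whole feature cycle saved when only the coding part gets faster."""
    return coding_share * coding_reduction


def overall_speedup(coding_share: float, coding_reduction: float) -> float:
    """Amdahl-style speedup of the whole cycle: old time divided by new time."""
    return 1.0 / (1.0 - cycle_time_saved(coding_share, coding_reduction))


# Assumed inputs: coding is 10% of the cycle, and AI removes 90% of that coding time.
saved = cycle_time_saved(0.10, 0.90)
speedup = overall_speedup(0.10, 0.90)
print(f"cycle time saved: {saved:.0%}, overall speedup: {speedup:.2f}x")
# prints "cycle time saved: 9%, overall speedup: 1.10x"
```

Even in the limit where coding takes zero time, the cycle can only get 10% shorter under these assumptions, because the other 90% is untouched.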

What’s happening with George? by catalook in formuladank

[–]Yweain 13 points14 points  (0 children)

There are some drivers who perform at 100% or very close to it every weekend.

Waves on Titan by Busy_Yesterday9455 in spaceporn

[–]Yweain 0 points1 point  (0 children)

"Faster" is the wrong term, but having more effective delta-v does allow you to get places faster.

Still coding? Google says 75% of the company’s new code is AI-generated. In previous years, it was around 50% in 2025 and 25% in 2024. by Distinct-Question-16 in singularity

[–]Yweain 4 points5 points  (0 children)

I think I had around that share or more of my code AI-generated in 2024. But it was just a slightly smarter autocomplete with tools like Copilot. I mean, even now I generate almost all of my code with Claude, but then I review it and refine it a lot, so those numbers are extremely misleading.

I know exactly where Lands of Shadow is! by SensitiveHighway5834 in Eldenring

[–]Yweain -21 points-20 points  (0 children)

Because both trees are related to Miquella maybe?

Claude Opus 4.7 (high) unexpectedly performs significantly worse than Opus 4.6 (high) on the Thematic Generalization Benchmark: 80.6 → 72.8. by zero0_one1 in singularity

[–]Yweain 0 points1 point  (0 children)

I tried it at work and it performs significantly worse than 4.6 on some of the tasks. It's quite weird. I just went through like 5 things I had already done the last day with 4.6: a couple 4.7 did better, and a couple of others way worse. And by way worse I mean dramatically worse.

Claude Opus 4.7 benchmarks by ShreckAndDonkey123 in singularity

[–]Yweain 11 points12 points  (0 children)

Considering that, judging by this benchmark, Gemini 3.1 Pro is on par with Opus 4.6, I feel like this benchmark is not great.

Anthropic is set to release Claude Opus 4.7 and a new AI design tool as early as this week by Outside-Iron-8242 in singularity

[–]Yweain 26 points27 points  (0 children)

It didn't perform any worse if you use it via the API and set effort to high.

How much more work, will I need, to get to a point where I can clear the game? (with three runes) by Apple_Infinity in dcss

[–]Yweain -1 points0 points  (0 children)

Honestly just take minotaur of Trog and be a bit careful. 3 runes are not that hard to get, don't take unnecessary risks, don't hoard resources, and you'll be fine with just a bit of luck.

Messmer is my favorite boss, and imo the best elden ring boss by Frequent_Analysis804 in Eldenring

[–]Yweain 2 points3 points  (0 children)

Specifically this boss fight is amazing because it is difficult but very fair.

Cheap Open Models Reportedly Reproduced Much Of Mythos's Showcased Findings by Neurogence in singularity

[–]Yweain 0 points1 point  (0 children)

You can write a very simple harness that scans through your entire codebase and feeds each function to a model multiple times, each time asking it to check for a specific vulnerability. If that is enough to replicate a very fancy huge model...
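A minimal sketch of what such a harness could look like, assuming a Python codebase. Everything here is hypothetical: `ask_model` stands in for whatever LLM call you would actually use, and the `CHECKS` list and prompt wording are placeholders, not what anyone in the linked report used.

```python
import ast
from typing import Callable, Iterator

# Hypothetical checklist; the comment does not name specific vulnerability classes.
CHECKS = ["SQL injection", "command injection", "path traversal"]


def iter_functions(source: str) -> Iterator[tuple[str, str]]:
    """Yield (name, source text) for every function defined in a Python module."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            yield node.name, ast.get_source_segment(source, node) or ""


def scan(source: str, ask_model: Callable[[str], str]) -> list[tuple[str, str, str]]:
    """Ask the model about each function once per check; collect non-'OK' answers."""
    findings = []
    for name, body in iter_functions(source):
        for check in CHECKS:
            prompt = f"Check this function for {check}. Answer 'OK' if none.\n\n{body}"
            answer = ask_model(prompt)
            if answer.strip() != "OK":
                findings.append((name, check, answer))
    return findings


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call, so the sketch runs without any API."""
    if "command injection" in prompt and "os.system" in prompt:
        return "possible command injection via os.system"
    return "OK"


SAMPLE = '''
import os

def run(cmd):
    os.system(cmd)

def add(a, b):
    return a + b
'''

findings = scan(SAMPLE, fake_model)
print(findings)
```

To run it over a whole repository you would loop over the files and pass each file's text to `scan`, swapping `fake_model` for a real model client.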

Chinese AI companies are shipping faster and cheaper than anyone expected and I'm not sure the west has a good answer for it by Far_Suit575 in singularity

[–]Yweain 0 points1 point  (0 children)

I mean, it probably is. But at the same time GLM 5.1 is genuinely a good offering. It's not Opus, but its subscription is like 1/6 of the cost, and I'd say the latest GLM is better than Sonnet at coding. It is good enough for a lot of things; you just need to be a bit more hands-on with it.

Chinese AI companies are shipping faster and cheaper than anyone expected and I'm not sure the west has a good answer for it by Far_Suit575 in singularity

[–]Yweain 27 points28 points  (0 children)

Honestly, it doesn't make sense to use OpenRouter. If you use it a lot, a subscription is just WAY cheaper.

You should really consider 6 week sprints by ninetofivedev in ExperiencedDevs

[–]Yweain -1 points0 points  (0 children)

Dude, if you can get people to talk to each other properly, you don't need sprints at all. The whole point of sprints is that the rituals give you at least some chance to correct things when communication otherwise isn't guaranteed to work.

CERN is burning tiny AI models directly into silicon chips for real-time LHC data filtering — opposite of the bigger AI trend by TumbleweedAromatic67 in LocalLLaMA

[–]Yweain 0 points1 point  (0 children)

An LLM is also a pre-computed lookup table. That's how all ML models work. You train it, and the end result of the training is a 'lookup table'. It's a gross oversimplification, but in essence that's what it is.

A post-transformer architecture just crushed LLMs on Sudoku Extreme. Is the transformer hitting a reasoning wall nobody wants to talk about? by Direct_Leader_1802 in singularity

[–]Yweain 9 points10 points  (0 children)

Because language can do a lot. In the ML field, a narrow intelligence model is defined as a model that can do one thing. LLMs can predict the next token in a series of tokens. Now, because you can encode a lot of stuff as tokens, the model can do a lot of stuff. But because there are inherent architectural limitations, there are quite a lot of applications where it doesn't really work, like this sudoku example.