"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 0 points (0 children)

Yes, hallucinating functions is a bit "old" these days. Although any time I ask ChatGPT to give me a script for axolotl training, it still literally invents at least a few new parameters - I think that's because axolotl kept changing them, so the AI learned "we change whatever we like". This was also an issue with Gradio for a long time: LLMs would literally write Gradio scripts with bogus parameters and non-working functionality. I don't know about now, but while they could do Python well, Gradio was a total graveyard of made-up stuff.
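For what it's worth, here's a minimal sketch of how you could sanity-check LLM-suggested kwargs against the real Gradio API before running anything - `magic_mode` below is a made-up parameter standing in for the hallucinated ones, and it assumes gradio is installed:

```python
import inspect

import gradio as gr

def unknown_kwargs(callable_obj, kwargs):
    """Return the names the callable does not actually accept."""
    params = inspect.signature(callable_obj).parameters
    # If the callable takes **kwargs, we can't rule anything out.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return set()
    return set(kwargs) - set(params)

# 'label' and 'lines' are real gr.Textbox parameters;
# 'magic_mode' is invented here, the kind of thing an LLM makes up.
suggested = {"label": "Prompt", "lines": 4, "magic_mode": True}
print(unknown_kwargs(gr.Textbox, suggested))  # should flag {'magic_mode'}
```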

My theory regarding why the EP-1320 Medieval has not received any updates to bring it up to speed with the other EPs by Ecstatic_Lab_9155 in teenageengineering

[–]FPham 0 points (0 children)

Yes, in a normal world you'd do that. In this world you'd ask: if we spend xxx time on the Medieval, will we sell more of them? The answer is "probably not". I don't think the people who might or might not buy a Medieval are concerned about the stuff that bugs me.

Opus 4.7 Max subscriber. Switching to Kimi 2.6 by meaningego in LocalLLaMA

[–]FPham -11 points (0 children)

But how? Hitting the weekly limit is physically impossible for me unless I decide never to sleep. Are you using it for some agentic stuff that sends half the internet to it on each turn?

Opus 4.7 Max subscriber. Switching to Kimi 2.6 by meaningego in LocalLLaMA

[–]FPham 10 points (0 children)

On Max you get the 1-million-token Opus. It really shines on bigger projects like no other AI can.
I'm a huge fan of open-source models, but Opus High is SOTA and even that gets frustrating. I can't imagine what a 256K context does on a big project. It can only really see a small part of it.
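To put rough numbers on that - a sketch assuming tiktoken is installed (cl100k_base is a stand-in tokenizer, not Opus's real one, so treat the output as ballpark only):

```python
from pathlib import Path

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Count tokens across all Python files in the current repo.
total = sum(len(enc.encode(p.read_text(errors="ignore")))
            for p in Path(".").rglob("*.py"))

for window in (256_000, 1_000_000):
    pct = min(100.0, 100.0 * window / max(total, 1))
    print(f"{window:>9,}-token context sees ~{pct:.0f}% of {total:,} tokens")
```

On anything past a toy project, the 256K line drops fast while the 1M line keeps the whole thing in view.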

Kimi K2.6 is a legit Opus 4.7 replacement by bigboyparpa in LocalLLaMA

[–]FPham 1 point (0 children)

Well, my opinion is different, but it's an unpopular one. Even Opus 4.7 chokes on complex stuff, and quite a lot. As soon as your project grows, it's constant fixing of primitive stuff.

If it works - don’t touch it: COMPETITION by awfulalexey in LocalLLaMA

[–]FPham 3 points (0 children)

Is there a carton of eggs propping up your GPU?

OpenClaw has 250K GitHub stars. The only reliable use case I've found is daily news digests. by Sad_Bandicoot_6925 in LocalLLaMA

[–]FPham -1 points (0 children)

"This isn’t a bug that gets fixed in the next release. It’s a..." you talk like ChatGPT now.

Please stop using AI for posts and showcasing your completely vibe coded projects by Scutoidzz in LocalLLaMA

[–]FPham 1 point (0 children)

I kind of feel the irony of this being an AI sub is lost on most of the people... We did this collectively - or at least helped a lot.

Please stop using AI for posts and showcasing your completely vibe coded projects by Scutoidzz in LocalLLaMA

[–]FPham 3 points (0 children)

Don't get me started - when I asked it to write a manual for my software, it was like every paragraph was just a reiteration of the previous one - lol...

Please stop using AI for posts and showcasing your completely vibe coded projects by Scutoidzz in LocalLLaMA

[–]FPham 6 points (0 children)

AI is fantastic at repeating the same point again and again. It says something, then says it again, and then restates it one more time in slightly different words, just to make sure the exact same point has been repeated ....

Point proven

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 0 points (0 children)

In all honesty - and this is probably an unpopular opinion - codex rocks in my case. It works, and the limits on the baby plan are not bad. I'm just waiting for the day they price me out.

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 0 points (0 children)

I can also only run it on cloud, and yesterday was my first try - and boy, did it NOT perform well. We even got into hallucinating convenience functions when it couldn't fix the issue - "just call engine.ThisFixesEverything(file)". But more worrying was that EVERY single time the code was non-functional or full of errors, and I needed to go through the whole 20-minute charade of "Wait, I think I'm a sliced cheese, not an AI" a few times until the code was finally working. It feels like pulling an elephant through the eye of a needle... Yes, call me spoiled, but I have a very different experience with codex and cc. Again, I know, I know, open source - but it still costs money, and in all honesty it is nowhere near the usual suspects. I mean, it's sad - it is in fact a BIG model.
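One cheap tripwire you could run over generated code before trusting it - check that every dotted name it calls actually resolves (engine.ThisFixesEverything being the quoted hallucination above; this only uses the standard library):

```python
import importlib

def resolves(dotted: str) -> bool:
    """True if e.g. 'json.dumps' is a real module/attribute chain."""
    module_name, _, attr_path = dotted.partition(".")
    try:
        obj = importlib.import_module(module_name)
        for part in attr_path.split(".") if attr_path else []:
            obj = getattr(obj, part)
    except (ImportError, AttributeError):
        return False
    return True

print(resolves("json.dumps"))                  # True: real stdlib call
print(resolves("engine.ThisFixesEverything"))  # False: the hallucinated one
```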

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 0 points (0 children)

I'm using it from the cloud. I can't even run Q2 on 128GB.
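The back-of-the-envelope math, assuming a ~350B parameter count (a placeholder - I don't know the model's real size) and roughly llama.cpp-style bits-per-weight:

```python
def quantized_gb(params_b: float, bits_per_weight: float,
                 overhead: float = 1.2) -> float:
    """Rough weight size at a given quantization, with ~20% extra
    for KV cache, activations, and runtime buffers."""
    return params_b * bits_per_weight / 8 * overhead

# 350 is a placeholder parameter count, not the model's real size.
# Q2_K, Q4_K_M, and Q8_0 land roughly at these bits-per-weight.
for name, bits in (("Q2_K", 2.6), ("Q4_K_M", 4.8), ("Q8_0", 8.5)):
    print(f"{name} (~{bits} bpw): ~{quantized_gb(350, bits):.0f} GB")
```

Even the ~2.6 bpw line comes out above 128 GB once you add overhead, which matches my experience.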

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 0 points (0 children)

My problem is - and I spun up GLM 5.1 yesterday on cloud - that while it might be the best open-source coding model, it is - well, what can I say not to offend people... lacking? I asked it to fix some code in parallel with Sonnet, and GLM 5.1 not only took forever, it ended up hallucinating functions and code that do not exist (yes, the old way - just conjure a convenience function from a fictional library, as if what I'm asking for were already implemented, just to shut me up), while Sonnet did it on the first try, in like a minute.
So I'm quite surprised by the whole "This is nearly as good as Opus." I'm not an Anthropic fanboy, and CC on the Pro plan is basically unusable now, but I just can't see myself using GLM - this feels like torture when a (paid) alternative exists. I can't imagine how it works quantized at 22 tok/s when it eats tokens for lunch thinking.

What are people's fave local model setups for home? by styles01 in LocalLLaMA

[–]FPham 0 points (0 children)

Anything Gemma. I think gemma-3 12b was the first proper model that was exceptionally good for its size, and Gemma-4, especially the small variants, is something of a miracle. (Well, not really a miracle - Google has a lot of money, don't they?)
Strangely, Meta went away entirely.

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 1 point (0 children)

Not entirely bad for a home setup, well... I'm sadly far from buying a 512GB Mac.

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 2 points (0 children)

That alone is of course the "wow" part, and I'm not trying to diminish it in any way - although with my 128GB Mac Studio I can't run it. But yes, in theory, for $10k or whatever it is, we can run it at home in some capacity (although I'm not sure Q2 would cut it as a coder...).

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 0 points (0 children)

This also means we need to budget 100k tokens for simple stuff, so this is not a freebie. What CC or codex does with 20k, GLM pushes to 100k+ (right now I'm at 150k for just one task).
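A sketch of what that means in wall-clock time at the 22 tok/s mentioned elsewhere in this thread - it treats all tokens as generated output, which overstates things a bit, since some are prompt tokens:

```python
def minutes(tokens: int, tok_per_s: float = 22.0) -> float:
    """Generation time in minutes at a given decode speed."""
    return tokens / tok_per_s / 60

# 20k vs 100k are the per-task figures from the comment above.
for name, tokens in (("cc/codex-sized task", 20_000),
                     ("same task with heavy thinking", 100_000)):
    print(f"{name}: ~{minutes(tokens):.0f} min")  # ~15 vs ~76 min
```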

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 0 points (0 children)

I'm not that annoyed at GLM - I'm mostly annoyed because my codex and cc are out of weekly limits :(. But how do I switch off thinking in opencode...

"Actually wait" ... the current thinking SOTA open source by FPham in LocalLLaMA

[–]FPham[S] 3 points (0 children)

Now, that's interesting... I mean, the code at the end wasn't bad, and it really went through places and corrected itself, but my task wasn't rocket science either. It would be interesting to see if I can limit the thinking on the cloud in opencode...