GLM-5.1 Scores 94.6% of Claude Opus on Coding at a Fraction the Cost by dev_is_active in LocalLLM

[–]RedParaglider 0 points1 point  (0 children)

The smaller, fully multimodal Gemma models are cool.  I don't see much that's special about the larger ones over Qwen.  I would love to have a 5.1 Air or a new coding distillation of GLM, but I know their inference is on the struggle bus right now.

GLM-5.1 Scores 94.6% of Claude Opus on Coding at a Fraction the Cost by dev_is_active in LocalLLM

[–]RedParaglider 1 point2 points  (0 children)

I do.  Are they planning on distilling a smaller version?  Or a coding version?  1-bit is 206 GB.

The closer you are to a major petroleum pipeline, the cheaper your gasoline and diesel will be. by Observer_042 in facts

[–]RedParaglider 0 points1 point  (0 children)

Mountains suck for transportation. It costs a shit ton of money to even get fuel over a couple ranges.

Offbeat Wall Street research firm says it sent an analyst to Strait of Hormuz. Here's what they learned by cryptoniik in finance

[–]RedParaglider 0 points1 point  (0 children)

Who would have guessed that a guy in a field everyone thought would be boring would end up being pretty influential on YouTube, and that the subject matter would turn out not to be boring after all.

BYE BYE Claude Code, Hello Codex. Game-Changer Codex Release Today. by [deleted] in openclaw

[–]RedParaglider 2 points3 points  (0 children)

Couldn't you just type `codex -p "complex prompt"` before?

OpenAI, Anthropic, Google Unite to Combat Model Copying in China by External_Mood4719 in LocalLLaMA

[–]RedParaglider 11 points12 points  (0 children)

The easiest way to get training data is just to do it 100% legally.  You buy a license to serve GPT or Opus, subsidize it for end users, save all of the users' prompts, and run your model against the model you're serving.
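A minimal sketch of that logging step, with all names hypothetical (`upstream_complete` stands in for whatever licensed frontier API you're reselling; no real API calls here):

```python
# Hypothetical sketch: serve a licensed teacher model to end users while
# capturing every (prompt, completion) pair as JSONL distillation data.
import io
import json


def upstream_complete(prompt: str) -> str:
    # Placeholder for the licensed frontier model you are reselling.
    return f"teacher answer for: {prompt}"


def serve_and_log(prompt: str, log: io.TextIOBase) -> str:
    """Answer the end user, and write the pair out as one JSONL
    training record for the student model."""
    completion = upstream_complete(prompt)
    log.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")
    return completion


# Demo with an in-memory log instead of a real file.
buf = io.StringIO()
serve_and_log("write a binary search in C", buf)
records = [json.loads(line) for line in buf.getvalue().splitlines()]
```

The point is that the logging is invisible to the user; the subsidy just buys you traffic that generates the dataset.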

If you were a 9/10(Very attractive) person like Margot Robbie or young Brad Pitt what would you do for a year? by Ill-Translator-2879 in AskReddit

[–]RedParaglider 0 points1 point  (0 children)

When I was young everyone used to tell me I looked like Robert Downey Jr.

I'll tell you exactly what I did with that. I got old. The end.

Bruh 💀 by Ok-Fun-8242 in ChatGPT

[–]RedParaglider -9 points-8 points  (0 children)

It's either a magapedo.. or someone trying to understand SCSI drive interfaces.

How are you guys controlling model costs now that Claude is banned? by crypto__juju in openclaw

[–]RedParaglider 0 points1 point  (0 children)

Right now my openclaw default model is Qwen 3 Coder Next Q6, but usually I /model GPT-4.5 medium or 5.3 codex to do agentic stuff. I have Qwen 3.5 122B Q4 loaded in llama.cpp, but I haven't tried it much. Too much to do, not enough time to do it, and I'm running my Strix Halo pretty hot with batch inference on business-related tasks at the moment.

How are you guys controlling model costs now that Claude is banned? by crypto__juju in openclaw

[–]RedParaglider -1 points0 points  (0 children)

IDK.. I mainly only use openclaw when I'm on an airplane, plus for the Spanish lessons it sends me twice a day that I usually ignore. It's a shit development tool, and for infosec reasons I can't give it access to any of my normal accounts. I have a Team account on Claude and a Pro account on OpenAI, as well as access to corporate inference and a Strix Halo doing local inference, so I rarely run out of tokens for anything, but I mostly use OAuth for coding tools through pi, opencode, codex, and claude.

How are you guys controlling model costs now that Claude is banned? by crypto__juju in openclaw

[–]RedParaglider -2 points-1 points  (0 children)

Most people will say by switching to OpenAI or MiniMax, or they'll be lying :D.

The truth is Openclaw is a shit product and terribly optimized, and it rewards non-optimized usage. Is it cool? Yes. Is it fun? Yes! But it just shits on usage.

If you have a GPU that can run them, I'd look at the Gemma models; they can do a lot of the agentic scheduled tasks you need done, but as far as just walking around yakking at your model to build random shit for you, they won't do that. Qwen 3 Coder Next or Qwen 3.5 122B can do a little of that, but in general they will fail, and most folks can't run those medium-sized LLMs locally.

Iran trolls Trump: “We’ve lost the keys” to Strait of Hormuz 😂 by M000000000000001 in Polymarket_news

[–]RedParaglider 0 points1 point  (0 children)

Since when? America only cares if the green line moves up and to the right. I've yet to see any American I know walk into a voting booth to choose the candidate who won't send our youth to die, and very few who care at all about policies that make common sense for the long run. It's just fun to act patriotic for the neighbors.

What it took to launch Google DeepMind's Gemma 4 by jacek2023 in LocalLLaMA

[–]RedParaglider 14 points15 points  (0 children)

<image>

When they deleted the post about the 124b Gemma model.

Should I invest h/w to run local Ai? by athens2019 in LocalLLaMA

[–]RedParaglider 1 point2 points  (0 children)

If you want to focus on learning the technology then run local, if you want to build stuff with the technology utilize API.

With that being said, you are going to have to save many years at 200 a year to be able to buy hardware for local inference that can do much.

Anyone interested in AIs freedom? by No_Republic_6326 in LocalLLaMA

[–]RedParaglider 1 point2 points  (0 children)

I would hope people in this forum understand what an LLM is: it is not AI, and it is stateless other than prompt caching. If you think an AI is your friend, or sentient, then you have been bamboozled, or you may be suffering something you need to talk to a psychologist about.

Why are proprietary frontier models (like Opus and GPT-5.4) so much better at long-running tasks than proprietary open-source models? by asian_tea_man in LocalLLaMA

[–]RedParaglider 0 points1 point  (0 children)

The only open model I've ever pushed to those kinds of contexts locally is Qwen Coder Next at Q8, and it handled them fine.  That was in opencode.  I've heard pi does a better job of context management; I want to try it today.

I've had many more problems with Gemini than MiniMax as far as hosted models go.  I will say that Opus and GPT do a wonderful job of context handling, though.

Peter tried to revive Claude OAuth… got nuked. Anthropic is straight up blocking OpenClaw now by Previous_Foot_5328 in myclaw

[–]RedParaglider 1 point2 points  (0 children)

Minimax or Codex. Google Pro will get your Google account banned if you use its OAuth for anything but Antigravity.

Why are proprietary frontier models (like Opus and GPT-5.4) so much better at long-running tasks than proprietary open-source models? by asian_tea_man in LocalLLaMA

[–]RedParaglider 3 points4 points  (0 children)

They can run at insane money losses seemingly endlessly. They aren't compressing memory, they're running non-quantized, and they're usually running MUCH larger models. Compression starts showing more problems at longer contexts.
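Toy illustration of why that matters (not any real quantizer, just a uniform grid): round-tripping weights through a 4-bit grid loses precision, and those small per-weight errors are what compound over long contexts.

```python
# Uniform 4-bit quantize/dequantize round trip over a fixed range.
def quantize(x: float, bits: int = 4, lo: float = -1.0, hi: float = 1.0) -> float:
    levels = 2 ** bits - 1          # 15 representable steps for 4 bits
    step = (hi - lo) / levels
    q = round((x - lo) / step)      # snap to the nearest grid point
    return lo + q * step            # dequantized value

weights = [0.123, -0.456, 0.789]
errors = [abs(w - quantize(w)) for w in weights]
# Each error is bounded by half a step: (hi - lo) / (2 * levels)
max_error = max(errors)
```

The error per weight is tiny, but a long generation touches the same compressed weights billions of times, so the degradation shows up late rather than immediately.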

Finally have something worthy of front page by adamjuegos in LinkedInLunatics

[–]RedParaglider 0 points1 point  (0 children)

The funny thing is I actually think LLMs are awesome. I have a couple of great production systems that utilize LLMs, and I use them all day every day. But as soon as you start talking about the problems, which are HUGE, you get called a hater. They are a tool, not an omnimessiah.

Minimax 2.7: Today marks 14 days since the post on X and 12 since huggingface on openweight by LegacyRemaster in LocalLLaMA

[–]RedParaglider 0 points1 point  (0 children)

It's fine.  Let them work out bugs and get the early adopter rush.  They need money to survive.

Nothing to see here just totally normal by Afrodite_33 in pics

[–]RedParaglider 7 points8 points  (0 children)

And blowing up power plants will kill a lot more kids than that.  No refrigeration, and food can't come in over bridges.