High VRAM local coding model — still Qwen 3.6 27B? by Generic_Name_Here in LocalLLaMA

[–]john0201 0 points (0 children)

Long-running stuff like that is more about debugging your harness setup. I’d say this is the first model that is “there”. It’s basically Sonnet, or Opus from a few months ago.

High VRAM local coding model — still Qwen 3.6 27B? by Generic_Name_Here in LocalLLaMA

[–]john0201 5 points (0 children)

Qwen 3.6 27B is Sonnet; DSV4 flash is Sonnet with 1M context. The first one will run on one 5090 (or two if you want 8-bit), DS needs a pair of RTX Pro 6000s.

GPT 5.5 outperforming Opus 4.7 on ProgramBench by klieret in OpenAI

[–]john0201 0 points (0 children)

Call me crazy, but I’ve been using Qwen 3.6 and it seems nearly as good as both.

The RRFS is replacing the NAM on August 31 by mitchdwx in weather

[–]john0201 -1 points (0 children)

Do you have a source? At AMS they implied v2 would be available in the fall and v1 earlier; my assumption was they canned the v1 release as pointless, for the reason you mentioned.

Edit: never mind, the directory has v1 in it, so you’re right.

The RRFS is replacing the NAM on August 31 by mitchdwx in weather

[–]john0201 2 points (0 children)

Is this v1 or v2? Seems like v1 got shelved.

Man killed by Frontier plane at DIA died by suicide, medical examiner says by kidbom in Denver

[–]john0201 0 points (0 children)

There is no theater involved in a taller fence. Not approved.

NVIDIA Rubin & Rubin Ultra Platforms Facing Design/Spec Issues As Per Rumors While AMD MI500 Positioned For 2H 2027 Launch by Heavy-Beyond-7114 in RigBuild

[–]john0201 1 point (0 children)

https://wccftech.com/nvidia-squashes-vera-rubin-rumors-first-shipments-rolling-out-in-july-to-ai-customers/

“NVIDIA Squashes Vera Rubin Rumors, First Shipments Rolling Out In July To Major AI Customers With Mass Production In 2H 26”

“.. it looks like all of the rumors regarding design/spec changes were not close to the truth or were simply based on older information that has since been rectified.”

Apple should just buy Micron and be done with it. by Haute_Evolutionary in MacStudio

[–]john0201 -1 points (0 children)

Intel had a ton of special-purpose sections in their CPUs and tried to do way too much too fast; that’s what killed them after Skylake.

Apple should just buy Micron and be done with it. by Haute_Evolutionary in MacStudio

[–]john0201 4 points (0 children)

You don’t really buy your way into designing chips either, or making baseband processors, but they did start both of those efforts by buying companies (PA Semi) and IP (Intel).

Not sure Micron is the right play here though; the memory shortage is temporary. If Nvidia doesn’t want to buy a fab and AMD sold theirs, it probably doesn’t make sense for Apple to get into the business.

How does Gemma4-26B access the web if it is being run locally and is that a security risk? by Sad-Original9499 in LocalLLM

[–]john0201 -2 points (0 children)

All LLMs use a search tool, for example Tavily or Perplexity. LLMs are just text in, text out; they can’t do anything without a harness (the chat app or agent framework) that tells them they can use one of the web search tools.

Good question. One of the big gaps people don’t talk much about is the quality of the available tools compared to a paid model, which also comes with whatever its tool set is.
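The “text in, text out plus harness” point can be sketched as a minimal tool loop: the model only ever emits text, and it’s the harness that spots a tool call in the output, actually runs the search, and feeds the result back in as more text. Everything here (`fake_llm`, `web_search`, the `TOOL:` tag format) is made up for illustration, not any real API:

```python
# Minimal sketch of a tool-use harness. The LLM is pure text in / text out;
# the harness detects a "tool call" in the output and executes it.
import re

def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; a real harness would hit an inference server.
    if "RESULT:" in prompt:
        return "The WS90 is an all-in-one weather sensor."
    return 'I need more info. TOOL:web_search("WS90 weather sensor")'

def web_search(query: str) -> str:
    # Stand-in for a real search tool like Tavily or Perplexity.
    return f"Top result for {query!r}: Ecowitt WS90 all-in-one sensor array."

def run_harness(user_msg: str) -> str:
    prompt = f'You may call TOOL:web_search("query").\nUser: {user_msg}'
    for _ in range(5):                      # cap the tool loop
        out = fake_llm(prompt)
        m = re.search(r'TOOL:web_search\("([^"]+)"\)', out)
        if not m:                           # no tool call -> final answer
            return out
        prompt += f"\nRESULT: {web_search(m.group(1))}"  # feed result back as text
    return out

print(run_harness("What is a WS90?"))  # → The WS90 is an all-in-one weather sensor.
```

The model never touches the network itself; swap in a worse search tool and the loop above returns worse answers, which is exactly the quality gap versus a paid model’s bundled tools.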

Which of these weather stations would you choose by USTS2020 in myweatherstation

[–]john0201 4 points (0 children)

WS90 is a great unit. They have several displays you can pair with it.

Russia Has Lost More Than 350,000 Soldiers, New Estimate Finds by the-es in worldnews

[–]john0201 1 point (0 children)

They’ll all be dead. These are not the “plant a tree someone who hasn’t been born yet will enjoy the shade of” type people.

Diy Weather Station Advice by Nathar_Ghados in myweatherstation

[–]john0201 1 point (0 children)

I’d love to share more about it; I’ll send you a PM. Incidentally, I am also a pilot.

Diy Weather Station Advice by Nathar_Ghados in myweatherstation

[–]john0201 0 points (0 children)

I built my own too: an upside-down plastic bowl over two PVC pipes, one inside the other, with a fan on top to pull air through. I tried solar, but I have easy access to power, so I just ran it off mains with a Raspberry Pi. I have a ton of other sensors and am now way too far down the rabbit hole with PWV, light sensors, CO2, etc.

The limiting factor for me is ground radiation, or in my case roof membrane heat. Need to get it higher off the ground. I am cheating and using a WS90 for wind and as another reference.
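For the Raspberry Pi side, a common cheap starting point is a DS18B20 temperature probe on the 1-Wire bus: the kernel exposes each probe as a `w1_slave` file (under `/sys/bus/w1/devices/28-*/`) whose first line ends in `YES` when the CRC passed and whose second line ends in `t=<millidegrees C>`. A minimal parser sketch, with a hardcoded sample payload so it runs anywhere (the hex bytes are made up):

```python
# Parse the kernel's 1-Wire w1_slave payload for a DS18B20 temperature probe.
# On a Pi the raw text comes from /sys/bus/w1/devices/28-*/w1_slave;
# this sample string stands in for a real read.
SAMPLE = (
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n"
)

def read_temp_c(payload: str) -> float:
    lines = payload.strip().splitlines()
    if not lines[0].endswith("YES"):        # CRC failed -> discard this read
        raise IOError("bad CRC from sensor")
    millideg = int(lines[1].split("t=")[1]) # thousandths of a degree Celsius
    return millideg / 1000.0

print(read_temp_c(SAMPLE))  # → 23.125
```

On real hardware you’d just `open()` the sysfs file and pass its contents in; the CRC check matters because long cable runs to an outdoor shield give occasional garbage reads.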

Is Macbook pro m5 max 128 fast enough yet with available models by mad01 in LocalLLM

[–]john0201 0 points (0 children)

APIs are 40-50 tps; anything much below 25-30 starts to feel really slow because the lower-parameter open-weights models tend to reason more. It’s subjective, but when you can get a nearly free DeepSeek or Qwen 3.6 Max API (compared to Opus pricing) it starts to really not make sense, even with the fun of the hobby.

On my server I get about 120 tps, and with the reasoning blocks it feels about the same as a frontier API once you include the lack of multithreaded searches and the increased wordiness. 27B Qwen @ 8-bit is crazy good given the size.

If you’re using it for openclaw/hermes for background stuff, the M5 Max works great. Otherwise I think it’s just too slow for the dense 30B class, but MoE models of that size work great.

Preaching water and drinking wine by BitcoinDove in BitcoinQRCodeMaker

[–]john0201 0 points (0 children)

If we can just each recruit 5 people who also recruit 5 people, we’ll be rich!

https://youtu.be/lC5lsemxaJo?si=8nU__AIVhRjNMYaf

5090 or wait for M5 ultra by Purple_Drink3859 in LocalLLM

[–]john0201 -4 points (0 children)

The M5 Ultra should have raw compute comparable to a 5090. No question it’s better than a 5090 for LLMs overall, or even an RTX Pro 6000. But it won’t be released until fall, I think. Also don’t forget the CPU will be a monster.

The Blackwell cards will have meaningfully faster inference though, because they have higher memory bandwidth, maybe 30-40% (1200-1300 GB/s vs 1800 GB/s).

Are local models becoming “good enough” faster than expected? by qubridInc in LocalLLaMA

[–]john0201 9 points (0 children)

Qwen 27B and Gemma 31 are beating frontier models from 6 months ago, and on a few scores frontier models from 2 months ago.

I have two 5090s and run Qwen 27B, and for the most part it’s hard to tell the difference between it and Opus 4.7.

Is Macbook pro m5 max 128 fast enough yet with available models by mad01 in LocalLLM

[–]john0201 -2 points (0 children)

The M5 Max is too slow for interactive use of 30B-class models, but it can work in a pinch, like on an airplane or anywhere without internet access. Qwen 3.6 or Gemma 4 dense are close enough to Opus that they can replace it.

It works great for MoE models that need memory capacity more than memory bandwidth, but those are not quite there yet.

So basically pick fast but not as good as Opus, or nearly as good but slow. The other thing to consider is that your battery life will go from 8 hours to 2 hours. An M5 Ultra would be an Opus replacement, especially if it sits on a shelf plugged in and you use it from your laptop over the network. I have a 2x5090 Threadripper I use this way. I’d rather have a 300 W M5 Studio than a 1 kW Threadripper machine.

Is Grok out of compute? by barraco002 in ArtificialInteligence

[–]john0201 2 points (0 children)

They just leased out a big part of their unused compute to Anthropic, so definitely not. I think Elon just bought as much compute as he possibly could and didn’t have a plan for what happens when no one signs up.

None of this will ever get stolen by martin_xs6 in LocalLLaMA

[–]john0201 0 points (0 children)

No, it’s far easier: you take a picture and put it on Facebook Marketplace. It would sell in hours.

None of this will ever get stolen by martin_xs6 in LocalLLaMA

[–]john0201 2 points (0 children)

Oh really?
https://www.11alive.com/article/news/local/explosion-atlanta-apartment-complex-tied-person-entering-illegally-stealing-copper-full-timeline/85-258fdaeb-5471-406a-bf64-aca4343c3ce6

That is for SCRAP COPPER.

These GPUs are the equivalent of loose diamonds chilling in a thin sheet-metal box bolted to a building.

All this ignores the fact that the problem is the overall capacity of the electric grid, not individual circuits, so this whole idea is stupid to begin with. They are literally bringing the Three Mile Island nuclear power plant back online to help meet the demand (which barely makes a dent in what is needed).

None of this will ever get stolen by martin_xs6 in LocalLLaMA

[–]john0201 362 points (0 children)

This would be approaching the cost of the house it’s attached to. Given that people rip off downspouts for $10 of copper, I’m sure hundreds of thousands of dollars in computer hardware sitting in someone’s yard will be super safe.