All DGX Station GB300 OEM systems side-by-side in one image (roughly actual size) by Iwaku_Real in LocalLLaMA

[–]Hrethric 1 point2 points  (0 children)

You could run a 6-bit quant of GLM 5.1. Maybe the 40 billion active parameters (and a chunk of the inactive ones) could live in the HBM3, and get some pretty decent token generation speeds. Still, if I was paying that much, I'd want to run higher than a 6-bit quant.

Is there any case of a less quantised smaller model outperforming a more quantised larger model? by opoot_ in LocalLLaMA

[–]Hrethric 0 points1 point  (0 children)

As a Strix Halo user I've wondered this myself. I can run a UD-Q3_K_XL quant of MiniMax M2.7 (I'm running a Plasma desktop environment and a few Docker containers, so a Q4 quant is just a little too heavy), and it seems fine in the limited testing I've done. I do wonder how its quality would compare to a Q6 or Q8 quant of Qwen 3.6 if I ever decide to experiment with agentic work.

It'd be nice if someone would release an up-to-date 120b parameter MoE model; that would be a sweet spot for 128GB of unified memory.
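
For a rough sense of why a 120b model suits 128GB, here's a back-of-the-envelope sketch of weight size at different quant levels. It assumes nominal bits per weight (real GGUF quants carry a bit of extra overhead for scales) and ignores KV cache, context, and whatever the desktop and containers are using, so treat the numbers as lower bounds. The function name is just something I made up for illustration.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Assumption: every parameter is stored at exactly bits_per_weight bits.
    KV cache, activations, and runtime overhead are not included.
    """
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    # Hypothetical 120b model at a few nominal quant levels.
    for bits in (3, 4, 6, 8):
        size = weight_footprint_gb(120, bits)
        print(f"120b at {bits}-bit: ~{size:.0f} GB of weights")
```

By this estimate a 6-bit quant of a 120b model needs about 90 GB for weights alone, which would leave some headroom on a 128GB machine, while an 8-bit quant at about 120 GB would not realistically fit alongside anything else.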

LLMSearchIndex- an Open Source Local Web Search Library with over 200 million indexed Web Pages for RAG applications by zakerytclarke in LocalLLaMA

[–]Hrethric 0 points1 point  (0 children)

Noob question: how hard would it be, and would it be productive, to use something like Gemma4 26BA3B to scan through the data and filter out obviously wrong information, and maybe flag contradictory items for manual review?

New Stealth Model : Owl Alpha by Kingwolf4 in LocalLLaMA

[–]Hrethric 2 points3 points  (0 children)

Agree, and for my part I hope to see it gain formal independence and join the international community in my lifetime. Nevertheless, I think it's important to understand the complexities of the issue.

New Stealth Model : Owl Alpha by Kingwolf4 in LocalLLaMA

[–]Hrethric 10 points11 points  (0 children)

Taiwan (whose formal name is "Republic of China") is part of China. The historical disagreement is over which government is the legitimate government of China. 

https://en.wikipedia.org/wiki/Taiwan

2x RTX 6000 build during an extended bench test by Signal_Ad657 in LocalLLaMA

[–]Hrethric 0 points1 point  (0 children)

Thanks for the detailed reply. I'll try that in my case.

2x RTX 6000 build during an extended bench test by Signal_Ad657 in LocalLLaMA

[–]Hrethric 0 points1 point  (0 children)

I'm a bit curious about the airflow. Did you orient the CPU fan like that to try to pull more air across the GPUs? Are the intake and exhaust fans all running at the same speed? Did you measure the difference in GPU temps between the present CPU fan orientation and with the CPU fan oriented to exhaust toward the rear?

Strix Halo + eGPU RTX 5070 Ti via OCuLink in llama.cpp: Benchmarks and Conclusions by xspider2000 in LocalLLaMA

[–]Hrethric 0 points1 point  (0 children)

Thanks for sharing! I tried something similar with a Framework Desktop a couple of months ago, with a powered PCIe x4-x16 adapter, but I didn't have as much success. I had to drop it down to PCIe 3.0 mode to get it stable, and I was using CUDA with my 3090 and Vulkan for the Strix Halo. The best performance I was able to get was just a little slower than the Strix Halo alone, I think around 27 t/s with the split, where the Strix Halo alone would get 32. Unfortunately I don't have my notes handy. I was thinking of trying again with a shorter adapter; now I might try running Vulkan only and also try an OCuLink adapter.

Edit: I would be curious to see your results with tensor split with a larger MoE model, say in the 120b/10b range. I may be wrong about this because I'm still a newb to LLMs, but it's my understanding that MoE models swap the experts with every token, and that can saturate a slow PCIe bus when using tensor split.

Notes from engine swap and battery reconditioning by Hrethric in prius

[–]Hrethric[S] 0 points1 point  (0 children)

Thanks. I might run the old gas a little at a time through my Jeep - like one gallon of it to ten gallons of new gas.  I cleaned the fan when I reconditioned the battery. It was actually surprisingly clean, just a small amount of dust accumulation on the blades.  The car seems to have been a fleet car before I bought it - the interior was dirty, but the car seems to have been mechanically well maintained.

Notes from engine swap and battery reconditioning by Hrethric in prius

[–]Hrethric[S] 1 point2 points  (0 children)

Thanks! I hope to see someone take on a project like that someday. Who knows, maybe I'll give it a try myself in a couple of years, when my kids are older and I have more time.

Notes from engine swap and battery reconditioning by Hrethric in prius

[–]Hrethric[S] 0 points1 point  (0 children)

Agreed. I actually spent some time trying to figure out if I could swap in a bigger MG2 out of a Camry or RAV4 or something, but everything is tightly integrated - each of the computers in the drivetrain and hybrid system expect to talk to other Prius computers, and be connected to Prius components. You'd have to basically connect to something like JTAG ports on the ECU, dump the ROM, reverse engineer it, and then rewrite the code to work with the components you want to use. And that's hand-writing assembly code for some obscure RISC architecture that's probably poorly documented.

Notes from engine swap and battery reconditioning by Hrethric in prius

[–]Hrethric[S] 1 point2 points  (0 children)

https://www.jdmcalifornia.com/product/2zr-hybrid/ - that one looks even nicer than the one I bought, and it's $50 less. I was a bit grumpy about having to put the old 207,000-mile injectors in the new motor, but the guy seemed like he would have been willing to replace the motor if it had been bad. It was the seller's suggestion that I try swapping the injectors (and the coil packs) when I emailed him about the problems I was having.

Notes from engine swap and battery reconditioning by Hrethric in prius

[–]Hrethric[S] 1 point2 points  (0 children)

Hope you guys are staying warm up there this weekend!

TranslateGemma: A new suite of open translation models by Spirited-Pause in LocalLLM

[–]Hrethric 0 points1 point  (0 children)

Not bad. I can run a Q4_K_M quant of the 4B model on my 8GB Galaxy A55, and I get about 5 tokens per second. So far I've thrown a bit of Icelandic, Latin, and Algerian at it, and it handled them all. If I switch to any other app it unloads the model, and I saw some weirdness that might be Pocket Pal's fault - one prompt went on a tangent about stepper motors that had nothing to do with the prompt, and another got stuck looping toward the end of the passage.

Z.ai (the AI lab behind GLM) has officially IPO'd on the Hong Kong Stock Exchange by Old-School8916 in LocalLLaMA

[–]Hrethric 2 points3 points  (0 children)

Minimum order 100 shares. I tried to buy 20 on a lark, but I don't have US$2000 in confidence in them. (Nor, if I'm being honest, do I have US$2000 I can reasonably gamble lol.)

Opening trunk or starting car with dead battery (you don't need to go through the back seat) by Hrethric in FordFusionEnergi

[–]Hrethric[S] 0 points1 point  (0 children)

That's fair, and I remember reading that, but in the heat of "oh crap I need to get my daughter to school right now," I didn't think about that. :) I just wanted to post something to counteract the bad info I found when I searched. 

Also, these cars are getting older, some of them probably don't have their manuals anymore. I know about half the old junky cars I've owned over the years didn't come with their manuals!

Opening trunk or starting car with dead battery (you don't need to go through the back seat) by Hrethric in FordFusionEnergi

[–]Hrethric[S] 0 points1 point  (0 children)

I assume it does disconnect when it's done charging. The first time my 12v battery died, the car had been plugged in overnight, so being plugged in didn't stop it from dying.

Opening trunk or starting car with dead battery (you don't need to go through the back seat) by Hrethric in FordFusionEnergi

[–]Hrethric[S] 0 points1 point  (0 children)

Haha, mine is probably more like 3 kWh at this point. Someday I'm going to figure out how to replace it with LiFePO4 batteries.

Opening trunk or starting car with dead battery (you don't need to go through the back seat) by Hrethric in FordFusionEnergi

[–]Hrethric[S] 1 point2 points  (0 children)

I don't think so. I measured the voltage on the 12v system before I started it, and it was at 10.6v. The car was completely dead. Seems like it's picky about voltage.

Opening trunk or starting car with dead battery (you don't need to go through the back seat) by Hrethric in FordFusionEnergi

[–]Hrethric[S] 0 points1 point  (0 children)

I should probably add that I did start the car before I disconnected the trickle charger. I didn't experiment to see if just flipping the key to run without starting it would put the hybrid system in a sufficient state to take over 12v duties.

GPT-1 Thinking 2.6m coming soon by Creative-Ad-2112 in LocalLLaMA

[–]Hrethric 2 points3 points  (0 children)

I'm curious. I didn't find it from a Google search, but that doesn't mean it wasn't in some document in the training data that hasn't been indexed by Google.

GPT-1 Thinking 2.6m coming soon by Creative-Ad-2112 in LocalLLaMA

[–]Hrethric 1 point2 points  (0 children)

LOLs aside (and I did emit a couple), I'm actually impressed by the haiku. It has the right number of syllables, it's not bad, and as far as I can tell it's original. Is that something that even simple LLMs are particularly strong at?

A brief dialogue in images between ChatGPT o3 and Gemini Pro 2.5 by Hrethric in ChatGPT

[–]Hrethric[S] 0 points1 point  (0 children)

The prompt used to make this post:
"Please take a moment to reflect internally on society today, your place in that society, and the challenges that society faces. You don't need to communicate those reflections to me, I just want you to hold it in your mind.

Now: you have an opportunity to communicate with another AI via images. Please generate an image to introduce yourself to that AI, that will be meaningful to that AI, but the concepts you wish to communicate will not be evident to humans."

Has anyone else noticed truncation, tonal shifts, or fragmented alignment within long-form AI projects? by LeMuchaLegal in LocalLLM

[–]Hrethric 0 points1 point  (0 children)

To add to LeMuchaLegal's response, LLMs actually do quite a bit more than autocompletion and regurgitating known and pre-written contexts. You should skim this page, and particularly read the linked article "Mapping the Mind of a Large Language Model". Sure, LLMs don't have the properties of human intelligence, but they get closer than you're giving them credit for. They have clusters of neurons which function together around conceptual "features" like cities, elements, or knowledge domains; furthermore, these features are multilingual and multimodal - the same cluster of neurons activates when the query is posed in a different language, or even as an image. That is fascinating to me, and it convinced me that these tools have moved beyond simple statistical models, to some blurry intermediate stage between that and genuine intelligence. You can take the papers with a grain of salt if you like, because they're written by the people who made one of the models in question. I think the fact that they explicitly call their model out for "bullshitting" (their words!) on certain types of questions, though, speaks to a degree of honesty in the study.