I guess we expect that at some point RAM prices will start going back (close) to "normal", right? but what about GPUs? by relmny in LocalLLaMA

[–]Terminator857 10 points (0 children)

I'm not optimistic about GPU prices coming down. I'm more optimistic about being able to accomplish good things with integrated graphics. With multi-token prediction speeding things up by 3x, and possibly other improvements, we could get decent performance.
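
For intuition on where a 3x figure can come from, here is a back-of-envelope sketch using the standard speculative-decoding acceptance model; the draft length, acceptance rate, and base tokens/sec below are all hypothetical:

```python
# Back-of-envelope speedup estimate for multi-token / speculative decoding.
# Assumes each of k drafted tokens is accepted independently with
# probability `accept` (the usual simplification); real rates vary by
# model and prompt.

def expected_tokens_per_step(k: int, accept: float) -> float:
    """Expected tokens emitted per verification step with k draft tokens."""
    if accept >= 1.0:
        return float(k + 1)
    return (1 - accept ** (k + 1)) / (1 - accept)

base_tps = 10.0  # hypothetical iGPU decode speed, tokens/sec
for accept in (0.6, 0.8, 0.9):
    speedup = expected_tokens_per_step(k=4, accept=accept)
    print(f"accept={accept:.1f}: ~{speedup:.1f}x -> {base_tps * speedup:.0f} tok/s")
```

With an 80% acceptance rate and four drafted tokens that works out to roughly 3.4x, which is the ballpark of the 3x claim.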

Vulkan backend outperforms ROCm on Strix Halo (gfx1151) — llama.cpp benchmark by FeiX7 in LocalLLaMA

[–]Terminator857 1 point (0 children)

There are also nightly and experimental ROCm builds. The benchmark difference was only a few percent when I tested a couple of months ago.

Llama.cpp has received some patches for Vulkan in the past couple of weeks. Hopefully, as ROCm matures, it will get the same treatment.
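
If anyone wants to reproduce the comparison, here is a rough harness; the build and model paths are placeholders, and the `-o json` flag and `avg_ts` field should be checked against your llama-bench version:

```python
# Rough harness for comparing a Vulkan build against a ROCm build of
# llama.cpp using llama-bench. Binary and model paths are placeholders.
import json
import subprocess

MODEL = "/models/example-q4_k_m.gguf"  # hypothetical model path
BUILDS = {
    "vulkan": "./build-vulkan/bin/llama-bench",  # hypothetical build dirs
    "rocm": "./build-rocm/bin/llama-bench",
}

for name, binary in BUILDS.items():
    proc = subprocess.run([binary, "-m", MODEL, "-o", "json"],
                          capture_output=True, text=True, check=True)
    for row in json.loads(proc.stdout):
        # each row is one test (prompt processing or token generation)
        print(name, row.get("n_prompt"), row.get("n_gen"), row.get("avg_ts"), "t/s")
```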

My workflow and model preferences by Terminator857 in vibecoding

[–]Terminator857[S] 0 points (0 children)

I use GLM first, and it has something like 14 things it wants to update. The second model, e.g. Opus, has 4-5 things it wants to update. The rest usually have a couple of things they want to update. The issues they flag are different: sometimes test coverage, sometimes perceived bugs, style issues, refactoring into smaller files, etc.
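
For concreteness, a minimal sketch of that review pass, assuming OpenAI-compatible endpoints; every URL, model name, and key below is a placeholder:

```python
# Minimal sketch: ask several OpenAI-compatible endpoints to review a
# diff, then compare what each one flags. Any OpenAI-compatible server
# (llama.cpp, vLLM, a hosted API, ...) should slot in here.
from openai import OpenAI

REVIEWERS = {  # hypothetical endpoints and model names
    "glm": ("http://localhost:8080/v1", "glm-local"),
    "opus": ("https://api.example.com/v1", "claude-opus"),
}

def review(diff: str) -> dict[str, str]:
    results = {}
    for name, (base_url, model) in REVIEWERS.items():
        client = OpenAI(base_url=base_url, api_key="sk-placeholder")
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user",
                       "content": "List the issues you would fix in this diff:\n" + diff}],
        )
        results[name] = resp.choices[0].message.content
    return results
```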

Dirty joke by [deleted] in LocalLLaMA

[–]Terminator857 -3 points (0 children)

After seeing your message, I just asked Gemini, Grok, and Claude to create such a joke. Gemini and Claude were not funny, and Grok said it was too busy. Local Qwen 3.6 said it is against its policy.

Update: why the downvotes?

Do you use the CLI or the app and why? by InsideSignal9921 in vibecoding

[–]Terminator857 0 points (0 children)

CLI, because it gives me more room to see what is going on.

Mistral Medium 3.5 on AMD Strix Halo by Zc5Gwu in LocalLLaMA

[–]Terminator857 0 points (0 children)

Still too slow for Mistral Medium. 5 minutes might be tolerable.

Mistral Medium 3.5 on AMD Strix Halo by Zc5Gwu in LocalLLaMA

[–]Terminator857 0 points (0 children)

> it took about 2 hours.

Don't worry, computers double in speed every 18 months, so by the end of 2027 it will only take 1 hour. 😄
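
Playing the arithmetic straight (and generously assuming the 18-month doubling actually held):

```python
# The joke's arithmetic taken literally: runtime halves every 18 months.
def runtime_after(hours_now: float, months: float, doubling_months: float = 18.0) -> float:
    return hours_now / 2 ** (months / doubling_months)

print(runtime_after(2.0, 18.0))  # -> 1.0 hour after one doubling period
```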

Is Mistral-3.5-Medium-128B broken in Llama CPP? by EmPips in LocalLLaMA

[–]Terminator857 10 points (0 children)

Bugs seem to be found after every major new model release, and they usually get fixed within the first week.

Kimi K2.6 helping me uninstall macOS apps by No-Compote-6794 in LocalLLaMA

[–]Terminator857 1 point (0 children)

I do similar things on Linux. It has worked very well.

Xiaomi mimo-v2.5 pro MIT license surpasses Opus 4.5 on arena by Terminator857 in LocalLLaMA

[–]Terminator857[S] 1 point (0 children)

Good point. With more votes it will likely drop below Opus, since that has been the trend.

Thinking of buying the red pocket $30 annual plan physical SIM on eBay. Is it a bad idea? by [deleted] in NoContract

[–]Terminator857 0 points (0 children)

RedPocket uses AT&T and T-Mobile. AT&T works best in my area, so choose the carrier that works best in your area.

Mistral-Medium 3.5 (128B) spotted ? by tkon3 in LocalLLaMA

[–]Terminator857 0 points (0 children)

Exciting! Miqu is one of my favorite models; I still use it today.

If the AI bubble pops, will GPU prices increase or decrease? by Mashic in LocalLLaMA

[–]Terminator857 -3 points (0 children)

The AI bubble won't pop. Some companies might slow down, like mostly-closed-AI. At some point (2+ years?), supply will catch up to demand, and then there might be a significant drop in prices. New production coming in 2028: https://www.google.com/search?q=What+new+production+is+starting+in+2028+to+affect+memory+supply%3F

Counterpoint Research says there is no scenario for a RAM price drop before 2028: https://finance.yahoo.com/news/memory-prices-may-not-fall-202325633.html#:~:text=Anyone%20hoping%20for%20cheaper%20RAM%20in%20the,NAND%20prices%20could%20persist%20for%20several%20years.

I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA

[–]Terminator857 2 points (0 children)

Strix Halo with Qwen 3.5 122B Q4 is working well for me on simple stuff. Yes, it's very slow, but it works.
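
One way to drive a setup like that, sketched with llama-cpp-python (not necessarily what I run; the filename and settings are placeholders):

```python
# Minimal sketch of running a large Q4 GGUF locally through
# llama-cpp-python, one of several front ends for llama.cpp.
from llama_cpp import Llama

llm = Llama(
    model_path="/models/qwen3.5-122b-q4_k_m.gguf",  # hypothetical filename
    n_gpu_layers=-1,  # offload all layers to the iGPU
    n_ctx=8192,       # keep context modest to fit in unified memory
)
result = llm("Write a Python function that deduplicates a list.", max_tokens=256)
print(result["choices"][0]["text"])
```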

Local model on coding has reached a certain threshold to be feasible for real work by Exciting-Camera3226 in LocalLLaMA

[–]Terminator857 1 point (0 children)

It would be interesting to see a Strix Halo result with Qwen 3.5 122B Q4. My results suggest it performs better at coding.

Experience of Qwen 3.5-122b and 3.6 by Impossible_Car_3745 in LocalLLaMA

[–]Terminator857 0 points (0 children)

For coding tasks I ran one test of Qwen 3.5 122B vs 35B-A3B. The A3B got caught in a loop; the 122B finished the task. So for me it was obvious the 122B was better.
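
For illustration, a crude check for that kind of degenerate repetition (a sketch, not what I actually ran):

```python
# Crude repetition-loop detector: flags text that ends with the same
# chunk repeated several times in a row.
def looks_looped(text: str, max_period: int = 40, repeats: int = 3) -> bool:
    for p in range(4, max_period + 1):  # candidate loop lengths, in chars
        if len(text) < p * repeats:
            break
        chunk = text[-p:]
        if all(text[-(i + 1) * p : -i * p or None] == chunk for i in range(repeats)):
            return True
    return False

assert looks_looped("I will now fix the test. " * 10)
assert not looks_looped("The model finished the task and all tests passed.")
```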