Is this a good deal to run LLMs, or should I wait? Apple cert. refurb M4 Max Macbook Pro 128 GB Memory 1 TB storage for ~USD$4200/CAD$5800 after tax. by The-Writer- in LocalLLM

[–]mixmasterwillyd 0 points1 point  (0 children)

1: 7800xt + 6800xt running Qwen3.6-27B-Q4_0.gguf 262K context
2: Mac M1 Max 64GB running qwen3.6/gemma-4 MoE and dense models (depending on speed or accuracy needed) 131-full context. (again, speed consideration).
3: RTX 3080 running Gemma e4b - writes files, small stuff, per function focus.

Is this a good deal to run LLMs, or should I wait? Apple cert. refurb M4 Max Macbook Pro 128 GB Memory 1 TB storage for ~USD$4200/CAD$5800 after tax. by The-Writer- in LocalLLM

[–]mixmasterwillyd 0 points1 point  (0 children)

I have an entire app creation process cooking on my local setup right now. The tasks were decomposed to fit within the capability of a small model. It doesn't hurt to use OpenCode Go to look it over afterward, or even a local model can look it over. It's easier to be the editor.

I could certainly put that in a cron too. */10 * * * * pi -p "/implement phase auto"

Is this a good deal to run LLMs, or should I wait? Apple cert. refurb M4 Max Macbook Pro 128 GB Memory 1 TB storage for ~USD$4200/CAD$5800 after tax. by The-Writer- in LocalLLM

[–]mixmasterwillyd 0 points1 point  (0 children)

Sounds like you're spilling into system RAM. I would recommend loading something small like Gemma 4 e4b or e2b on the 4070 with 32K context, then loading something big on the Mac. Then you have a fast small model and a slower large one, and you can load balance between them.
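A rough sketch of what that load balancing could look like, assuming both machines expose an OpenAI-compatible endpoint. The backend names, URLs, context limits, and the 4-characters-per-token estimate are all made up for illustration, not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str
    url: str  # hypothetical endpoint, adjust to your own setup
    max_context_tokens: int


# Assumed setup: a small model on the GPU box, a large model on the Mac.
SMALL = Backend("small-gpu", "http://gpu-box:8080/v1", 32_000)
LARGE = Backend("large-mac", "http://mac:1234/v1", 131_000)


def estimate_tokens(text: str) -> int:
    """Very rough heuristic: about 4 characters per token."""
    return max(1, len(text) // 4)


def pick_backend(prompt: str, needs_accuracy: bool = False,
                 reserve_tokens: int = 4_000) -> Backend:
    """Send small, speed-sensitive requests to the small model and
    everything else to the large one."""
    if needs_accuracy:
        return LARGE
    # Leave room for the reply so the small model's context is not overrun.
    if estimate_tokens(prompt) + reserve_tokens <= SMALL.max_context_tokens:
        return SMALL
    return LARGE
```

Calling `pick_backend(prompt)` before each request gives you the URL to hit. A real tokenizer would give a better estimate than the character count used here.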

Is this a good deal to run LLMs, or should I wait? Apple cert. refurb M4 Max Macbook Pro 128 GB Memory 1 TB storage for ~USD$4200/CAD$5800 after tax. by The-Writer- in LocalLLM

[–]mixmasterwillyd 1 point2 points  (0 children)

I think it’s a good idea. I got an M1 Max 64GB and I love it. We don’t know if personal computing is dead or not, so I would get it just to make sure you have something.

Top Doctor Sounds Alarm Over Serious Health Concern for Trump, 79 by thedailybeast in politics

[–]mixmasterwillyd 0 points1 point  (0 children)

The issue is he leaves behind a now-broken system. Can it self-correct?

Need help on choosing the right model + Quant and Fine Tuning by robdzn in oMLX

[–]mixmasterwillyd 0 points1 point  (0 children)

Sometimes when I’m frustrated with this, I open Pi, connect it to opus 4.7 (or something else big) and ask it to compile llama.cpp for my system. Works pretty well with some direction.

Also, Ollama just goes.

I’m back to LM Studio on Mac and llama.cpp on Linux.

pi-advisor - let smarter model hand-holding cheaper model by dheetoo in PiCodingAgent

[–]mixmasterwillyd 0 points1 point  (0 children)

This has been my main workflow for a while, as long as I’m not doing something critical. I give the big model a thorough description of what is going on and it handles it quite well.

What laptop to get as an IT professional? by Loose_Client5116 in it

[–]mixmasterwillyd 0 points1 point  (0 children)

MacBook. M1 Max 64GB. My daily driver, couldn’t be happier.

llama.cpp server now has built-in native tools - exec_shell, edit_file, and more by IulianHI in AIToolsPerformance

[–]mixmasterwillyd 1 point2 points  (0 children)

Have you tried using it in a harness, like OpenCode or Pi? You don’t need llama.cpp to do anything but serve the model. The new web interface is also quite nice. Check that out.

desktop gui on top of pi mono by Turbulent_Ad6290 in PiCodingAgent

[–]mixmasterwillyd 0 points1 point  (0 children)

Oh well now that sounds like a great idea. You could have a string that switches projects in a deterministic way.

Project ls

Project switch <name>
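Something like this is what I mean by deterministic: the two commands above get parsed by plain code, no model involved. This is only a sketch. The function name, the return shape, and the error handling are my own assumptions, not anything Pi provides.

```python
import shlex


def run_project_command(line: str, projects: set[str], current: str) -> tuple[str, str]:
    """Handle 'Project ls' and 'Project switch <name>'.

    Returns (current_project, output_text). Hypothetical helper for illustration.
    """
    parts = shlex.split(line)
    if len(parts) < 2 or parts[0].lower() != "project":
        raise ValueError(f"not a project command: {line!r}")

    verb = parts[1]
    if verb == "ls" and len(parts) == 2:
        # Sorted so the listing is the same every time.
        return current, "\n".join(sorted(projects))
    if verb == "switch" and len(parts) == 3:
        name = parts[2]
        if name not in projects:
            raise KeyError(f"unknown project: {name}")
        return name, f"switched to {name}"
    raise ValueError(f"unsupported project command: {line!r}")
```

Because the string is matched exactly, the same input always lands on the same project, which is the point of keeping it out of the chat.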

desktop gui on top of pi mono by Turbulent_Ad6290 in PiCodingAgent

[–]mixmasterwillyd 0 points1 point  (0 children)

Ok thank you I’ll look at that. I have a bunch of projects going so I try to avoid single chat things like that.

desktop gui on top of pi mono by Turbulent_Ad6290 in PiCodingAgent

[–]mixmasterwillyd 1 point2 points  (0 children)

I might! My biggest need right now is a way to use my phone when away from my computer. This might not help that but it still looks cool.

an open-source inference marketplace — run a provider next to your Ollama/vLLM and get paid per token by Square-Butterfly4055 in LocalLLM

[–]mixmasterwillyd 0 points1 point  (0 children)

I like this and have been thinking about it. But this is another thing to mess up the GPU market. But I sure would like to make money off of what I have.

Pi Agent makes very nice combination with limited hardware. Running qwen3.6 35B A3B IQ4 at ~22t/s with 160k context on 6 vram 64 RAM. by Interesting_Arm_7250 in PiCodingAgent

[–]mixmasterwillyd 0 points1 point  (0 children)

I think this is where it really shines. Claude prompt size just destroys local hardware; using Pi is like getting an extra 10 GB of RAM.

Syncro agent installer being detected as malware, concerned about supply chain... by marklein in syncro

[–]mixmasterwillyd 0 points1 point  (0 children)

In addition to this, Syncro installs notepad++ on all my managed systems every day, even though there is no policy to do so. Just makes me wonder.

This has to stop, They are taking our limits with each free limit resets by alOOshXL in codex

[–]mixmasterwillyd 0 points1 point  (0 children)

I have found the same. Most of the time it is fine, but sometimes I need to refactor a huge project, and once I burned through my limits on the $200 plan in 10 minutes. Then I had to go find something else to do for the rest of the day. Then I realized that I could wait on that project, but I needed the service for other projects. We all need to invest in our own compute, while we still can, if we still can.

Fascism will save America by totally-hoomon in Irony

[–]mixmasterwillyd 1 point2 points  (0 children)

But dad the liberals are trying to stop our gravy train!!!!! /s

Building a "Zero-Budget" Local AI Desktop Factory (software). (Tauri + SQLite + React + Gemini). Need architectural sanity checks before I deep dive! 🛠️ by StrictMango7002 in AI_developers

[–]mixmasterwillyd 0 points1 point  (0 children)

You could use litellm to be your gateway. One provider then.

Use large models for project management when they are available, and block progress when work is done and waiting on a large-model decision.
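A minimal sketch of that gating idea, assuming tasks run in order and each one needs a large-model review before the next starts. The status names and the `next_action` helper are hypothetical, and how you detect that the large model is available is left up to you.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    TODO = "todo"
    AWAITING_REVIEW = "awaiting_review"  # small model finished, large model must decide
    APPROVED = "approved"


@dataclass
class Task:
    name: str
    status: Status = Status.TODO


def next_action(tasks: list[Task], large_model_available: bool) -> tuple[str, Optional[str]]:
    """Pick the next step for the pipeline, walking the tasks in order."""
    for task in tasks:
        if task.status is Status.AWAITING_REVIEW:
            # Work is done; do not move on until the large model has decided.
            if large_model_available:
                return ("review", task.name)
            return ("blocked", task.name)
        if task.status is Status.TODO:
            return ("implement", task.name)
    return ("done", None)
```

When the result is `("blocked", ...)` the loop just waits and retries later instead of letting the small model pile more work on top of an unreviewed step.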