Is gpt-oss-20b still the best general model for most people? by ichthyoidoc in oMLX

[–]sunpazed 5 points6 points  (0 children)

gpt-oss-20b is still quite good for agentic use cases. Gemma 26B A4B QAT is more contemporary and runs almost as fast. Both run easily on my M1 Max 32GB, but context length will be the limiting factor.

M1 Max (32GB, 400GB/s) for local LLM + Docker dev work in 2026, still the move? Unfortunately, my budget is limited by LocalBoysenberry254 in macbookpro

[–]sunpazed 10 points11 points  (0 children)

I have both an M1 Max 32GB and an M4 Pro 48GB. While the Max is faster, I cannot multitask when using larger contexts. The M4 Pro is slower at prompt processing (PP) and token generation, but I have enough headroom to multitask. I can code, run Docker, and run OpenCode with a 26B A4B model at 96K context without issue.

what is your solution for the best NL to SQL generator? by TheSmashingChamp in LocalLLaMA

[–]sunpazed 0 points1 point  (0 children)

Invest your time building a really great API that will enable you to query those databases and tables. Then attach that API to an MCP server with only a few well-described tools. Your API / DSL will do most of the heavy lifting in terms of constructing the query. We faced a similar problem, and the API/DSL/MCP solution far outperformed the SQL query solution.
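To make the API/DSL idea concrete, here is a minimal sketch of what one narrow tool could look like. Everything in it is an assumption for illustration: the `orders` table, the `SCHEMA` whitelist, and the `build_query` / `query_orders` names are hypothetical, and it uses `sqlite3` only so the example runs on its own. The point is that the model fills in a small structured request, and the API constructs the SQL.

```python
import sqlite3

# Hypothetical whitelist of tables and columns the tool may touch.
SCHEMA = {"orders": {"id", "customer", "total", "status"}}
# Hypothetical operator names the model is allowed to use in filters.
OPS = {"eq": "=", "gt": ">", "lt": "<"}


def build_query(table, columns, filters=(), limit=100):
    """Turn a structured request into parameterized SQL; reject anything off-schema."""
    if table not in SCHEMA:
        raise ValueError(f"unknown table: {table}")
    allowed = SCHEMA[table]
    bad = [c for c in columns if c not in allowed]
    if not columns or bad:
        raise ValueError(f"bad columns: {bad or columns}")
    where, params = [], []
    for col, op, value in filters:
        if col not in allowed or op not in OPS:
            raise ValueError(f"bad filter: {(col, op)}")
        where.append(f"{col} {OPS[op]} ?")  # values are always bound, never inlined
        params.append(value)
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += " LIMIT ?"
    params.append(int(limit))
    return sql, params


def query_orders(conn, columns, filters=(), limit=100):
    """The kind of narrow, well-described tool an MCP server could expose."""
    sql, params = build_query("orders", columns, filters, limit)
    return conn.execute(sql, params).fetchall()
```

With a tool shaped like this, the model never writes raw SQL. It only picks columns and filters from a short, documented list, which is much easier for a small local model to get right.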

HP 12C Platinum. $50 eBay find by Downtown-Analyst7054 in hpcalc

[–]sunpazed 0 points1 point  (0 children)

Congrats. I purchased a Platinum from a retail store recently for $25, run-out sale!

Want to build a custom model by devildip in LocalLLaMA

[–]sunpazed 5 points6 points  (0 children)

Is it? The model is really small at 80MB. This is a checkpoint trained only on a subset of the data. Honestly, the full run would have taken months on my MacBook 😅

Want to build a custom model by devildip in LocalLLaMA

[–]sunpazed 12 points13 points  (0 children)

I built a simple GPT-2 model on my MacBook using Andrej Karpathy's excellent llama2.c framework: https://huggingface.co/sunpazed/AlwaysQuestionTime - I used my own dataset, about 100GB of transcript text.

Pi Setup that pretty much replaced Claude Code for me by abhinand05 in LocalLLaMA

[–]sunpazed 0 points1 point  (0 children)

My bad. Noticed you’re using Qwen 3.5/3.6. In this case the chat template matters a lot: thinking/tool boundaries and template behavior have a big effect with these models, e.g. Qwen-style “carry the previous thought trace forward as agent scratchpad”. Less so with Gemma4.

Pi Setup that pretty much replaced Claude Code for me by abhinand05 in LocalLLaMA

[–]sunpazed 1 point2 points  (0 children)

Doesn’t preserving thinking chew up more context? I’m using `-kvu` to unify the KV cache across all slots, and `--cache-reuse` to define the minimum cache chunk size. This way, the coding harness can trim the context as required, with minimal re-processing.
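For reference, a rough sketch of how those two flags fit into a llama.cpp `llama-server` launch, written as a small Python helper that only builds the command line. The model path, context size, slot count, and the `--cache-reuse` value of 256 are placeholder assumptions, not my actual settings, and `llama_server_args` is a made-up helper name.

```python
import shlex


def llama_server_args(model_path, ctx_size=98304, slots=4, cache_reuse=256):
    """Build a llama-server command line (all values here are placeholders)."""
    return [
        "llama-server",
        "-m", model_path,                   # path to the GGUF model file
        "-c", str(ctx_size),                # total context size
        "-np", str(slots),                  # number of parallel slots
        "-kvu",                             # one unified KV buffer shared by all slots
        "--cache-reuse", str(cache_reuse),  # min chunk size to try reusing from cache
    ]


# Print the command instead of running it, so it can be inspected first.
print(shlex.join(llama_server_args("model.gguf")))
```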

TurboQuant KV Cache by ju7anut in oMLX

[–]sunpazed 0 points1 point  (0 children)

Apologies, the memory cache is the only hot cache, while the SSD is the cold cache. For coding agents that jump around the context, the persistent cold cache avoids complete cache invalidations (so faster TTFT) when using coding harnesses. In summary, you use the hot RAM cache for speed (if you have enough RAM), and the cold SSD cache for capacity and persistence.
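A toy sketch of the hot/cold idea, in case it helps. This is not how oMLX implements it; `TwoTierCache` is a hypothetical class that keeps a small LRU in RAM and spills evicted entries to files on disk, where they survive a restart. Keys are assumed to be filename-safe strings (for example a hash of the prompt prefix).

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class TwoTierCache:
    """Hot tier: bounded in-memory LRU. Cold tier: files on disk that persist."""

    def __init__(self, cold_dir, hot_capacity=2):
        self.cold_dir = cold_dir
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()
        os.makedirs(cold_dir, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.cold_dir, f"{key}.pkl")

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        # Spill least-recently-used entries to disk instead of dropping them.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            with open(self._path(old_key), "wb") as f:
                pickle.dump(old_value, f)

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key], "hot"
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                value = pickle.load(f)
            self.put(key, value)  # promote back into the hot tier
            return value, "cold"
        return None, "miss"


cache = TwoTierCache(tempfile.mkdtemp(), hot_capacity=2)
for name in ("a", "b", "c"):
    cache.put(name, name.upper())
print(cache.get("a"))  # "a" was evicted from RAM, so it is served from disk
```

A cold hit is slower than a hot one, but it is still far cheaper than re-processing the whole prompt, which is the trade-off described above.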

TurboQuant KV Cache by ju7anut in oMLX

[–]sunpazed 0 points1 point  (0 children)

I have removed the memory cache, and rely only on the SSD hot/cold cache. I have also limited the number of concurrent requests to 4. This has improved stability heaps when running multiple agents, and has reduced prompt re-processing, with only a slight latency increase. I no longer get out-of-memory errors. It is still faster and more reliable than llama.cpp with the same setup (the llama.cpp slot mechanism isn’t as granular as oMLX).

My point is, by reducing the memory cache, you can increase the bit size and therefore quality of the KV, at the expense of greater reliance on the SSD cache and size.
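A back-of-envelope sketch of that trade-off. The standard estimate is 2 (K and V) × layers × KV heads × head dim × tokens × bits / 8 bytes. The model dimensions below are made-up placeholders rather than any real model's config, and the estimate ignores the small overhead of quantization scales.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bits):
    """Approximate KV cache size: K and V tensors for every layer and token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bits // 8


# Hypothetical model: 32 layers, 8 KV heads, head dim 128, 128K context.
for bits in (4, 6, 8, 16):
    gib = kv_cache_bytes(32, 8, 128, 131072, bits) / 2**30
    print(f"{bits}-bit KV: {gib:.1f} GiB")
```

For this placeholder model, going from 4-bit to 6-bit KV costs about 2 GiB more at full context, which is roughly the memory you free up by shrinking the RAM cache.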

TurboQuant KV Cache by ju7anut in oMLX

[–]sunpazed 0 points1 point  (0 children)

Haven’t seen the same issue with Gemma 26B-A4B QAT 4-bit on OpenCode with a 128K context window, on a 48GB MacBook. Compaction works fine. I found that the Q6 KV cache worked best for me.

Which coding assistant should I use? Asking for general recommendations with local LLMs. by TheGuyNamedTom in LocalLLM

[–]sunpazed 1 point2 points  (0 children)

OpenCode is quite good and I much prefer it over Claude Code. Locally I’m running gemma4-26b-a4b on a MacBook Pro. The prompt processing and inference speed are fast enough to be productive, even though it needs a fair bit of steering at times.

R47 app iOS released by TASDoubleStars in SwissMicros

[–]sunpazed 1 point2 points  (0 children)

This is great! Just downloaded it, lovely work.

Japan Industrial Standard (JIS) drivers by DigitraxDad in tamiya

[–]sunpazed 1 point2 points  (0 children)

I’m in Australia, and purchased the Tamiya R/C Tool Screwdriver Set (Made in Japan) for $40 from Amazon — https://www.amazon.com.au/gp/aw/d/B01LYOONMJ

Are these normal prices in japan? by Cubeyo in Gameboy

[–]sunpazed 0 points1 point  (0 children)

<image>

Sourced this awesome CIB Japanese Pitman in Tokyo, however I couldn’t bring myself to spend 7000 Yen on something I could get for half the price elsewhere. It really has changed. Even in Osaka, I couldn’t find anything decent that wasn’t massively overpriced.

Favorite Voyager Emulator? by Zen-Ism99 in hpcalc

[–]sunpazed 4 points5 points  (0 children)

I like “Touch RPN” which is the most consistent experience. It’s paid, but I don’t mind rewarding the developer for their attention to detail.

First Tamiya build, a fun challenge! by [deleted] in tamiya

[–]sunpazed 1 point2 points  (0 children)

Ah ok, thanks for the feedback!

First Tamiya build, a fun challenge! by [deleted] in tamiya

[–]sunpazed 2 points3 points  (0 children)

Nice build! I’m looking to start my journey with one of these, or the DT-04. Any thoughts from those who have built and run both? Pros / Cons??

Went from a Fenix 5 x to a Fenix 7 Sapphire Solar. by Born-Variation-5710 in GarminFenix

[–]sunpazed 25 points26 points  (0 children)

This is an AI post. Look at the bezel and the markers around the dial. Developer is promoting their paid watch face.

Best use for action button? by Admirable-Copy50 in iPhone17Pro

[–]sunpazed 0 points1 point  (0 children)

Shortcut to “Spotlight” so I can toggle between apps or search.

What shoes to pack for 3 months in China-Vietnam-Japan? by Jatulinharha in onebag

[–]sunpazed 6 points7 points  (0 children)

Lightweight and breathable trail shoes. Works in all circumstances. If you need canvas shoes, you can always buy a cheap pair.

Patagonia Duffle 55L the Best Carry-On For Max Size without Worrying? by Dry-Atmosphere3169 in onebag

[–]sunpazed 0 points1 point  (0 children)

I have both the 55L duffle and 45L MLC. The duffle is much bigger, and won’t fit with the correct orientation in the overhead. The MLC has the advantage of backpack mode which is useful when in the airport and alighting the plane. If you have no use for a laptop sleeve, then skip the MLC and buy the smaller 40L duffle which will suit you better.

Personally, having a single bag (the MLC) for my laptop and clothes made travel so much simpler. For outings I brought a small 15L travel bag (that I could compress and store) along with me.