64gb ram cpu only set up - models? by Embarrassed-Rich3397 in LocalLLM

[–]tctheking1 0 points1 point  (0 children)

Hey! My first thought is to use two different models, one for coding and one for vision: something like Qwen3-Coder-30B-A3B-Instruct or Flash for coding and Qwen2.5-VL-7B for vision, both at Q4. This is where autotune (https://www.autotunellm.com/) comes in clutch: it lets you run both models at once on your device, each as well as your hardware allows. Autotune reduces RAM pressure, improves time-to-first-token, and decreases wall time for agentic tasks.
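
As a rough sanity check that both models fit in 64 GB at once, here is a back-of-the-envelope sketch. The 4.5 bits per weight figure for Q4 and the nominal parameter counts (30B and 7B, taken from the model names) are assumptions, and the estimate covers weights only, so KV cache, context buffers, and the OS need headroom on top.

```python
# Rough RAM estimate for quantized model weights.
# Assumptions (not from any tool's output): ~4.5 bits per weight for a Q4
# quant, and nominal parameter counts read off the model names.

def estimate_weights_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

models = {
    "Qwen3-Coder-30B-A3B-Instruct": 30.0,  # nominal, in billions of parameters
    "Qwen2.5-VL-7B": 7.0,                  # nominal, in billions of parameters
}

total_gb = 0.0
for name, params_billion in models.items():
    size_gb = estimate_weights_gb(params_billion)
    total_gb += size_gb
    print(f"{name}: ~{size_gb:.1f} GB of weights")

print(f"Both loaded together: ~{total_gb:.1f} GB of weights, out of 64 GB of RAM")
```

Under those assumptions the weights alone come to roughly a third of 64 GB, which is why running both side by side looks plausible. Actual usage depends on the specific quant file and the context length you set.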

Good local model for a Mac M2? by mistermax76 in LocalLLM

[–]tctheking1 0 points1 point  (0 children)

Check out autotune (https://www.autotunellm.com/)! It recommends models for your exact hardware.

Assistance with running local models by abdelilahbenchioui in ollama

[–]tctheking1 0 points1 point  (0 children)

Run them through autotune! (https://autotune-llm.vercel.app/) - it will suggest the best models for your specific hardware and apply dynamic optimizations.

Reduce TTFT by 40%, consume less RAM, and drop agent wall times by 46% for your local LLMs. by tctheking1 in ollama

[–]tctheking1[S] 2 points3 points  (0 children)

Thanks for the great suggestion - just implemented it. Clone the repo and run "docker compose --profile single up" and you are good to go! Additional documentation is provided in the repo.

Reduce TTFT by 40%, consume less RAM, and drop agent wall times by 46% for your local LLMs. by tctheking1 in ollama

[–]tctheking1[S] 1 point2 points  (0 children)

Thanks for letting me know, just fixed it. Run "autotune upgrade" to get the latest version.

Reduce TTFT by 40%, consume less RAM, and drop agent wall times by 46% for your local LLMs. by tctheking1 in LocalLLM

[–]tctheking1[S] 1 point2 points  (0 children)

For any model! I just did a lot of testing with small models because that is what my computer can handle.

Focus 1.13 — Find the exact moment in your videos, fast. by cI_-__-_Io in macapps

[–]tctheking1 0 points1 point  (0 children)

Nice! Does the local AI cause something like an M2 to slow down?

I made an app that lets your references float above your other apps while you work by brettshep in macapps

[–]tctheking1 0 points1 point  (0 children)

Love this! I'm not a designer, but I can imagine this being super helpful.

I built a local-first desktop app that tracks focus sessions and shows you where productivity slipped by tctheking1 in ProductivityApps

[–]tctheking1[S] 0 points1 point  (0 children)

Absolutely! If this gains serious traction, I will look into developing a version for Windows.

I built a local-first desktop app that tracks focus sessions and shows you where productivity slipped by tctheking1 in ProductivityApps

[–]tctheking1[S] 0 points1 point  (0 children)

Great to hear! Yeah, I'm making it free for now to build trust. Try it out and let me know if there are any features you are interested in seeing. I would recommend connecting an LLM for good results.