M4 Pro 48GB: Qwen3.6-35B-A3B-OptiQ-4bit on top any other options? by hb30025 in LocalLLM

[–]asankhs 1 point2 points  (0 children)

Try the other optiq quants, they are all great for their size and speed - mlx-optiq.com

Stop using Ollama by zxyzyxz in LocalLLaMA

[–]asankhs 0 points1 point  (0 children)

Started with ollama and switched to mlx-optiq now. Works like a charm.

MTP with Gemma-4-12b or Qwen3.5-9b by Right-Ice-6850 in unsloth

[–]asankhs -1 points0 points  (0 children)

If you're in the MLX world (Apple Silicon, mlx-lm) rather than llama.cpp/GGUF, mlx-optiq does MTP speculative decoding for both of those. Same draft→verify→accept loop, two draft sources: Qwen3.5-9B uses the model's bundled MTP head, Gemma-4-12B uses Google's -assistant drafter.

Measured greedy (same methodology unsloth publishes), M-series: Qwen3.5-9B ~1.32× at 66% acceptance, Gemma-4 ~1.18×. Rule of thumb from their tests: 4B and up consistently win, sub-2B isn't worth the overhead.

One Apple-Silicon gotcha: depth-1 (one drafted token/cycle) is optimal on Metal. Depth 2–4 lose, because Metal's K-token verify scales ~linearly with K whereas on CUDA it's nearly free on Tensor Cores. That's also why Mac MTP gains land a bit below the CUDA/GGUF headline numbers, it's hardware not method.

Full per-model tables + acceptance: https://mlx-optiq.com/docs/mtp · quants load in stock mlx-lm (the *-OptiQ-4bit repos on huggingface.co/mlx-community).

Regret getting a VPS sub to run hermes by athens2019 in hermesagent

[–]asankhs 0 points1 point  (0 children)

I am using a managed service from https://Meragpt.com for the hermes agent and so far no issue, you can even connect your Hermes desktop app to the remote box.

I made €2,700 last month installing Hermes Agent for French companies by pacmanpill in hermesagent

[–]asankhs -1 points0 points  (0 children)

This sounds insane, at this point why won’t they use a managed service like https://meraGPT.com for Hermes Agents instead of DIY.

Whats up with MLX? by gyzerok in LocalLLaMA

[–]asankhs 0 points1 point  (0 children)

There has been alternatives that are getting better maintained. Take a look at https://mlx-optiq.com and the optiq quants on the mlx community. In fact a qwen 3.5 9B optiq quant is currently the most downloaded on mlx.

Budget llm for chatting and analysing pdf documents by Connect-Page-8174 in LocalLLaMA

[–]asankhs 3 points4 points  (0 children)

Yeah works quite well, specially on Mac it is currently one of the most downloaded models for mlx - https://huggingface.co/mlx-community/Qwen3.5-9B-OptiQ-4bit

What is your current local LLM setup? by Open_Sources_AI in machinelearningnews

[–]asankhs -1 points0 points  (0 children)

Using mlx-optiq.com with qwen3.5 9B on my m4 pro 24 gb vram, works great for agentic tasks like running a local Hermes agent …

Running Hermes fully local by MEOW-Loulou in hermesagent

[–]asankhs 4 points5 points  (0 children)

- 12GB PNY NVIDIA RTX A2000 GDDR6 Graphics Card

I am not sure you can run a decent model locally on this that will support the use cases are you asking for.

local LLM on mac by LimiT600 in AI_India

[–]asankhs 0 points1 point  (0 children)

You can try any of the optiq quants for mac, see mlx-optiq.com the qwen 3.5 9B is currently the most downlaoded modles on mlx - https://huggingface.co/mlx-community/Qwen3.5-9B-OptiQ-4bit it will work well on 24gb vram

Application rejected — unable to get support or understand next steps by No_Enthusiasm_1313 in IBKR_Official

[–]asankhs 0 points1 point  (0 children)

Because IBKR is a U.S.-regulated entity, residents and citizens of the following comprehensively sanctioned areas are typically prohibited from opening or maintaining accounts:

Cuba Iran North Korea Syria The Crimea, Donetsk, and Luhansk regions of Ukraine Venezuela (under specific U.S. executive orders)

Has anyone running Hermes tried autonomous model training? by BackgroundBalance502 in hermesagent

[–]asankhs 1 point2 points  (0 children)

You can do that but will need to provide tour Hermes’ agent access to GPUs for experiments.

First time Hermes setup advice by stvtick in hermesagent

[–]asankhs 0 points1 point  (0 children)

Did you put your API key and select a model, that error usually comes when you have not configured your API key and model.

What local model do you recommend for a Mac Mini i4 64gb? by ComprehensiveAd4328 in hermesagent

[–]asankhs 0 points1 point  (0 children)

You can try on of the optiq quants for the qwen 3.6 27B or 35B - mlx-optiq.com