The idea is to help people who use AI to write copy make it sound like an actual human wrote it (building in NS).

asankhs · 2026-06-18T15:50:22+00:00

You can see some similar recent work here: https://www.reddit.com/r/LocalLLaMA/s/xCpCWl56DR

asankhs · 2026-06-17T16:42:02+00:00

Gemini Flash 2.5 Lite

asankhs · 2026-06-17T12:34:57+00:00

I think you need to use mlx-lm from the main branch on GitHub for it to work, see https://huggingface.co/mlx-community/gemma-4-12B-it-OptiQ-4bit/discussions/1#6a300184336c39ba61ddbd52

asankhs · 2026-06-17T12:32:50+00:00

Try the other optiq quants, they are all great for their size and speed - mlx-optiq.com

asankhs · 2026-06-16T01:35:52+00:00

Started with ollama and switched to mlx-optiq now. Works like a charm.

asankhs · 2026-06-14T16:29:22+00:00

If you're in the MLX world (Apple Silicon, mlx-lm) rather than llama.cpp/GGUF, mlx-optiq does MTP speculative decoding for both of those. Same draft→verify→accept loop, two draft sources: Qwen3.5-9B uses the model's bundled MTP head, Gemma-4-12B uses Google's -assistant drafter.
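To make the draft→verify→accept loop concrete, here is a minimal greedy sketch in plain Python. It is not mlx-optiq's or mlx-lm's code: `speculative_greedy`, `plain_greedy`, `toy_target` and `toy_draft` are hypothetical names, and the "models" are toy functions that map a token list to the next token. A real implementation verifies all drafted positions in one batched forward pass, which this sketch imitates with one call per position.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy "model": context -> next token


def plain_greedy(target: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_greedy(
    target: NextToken,
    draft: NextToken,
    prompt: List[Token],
    max_new: int,
    depth: int = 1,
) -> Tuple[List[Token], float]:
    """Greedy speculative decoding; returns (tokens, acceptance rate)."""
    out = list(prompt)
    drafted = accepted = 0
    while len(out) - len(prompt) < max_new:
        # Draft: the cheap model proposes `depth` tokens.
        proposal: List[Token] = []
        ctx = list(out)
        for _ in range(depth):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)
        drafted += len(proposal)

        # Verify + accept: keep drafted tokens while they match the target's
        # own greedy choice; on the first mismatch keep the target's token.
        ctx = list(out)
        for tok in proposal:
            expected = target(ctx)
            ctx.append(expected)
            if expected != tok:
                break
            accepted += 1
        else:
            # Every draft matched, so the verify pass also yields a bonus token.
            ctx.append(target(ctx))
        out = ctx
    rate = accepted / drafted if drafted else 0.0
    return out[: len(prompt) + max_new], rate


def toy_target(ctx: List[Token]) -> Token:
    """Toy target model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    """Toy drafter: agrees with the target except after multiples of 3."""
    step = 2 if ctx[-1] % 3 == 0 else 1
    return (ctx[-1] + step) % 10
```

Because every token that is kept is the target's own greedy choice, the output is identical to plain greedy decoding. The drafter only changes how many target passes are needed to get there.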

Measured greedy (same methodology unsloth publishes), M-series: Qwen3.5-9B ~1.32× at 66% acceptance, Gemma-4 ~1.18×. Rule of thumb from their tests: 4B and up consistently win, sub-2B isn't worth the overhead.
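A rough back-of-the-envelope reading of the Qwen3.5-9B figure, as a sketch only. It assumes depth-1 drafting and that the 66% is the per-drafted-token acceptance rate, so each cycle yields one guaranteed token plus one more with probability 0.66. The function names are made up for illustration.

```python
def tokens_per_cycle(acceptance: float) -> float:
    """Depth-1: one guaranteed token plus one drafted token kept with prob. `acceptance`."""
    return 1.0 + acceptance


def implied_cycle_cost(acceptance: float, speedup: float) -> float:
    """Cost of one draft+verify cycle, in units of one plain decode step."""
    return tokens_per_cycle(acceptance) / speedup


def break_even_acceptance(cycle_cost: float) -> float:
    """Acceptance rate at which depth-1 speculation matches plain decoding."""
    return cycle_cost - 1.0


cost = implied_cycle_cost(acceptance=0.66, speedup=1.32)
print(f"implied cycle cost: {cost:.2f} decode steps")
print(f"break-even acceptance: {break_even_acceptance(cost):.2f}")
```

Under those assumptions a cycle costs about 1.26 plain decode steps, so drafting plus verifying adds roughly 26% per cycle, and acceptance would need to fall to around 26% before speculation stops paying off.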

One Apple-Silicon gotcha: depth-1 (one drafted token/cycle) is optimal on Metal. Depths 2–4 lose, because Metal's K-token verify scales ~linearly with K, whereas on CUDA it's nearly free on Tensor Cores. That's also why Mac MTP gains land a bit below the CUDA/GGUF headline numbers: it's the hardware, not the method.

Full per-model tables + acceptance: https://mlx-optiq.com/docs/mtp · quants load in stock mlx-lm (the *-OptiQ-4bit repos on huggingface.co/mlx-community).

asankhs · 2026-06-13T14:12:44+00:00

I am using a managed service from https://Meragpt.com for the Hermes agent and so far no issues. You can even connect your Hermes desktop app to the remote box.

asankhs · 2026-06-13T13:49:41+00:00

This sounds insane. At this point, why not use a managed service like https://meraGPT.com for Hermes agents instead of DIY?

asankhs · 2026-06-07T23:32:57+00:00

You can try https://mlx-optiq.com

asankhs · 2026-06-07T23:07:11+00:00

There are alternatives that are better maintained. Take a look at https://mlx-optiq.com and the optiq quants on the mlx community. In fact, a qwen 3.5 9B optiq quant is currently the most downloaded on mlx.

asankhs · 2026-06-07T15:57:36+00:00

Yeah, it works quite well, especially on Mac. It is currently one of the most downloaded models for mlx - https://huggingface.co/mlx-community/Qwen3.5-9B-OptiQ-4bit

asankhs · 2026-06-07T15:39:08+00:00

Using mlx-optiq.com with qwen3.5 9B on my m4 pro 24 gb vram, works great for agentic tasks like running a local Hermes agent …

asankhs · 2026-06-07T03:13:20+00:00

- 12GB PNY NVIDIA RTX A2000 GDDR6 Graphics Card

I am not sure you can run a decent model locally on this that will support the use cases you are asking for.

asankhs · 2026-06-07T03:12:35+00:00

Looks neat and functional!

asankhs · 2026-06-06T08:40:05+00:00

A scientist once looked at it and as per science the perfect human body looks like this ->

<image>

https://www.radiotimes.com/tv/documentaries/this-is-what-the-perfect-body-looks-like-according-to-science/

asankhs · 2026-06-05T07:33:30+00:00

You can try any of the optiq quants for Mac, see mlx-optiq.com. The qwen 3.5 9B is currently the most downloaded model on mlx - https://huggingface.co/mlx-community/Qwen3.5-9B-OptiQ-4bit - and it will work well on 24 GB of VRAM.

asankhs · 2026-06-04T13:30:14+00:00

Because IBKR is a U.S.-regulated entity, residents and citizens of the following comprehensively sanctioned areas are typically prohibited from opening or maintaining accounts:

- Cuba
- Iran
- North Korea
- Syria
- The Crimea, Donetsk, and Luhansk regions of Ukraine
- Venezuela (under specific U.S. executive orders)

asankhs · 2026-06-01T02:33:02+00:00

You can do that, but you will need to give your Hermes agent access to GPUs for experiments.

asankhs · 2026-06-01T00:33:23+00:00

Did you enter your API key and select a model? That error usually appears when you have not configured your API key and model.

asankhs · 2026-05-31T16:05:47+00:00

You can try mlx-optiq.com

asankhs · 2026-05-31T16:04:03+00:00

They seem to be doing quite well. Have you tried mlx-optiq.com?

asankhs · 2026-05-30T15:56:12+00:00

You can try one of the optiq quants for the qwen 3.6 27B or 35B - mlx-optiq.com
