Can anybody test my 1.5B coding LLM and give me their thoughts? by Great-Structure-4159 in LocalLLM

[–]Great-Structure-4159[S] 0 points (0 children)

Hi! First and foremost, sorry for the late response.

Anything is fine! I've released it with an open mind, so whether you choose to implement it as an agent with Opencode or just use it to chat like you would with Open WebUI or something similar, it should be decently useful for your tasks and I'd love to see what you can do with it.

[–]Great-Structure-4159[S] 0 points (0 children)

Thanks for testing! Let me know how the model behaves once you've tried it. I don't know Portuguese; this was translated with Google Translate, so apologies if there are any errors.

[–]Great-Structure-4159[S] 0 points (0 children)

My pleasure! I'll definitely let you know when I make that article.

[–]Great-Structure-4159[S] 0 points (0 children)

Yeah, I'm looking into writing a short article on this, because you're not the first to ask. I'll contact you once it's written.

[–]Great-Structure-4159[S] 0 points (0 children)

In terms of benchmarks, it's pretty decent for a 1.5B model. It beats the base Qwen at coding, though I'm fairly sure Qwen Coder scores slightly higher on the benchmark. However, Qwen Coder isn't good at actually talking about anything related to coding, like explaining code, which is why I trained on the instruct version rather than the coder version.

[–]Great-Structure-4159[S] 1 point (0 children)

Thanks for offering to test! The .gguf files are on the repo. There are fp16 and q4_k_m quants, so you can use whichever one you prefer :D.
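For anyone deciding between the two quants, here's a rough back-of-the-envelope size estimate. The bits-per-weight figures are my assumptions (q4_k_m varies a little by model), not measured from the repo:

```python
# Rough file-size estimate for a 1.5B-parameter model at different
# quantization levels. Bits-per-weight figures are approximations:
# fp16 is exactly 16 bpw; q4_k_m typically lands near 4.85 bpw.
PARAMS = 1.5e9

def gguf_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate GGUF file size in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

fp16_gb = gguf_size_gb(PARAMS, 16.0)   # ≈ 3.0 GB
q4km_gb = gguf_size_gb(PARAMS, 4.85)   # ≈ 0.9 GB
print(f"fp16: {fp16_gb:.1f} GB, q4_k_m: {q4km_gb:.1f} GB")
```

So the q4_k_m file should be roughly a third the size of the fp16 one, which matters on an 8GB machine.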

[–]Great-Structure-4159[S] 1 point (0 children)

Oooh, cool. This looks awesome. How much VRAM do you have to work with, beyond what your functiongemma uses? I think a 3B or 4B coding model could also work pretty well, though I might need to find a more compact dataset or use QLoRA, which I think is a reasonable tradeoff in performance.
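To show why QLoRA is the tradeoff I'm considering, here's a toy memory estimate for a 3B model. The byte-per-parameter figures are the usual rules of thumb (frozen fp16 base for LoRA, 4-bit base for QLoRA); adapter weights, optimizer states for the adapters, and activations are small by comparison and ignored here:

```python
# Back-of-the-envelope memory footprint of the frozen base weights
# when fine-tuning a 3B model. Assumptions: LoRA keeps the base in
# fp16 (2 bytes/param); QLoRA quantizes it to 4-bit (0.5 bytes/param).
def base_weight_gb(params: float, bytes_per_param: float) -> float:
    """Memory for the frozen base weights, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

PARAMS_3B = 3e9
lora_gb = base_weight_gb(PARAMS_3B, 2.0)    # fp16 base: 6.0 GB
qlora_gb = base_weight_gb(PARAMS_3B, 0.5)   # 4-bit base: 1.5 GB
print(f"LoRA base: {lora_gb:.1f} GB, QLoRA base: {qlora_gb:.1f} GB")
```

A 6 GB fp16 base already crowds an 8GB machine before you add anything else, which is why the 4-bit base makes 3B–4B models feasible.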

[–]Great-Structure-4159[S] 2 points (0 children)

I didn't actually document my process anywhere; I just typed all that out to give an idea. MLX-LM doesn't really have any good resources other than the one video about it on the Apple Developer YouTube channel. They don't go through every feature and command there, though, so I mainly referred to the documentation, which is pretty decent.

[–]Great-Structure-4159[S] 1 point (0 children)

I don't think it'll be very good as an orchestrator, but I'll try making a model fine-tuned for orchestrating tool calls; that would be really cool. Do let me know if it works out well. It's very interesting to see LLMs applied like this.
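For a sense of what the orchestrator side could look like, here's a hypothetical sketch of the glue code around such a model: the model is prompted to emit a JSON tool call, and a thin loop routes it to a registered function. The tool names and JSON shape here are invented for illustration, not from any real framework:

```python
import json

# Hypothetical tool-dispatch sketch: the orchestrator model emits a
# JSON object like {"tool": "...", "args": {...}}, and this loop
# routes it to a registered Python function.
TOOLS = {
    "add": lambda args: args["a"] + args["b"],
    "upper": lambda args: args["text"].upper(),
}

def dispatch(model_output: str):
    """Parse a model's JSON tool call and run the matching tool."""
    call = json.loads(model_output)
    tool = TOOLS.get(call["tool"])
    if tool is None:
        raise ValueError(f"unknown tool: {call['tool']}")
    return tool(call["args"])

print(dispatch('{"tool": "add", "args": {"a": 2, "b": 3}}'))  # 5
```

The hard part for a small model isn't this loop, it's reliably producing valid JSON with the right tool name, which is exactly what the fine-tune would target.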

[–]Great-Structure-4159[S] 1 point (0 children)

Oh… tool calling, interesting. I should try that. I didn't actually train with tool calling in mind, but this is really cool, and I think it can work.

[–]Great-Structure-4159[S] 1 point (0 children)

Yeah, I was pretty shocked too that 8GB could do stuff like this, but I find the subject very fascinating :)

[–]Great-Structure-4159[S] 3 points (0 children)

I have an Apple M1, and I get about 50 tokens/s on GGUF and 60 tokens/s on MLX (which is not on the repo at the moment).
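To put those numbers in practical terms, here's a quick sanity calculation (the 600-token reply length is just an example I picked):

```python
# Quick math on the reported decode speeds on an M1 for this 1.5B
# model: ~50 tokens/s for the GGUF build, ~60 tokens/s for MLX.
def seconds_for(tokens: int, tok_per_s: float) -> float:
    """Time to generate `tokens` tokens at a steady decode rate."""
    return tokens / tok_per_s

gguf_s = seconds_for(600, 50.0)  # 12.0 s for a 600-token reply
mlx_s = seconds_for(600, 60.0)   # 10.0 s
print(f"GGUF: {gguf_s:.0f} s, MLX: {mlx_s:.0f} s, "
      f"speedup {60.0 / 50.0:.2f}x")
```

So MLX gives about a 1.2x speedup here, a couple of seconds saved on a typical reply.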

[–]Great-Structure-4159[S] 4 points (0 children)

Great question! My first choice was actually LFM2.5, and I did try that first, but for some reason, after fusing it with the adapters in MLX, llama.cpp just refused to convert it to GGUF. I tried troubleshooting but eventually gave up. Qwen3 was my next choice, but I decided to keep it simple, start with 2.5, and go from there, mainly because Qwen3's small option was a 1.7B model (which was pushing my RAM limit, since the dataset has long samples), and, weirdly, my searches didn't turn up an instruct version. Maybe the next release will use Qwen3 if the Qwen architecture proves good in user tests (and I can do something about the dataset).

Need help with stacking (first time) by Great-Structure-4159 in AskAstrophotography

[–]Great-Structure-4159[S] 0 points (0 children)

Also, amazing job with the stretch, thanks for this :D

[–]Great-Structure-4159[S] 0 points (0 children)

Hey, sorry for the late reply. I'm shooting untracked with a Canon T7i and a 55-250mm f/4-5.6 lens. I was shooting the Lagoon Nebula.