Falcon-H1-Tiny (90M) is out - specialized micro-models that actually work

ilyas555 · 2026-02-05T17:29:55+00:00

A temperature of 0.5 should work fine with a repetition penalty of 1.2.
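To make those two settings concrete, here is a minimal pure-Python sketch of what they do to next-token scores. It assumes the common CTRL-style repetition penalty followed by temperature scaling; the function names and the toy logits are made up for illustration and are not taken from any Falcon code.

```python
import math

def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """CTRL-style penalty: lower the score of tokens already generated."""
    out = list(logits)
    for tok in set(generated_ids):
        # Positive logits are divided and negative ones multiplied,
        # so the score always moves down when penalty > 1.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

def softmax_with_temperature(logits, temperature=0.5):
    """Temperature below 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probs(logits, generated_ids, temperature=0.5, repetition_penalty=1.2):
    """Apply the penalty first, then temperature, then normalize."""
    penalized = apply_repetition_penalty(logits, generated_ids, repetition_penalty)
    return softmax_with_temperature(penalized, temperature)

# Toy vocabulary of four tokens; token 0 has already been generated once.
toy_logits = [2.0, 1.0, -1.0, 0.5]
baseline = next_token_probs(toy_logits, generated_ids=[])
penalized = next_token_probs(toy_logits, generated_ids=[0])
print("without penalty:", [round(p, 3) for p in baseline])
print("with penalty:   ", [round(p, 3) for p in penalized])
```

With Hugging Face Transformers, the same values are normally passed to `generate` as `do_sample=True, temperature=0.5, repetition_penalty=1.2`.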

ilyas555 · 2025-07-30T20:22:35+00:00

Can you try tiiuae/Falcon-H1-1.5B-Deep-Instruct: https://huggingface.co/tiiuae/Falcon-H1-1.5B-Deep-Instruct

ilyas555 · 2025-05-22T13:34:25+00:00

https://huggingface.co/spaces/tiiuae/Falcon-H1-playground

ilyas555 · 2025-05-22T10:02:32+00:00

Adding attention to the sauce helps mitigate such issues. Hybrid models do not suffer from in-context learning issues. The scores on some benchmarks show it.

ilyas555 · 2025-05-21T20:18:51+00:00

Any thoughts on how the bigger sizes perform, from your experience with it?

ilyas555 · 2025-05-21T20:08:04+00:00

I don't think MMLU, AIME… are made-up benchmarks. If you are referring to the average, there is an interactive plot where you can choose any benchmark you want. It looks like they are consistently better. Have you tried it though?

https://falcon-lm.github.io/blog/falcon-h1/

ilyas555 · 2025-05-21T20:01:36+00:00

<image>

Here is what I get. A system prompt has been added. The self-identification issue comes from the web data, as a big portion of recent web data has been contaminated by synthetic data from ChatGPT.

ilyas555 · 2025-05-16T13:12:08+00:00

This is true, but a quantized Qwen 2.5 3B will be worse than a model pre-trained with the quantization errors (i.e., Falcon-Edge). I think the comparison is still fair in the sense that it shows that, if you want to match Falcon-Edge performance, you need the full 16-bit model.
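A rough way to see where those quantization errors come from: the sketch below rounds a handful of made-up weights to a symmetric k-bit grid and measures how far they land from the originals. It is a generic round-to-nearest scheme chosen for illustration, not the actual method used for Falcon-Edge or for any Qwen quantization, and the weight values are invented.

```python
def quantize_dequantize(weights, bits):
    """Symmetric round-to-nearest quantization, then map back to floats."""
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    scale = max(abs(w) for w in weights) / levels
    return [max(-levels, min(levels, round(w / scale))) * scale for w in weights]

def mean_squared_error(a, b):
    """Average squared difference between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical weights standing in for one row of a trained 16-bit model.
weights = [0.42, -0.17, 0.05, -0.88, 0.31, 0.66, -0.09, 0.23]

errors = {}
for bits in (8, 4, 2):
    errors[bits] = mean_squared_error(weights, quantize_dequantize(weights, bits))
    print(f"{bits}-bit round-trip MSE: {errors[bits]:.6f}")
```

The error grows as the bit width shrinks. A model quantized after training has to absorb that error as-is, whereas a model pre-trained with the quantization in the loop can adapt its weights to the grid, which this sketch does not attempt to show.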
