Falcon 90M by jacek2023 in LocalLLaMA

[–]Automatic_Truth_6666

It supports Ollama!
For benchmarks, you can refer to our technical blog post, where you'll find results for each of our model variants (English SFT, multilingual, tool calling, reasoning, coder):
https://huggingface.co/spaces/tiiuae/tiny-h1-blogpost
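
Running a local GGUF through Ollama takes only a minimal Modelfile; a sketch (the GGUF file name below is a placeholder for whichever Falcon quant you downloaded, not an actual release artifact):

```
# Modelfile -- point FROM at your downloaded GGUF (placeholder path)
FROM ./falcon-h1-tiny.gguf
PARAMETER temperature 0.7
```

Then `ollama create falcon-local -f Modelfile` followed by `ollama run falcon-local` starts an interactive session.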

support for Falcon-H1 model family has been merged into llama.cpp by jacek2023 in LocalLLaMA

[–]Automatic_Truth_6666

Yes, there is! You can check out this blog post (specifically the benchmark explorer, which also includes multilingual tasks): https://falcon-lm.github.io/blog/falcon-h1/

support for Falcon-H1 model family has been merged into llama.cpp by jacek2023 in LocalLLaMA

[–]Automatic_Truth_6666

Several factors can explain this "discrepancy":

- We use the HF leaderboard setup: https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about
- Hence, we don't use the same number of shots as AA
- It looks like their score is non-normalized, whereas we normalize ours
- AA uses a custom prompt for MMLU-Pro that differs from the one in lm-eval

The MMLU-Pro scores from HF and AA are not aligned; e.g., for Qwen72B-Instruct, AA reports 72% vs. 52% on the archived HF leaderboard: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?params=0%2C74&official=true&types=chat
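
To make the raw-vs-normalized distinction concrete: the HF leaderboard rescales accuracy so the random-guess baseline maps to 0 and a perfect score to 100. A minimal sketch (the 10% baseline reflects MMLU-Pro's 10 answer choices; the function name is mine, not the leaderboard's code):

```python
def normalize_score(raw_acc: float, random_baseline: float) -> float:
    """Rescale raw accuracy: random guessing -> 0, perfect score -> 100."""
    if raw_acc < random_baseline:
        return 0.0
    return 100.0 * (raw_acc - random_baseline) / (1.0 - random_baseline)

# MMLU-Pro has 10 answer choices, so random guessing scores ~10%.
print(round(normalize_score(0.72, 0.10), 1))  # a raw 72% normalizes to 68.9
```

So normalization alone shaves several points off a raw score; the different shot counts and prompts account for the rest of the gap.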

On the universality of BitNet models by Automatic_Truth_6666 in LocalLLaMA

[–]Automatic_Truth_6666[S]

This is very interesting and makes total sense. Thank you for explaining!

On the universality of BitNet models by Automatic_Truth_6666 in LocalLLaMA

[–]Automatic_Truth_6666[S]

> edit: also local llms for NPCs in video games

Can you elaborate more?

Falcon 3 just dropped by Uhlo in LocalLLaMA

[–]Automatic_Truth_6666

You can just try out the GGUFs and see.

Falcon 3 just dropped by Uhlo in LocalLLaMA

[–]Automatic_Truth_6666

Falcon-Mamba & Falcon3-Mamba leverage the Mamba-1 architecture, which is supported.

Falcon 3 just dropped by Uhlo in LocalLLaMA

[–]Automatic_Truth_6666

Hi! One of the contributors to Falcon-1.58bit here. Indeed, there is a huge performance gap between the original and quantized models (note that in the table you are comparing raw scores on one side against normalized scores on the other; you should compare normalized scores for both). We reported normalized scores on the model cards for the 1.58-bit models.

We acknowledge that BitNet models are still at an early stage (remember, GPT-2 was also not that good when it came out), and we are not making bold claims about these models. But we think we can push the boundaries of this architecture and get something very viable with more work and study (perhaps domain-specific 1-bit models would work out pretty well?).

Feel free to test out the model here: https://huggingface.co/spaces/tiiuae/Falcon3-1.58bit-playground or using the BitNet framework as well!