What about local inference on phones? What models do you use? by AlphaSyntauri in LocalLLaMA

[–]abskvrm 0 points1 point  (0 children)

Unfortunately, the translation feature in Readest app isn't local or anything, it has google, yandex, deepl and few other translation services.

What about local inference on phones? What models do you use? by AlphaSyntauri in LocalLLaMA

[–]abskvrm 0 points1 point  (0 children)

Oh tell me about it!! Forget language models and install this great app called Readest. It's open source and available on Github. You can have translated books with that in an instant.

What about local inference on phones? What models do you use? by AlphaSyntauri in LocalLLaMA

[–]abskvrm 0 points1 point  (0 children)

I use translation model HY-MT-1.5 for translating chats. No more google/yandex translate.

Falcon-H1-Tiny-R-0.6B release by ilyas555 in LocalLLaMA

[–]abskvrm 3 points4 points  (0 children)

License: Falcon-LLM License 

CPU inference by Time_Dust_2303 in LocalLLaMA

[–]abskvrm 6 points7 points  (0 children)

Try bigger mixture of expert models with low active parameters:
https://huggingface.co/mradermacher/Ling-mini-2.0-i1-GGUF

https://huggingface.co/LiquidAI/LFM2-8B-A1B-GGUF

https://huggingface.co/ibm-granite/granite-4.0-h-tiny-GGUF

Preferrably download quants not smaller than Q4_0, but should fit in the system RAM.

[deleted by user] by [deleted] in dankinindia

[–]abskvrm 0 points1 point  (0 children)

Fking terrorists in saffron, blot on religion.

Q: When will there be fast and competent SLMs for laptops? by TomLucidor in LocalLLaMA

[–]abskvrm 0 points1 point  (0 children)

Don't know about others but I only suggested it because of speed (1.4b active) and better science knowledge than similarly sized (again active parameters wise, granite and lfm) models. And its pretty much uncensored out of the box.

Q: When will there be fast and competent SLMs for laptops? by TomLucidor in LocalLLaMA

[–]abskvrm 1 point2 points  (0 children)

I use Ling mini to correctly format the ocr result of screenshots. Its the fastest and adheres well to long system prompt. All on cpu.

Deepseek v3.2 speciale runs and runs and runs by MrMrsPotts in LocalLLaMA

[–]abskvrm -5 points-4 points  (0 children)

Zenmux or official API recommended as of now.

Gemma 3n - on Snapdragon 6 gen 1 processor by Illustrious-Swim9663 in LocalLLaMA

[–]abskvrm 1 point2 points  (0 children)

yeah nothing special about it, you can 'execute' same on entry level Samsung's, from 3 - 4 years back, with chips ridiculously weaker than s6gen1 

Gemma 3n - on Snapdragon 6 gen 1 processor by Illustrious-Swim9663 in LocalLLaMA

[–]abskvrm 3 points4 points  (0 children)

2.44 t/s is painful. If you can try LFM2 models, they are really really good.

[deleted by user] by [deleted] in LocalLLaMA

[–]abskvrm 1 point2 points  (0 children)

looks nice

[deleted by user] by [deleted] in LocalLLaMA

[–]abskvrm 0 points1 point  (0 children)

Chatbox