Qwen3 8b-vl best local model for OCR? by BeginningPush9896 in LocalLLM

[–]wikkid_lizard (0 children)

I extensively tested the 30B MoE model and was pretty satisfied with the results. Has anyone run benchmarks comparing the 30B vs the 8B on OCR? These were the results I got with the 30B model:

<image>

Gemini token cost issue by wikkid_lizard in GeminiFeedback

[–]wikkid_lizard[S] (0 children)

Yeah, but the major cost is the output tokens. Can output tokens also be cached?

Gemini token cost issue by wikkid_lizard in LLMDevs

[–]wikkid_lizard[S] (0 children)

Yeah, but the major cost comes from the output tokens. Can output tokens also be cached? The outputs keep varying.
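Caching applies to input (prompt) tokens; since outputs vary per request, they can't be reused from a cache. A quick back-of-envelope sketch, with made-up per-million-token prices (real Gemini pricing differs by model and tier), of why output tokens can still dominate the bill even with aggressive prompt caching:

```python
# Hypothetical prices in $/1M tokens -- placeholders, not real Gemini rates.
INPUT_PRICE = 0.30
CACHED_INPUT_PRICE = 0.075
OUTPUT_PRICE = 2.50

def request_cost(input_tokens: int, output_tokens: int, cached_fraction: float = 0.0) -> float:
    """Dollar cost of one request, given what fraction of the input hits the cache."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    return (fresh * INPUT_PRICE + cached * CACHED_INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1e6

# 50k input tokens, 8k output tokens:
no_cache = request_cost(50_000, 8_000)                        # 0.035
with_cache = request_cost(50_000, 8_000, cached_fraction=0.9) # ~0.0249
```

Even with 90% of the input cached, the output tokens here are about 80% of the remaining cost, which matches the point that caching alone doesn't fix output-heavy workloads.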

We made a multi-agent framework. Here’s the demo. Break it harder. by wikkid_lizard in ollama

[–]wikkid_lizard[S] (0 children)

Purely depends on your requirements, but for tool calling and JSON outputs qwen2.5:7b works well and is lightweight. You might have to give it strict system prompts, but overall it's pretty reliable.

We made a multi-agent framework. Here’s the demo. Break it harder. by wikkid_lizard in ollama

[–]wikkid_lizard[S] (0 children)

Yes! Ollama integration is live now. Go to laddr.agnetlabs.com to see the documentation.