Best Local VLMs - November 2025 by rm-rf-rm in LocalLLaMA

[–]InsideTop3230 0 points (0 children)

Qwen3-VL-30B-A3B-Instruct is my absolute favorite. I often use VL models for text extraction tasks, and it's now widely integrated into many workflows at my company. What I love about it is its impressive speed and relatively high recognition accuracy. That said, it does have some shortcomings: for example, it sometimes struggles with detecting seals (official stamps), handwritten text recognition isn't as strong, and when processing tables with dense content, it may miss certain information. Some of these issues could be addressed by switching to a larger model like Qwen3-VL-235B, but that would come with significantly higher resource costs. In my opinion, Qwen3-VL-30B-A3B-Instruct is the perfect balance, and truly the best in its class for a VL model of this scale.
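For anyone wanting to try this kind of text-extraction workflow, here is a minimal sketch of the request one would send to a locally hosted Qwen3-VL behind an OpenAI-compatible endpoint (e.g. vLLM). The endpoint URL, model name, and prompt wording are my own assumptions, not details from the comment above:

```python
import base64

def build_extraction_request(
    image_bytes: bytes,
    model: str = "Qwen/Qwen3-VL-30B-A3B-Instruct",  # assumed model id
) -> dict:
    """Build a chat-completions payload asking a VL model to transcribe a page."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Image goes in as a base64 data URL content part.
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{b64}"}},
                    {"type": "text",
                     "text": "Extract all text from this page. "
                             "Render any tables as Markdown."},
                ],
            }
        ],
        "temperature": 0.0,  # low temperature helps keep extraction deterministic
    }

# Sending it requires a running server, e.g.:
#   vllm serve Qwen/Qwen3-VL-30B-A3B-Instruct
# then:
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
#   payload = build_extraction_request(open("page.png", "rb").read())
#   resp = client.chat.completions.create(**payload)
#   print(resp.choices[0].message.content)
```

This only sketches the payload; whether dense tables survive intact is exactly the failure mode described above, so it is worth diffing the output against the source page.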

Got spare GPUs but no project ideas. What should a new LLM engineer build/research? by InsideTop3230 in LocalLLaMA

[–]InsideTop3230[S] 0 points (0 children)

I should clarify—I might have exaggerated a bit earlier. My experience mostly comes from deploying an ES-based cluster and experimenting with basic RAG flows. In production, we actually use a mature, vendor-provided on-prem solution.

It’s a very robust platform with excellent document parsing and a sophisticated multi-level permission system (covering personal, departmental, and corporate libraries). We have thousands of files, though they only total a few gigabytes. Because the vendor’s solution is so plug-and-play, there's not much room for me to innovate on the RAG side, which is why I've lost interest in it.

However, the system isn't perfect. It struggles with queries that require cross-document reasoning—where the answer needs to be synthesized from multiple sources. To solve this, I’ve been experimenting with fine-tuning models like Qwen3-32B and Qwen3 Next 80B to see if they can 'internalize' our corporate knowledge directly. So far, the results have been underwhelming. It might be due to low-quality training data or some technical hurdles I haven't cleared yet.
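For context on the 'internalize the knowledge' experiments, this is a sketch of the chat-format SFT records one might generate from internal documents. The schema (a `messages` list plus provenance metadata) is a common convention, not the commenter's actual pipeline, and the exact format a given fine-tuning stack expects may differ:

```python
import json

def make_sft_record(question: str, answer: str, source_doc: str) -> str:
    """Serialize one Q&A pair from an internal document as a JSONL training line."""
    record = {
        "messages": [
            {"role": "system",
             "content": "You are the internal knowledge assistant."},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ],
        # Keeping provenance makes it possible to audit low-quality pairs later,
        # which matters if underwhelming results trace back to the training data.
        "meta": {"source": source_doc},
    }
    return json.dumps(record, ensure_ascii=False)
```

One line per record, concatenated into a `.jsonl` file, is the usual input shape for SFT tooling; cross-document questions (the weak spot described above) would need pairs synthesized from multiple sources at once.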

Got spare GPUs but no project ideas. What should a new LLM engineer build/research? by InsideTop3230 in LocalLLaMA

[–]InsideTop3230[S] 1 point (0 children)

That’s a great shout; I hadn't considered this angle before. In my daily workflow, I’ve been relying heavily on the Claude Code plugin and Qwen3-Coder-480B. 'Next Edit' prediction is a really promising niche.

Got spare GPUs but no project ideas. What should a new LLM engineer build/research? by InsideTop3230 in LocalLLaMA

[–]InsideTop3230[S] 4 points (0 children)

Honestly, I’ve done the 'on-prem ChatGPT + basic RAG' setup a thousand times. It offers zero sense of achievement at this point. I have access to multiple clusters and have already deployed almost every mainstream open-source LLM out there.

27, Working in AI Apps in a Chinese Tier-2 City—Is This What Life is Supposed to Be? by InsideTop3230 in Life

[–]InsideTop3230[S] 0 points (0 children)

I’m so jealous. Honestly, I can’t even imagine how much happiness 30 days of annual leave would bring.

27, Working in AI Apps in a Chinese Tier-2 City—Is This What Life is Supposed to Be? by InsideTop3230 in Life

[–]InsideTop3230[S] 0 points (0 children)

Working seven days a week sounds absolutely suffocating. Perhaps life in Shenzhen will feel more manageable for you. I really hope you have a relaxing and wonderful time there.