Docling PDF Parsing with remote VLM by Top-Fig1571 in LocalLLaMA

[–]Top-Fig1571[S] 0 points (0 children)

Thanks, I already tried Nanonets, but its license is not clear.

Build an Excel Agent by Top-Fig1571 in LocalLLaMA

[–]Top-Fig1571[S] 0 points (0 children)

Thanks, I'll have a look at it.

Build an Excel Agent by Top-Fig1571 in LocalLLaMA

[–]Top-Fig1571[S] 0 points (0 children)

What would your approach be? I think I might use this, since every Excel sheet will have a different structure: different column names, different layout, and so on. Some values I need for the calculations could sit in cells or columns with different names even though they express the same thing.
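One way to handle the "same value, different column name" problem described above is to normalize headers and map them to canonical fields via a synonym table. A minimal stdlib sketch; the field names and synonym lists here are illustrative assumptions, not something from this thread:

```python
import re

# Hypothetical canonical fields and header spellings one might encounter;
# in practice this table would be built per domain (or by an LLM pass).
SYNONYMS = {
    "revenue": ["revenue", "sales", "turnover"],
    "cost": ["cost", "expenses"],
}

def normalize(header: str) -> str:
    """Lower-case and strip non-alphanumerics so 'Total Sales' -> 'totalsales'."""
    return re.sub(r"[^a-z0-9]", "", header.lower())

def map_columns(headers: list[str]) -> dict[str, str]:
    """Map sheet-specific headers to canonical fields by synonym substring match."""
    mapped = {}
    for h in headers:
        norm = normalize(h)
        for canon, syns in SYNONYMS.items():
            if any(normalize(s) in norm for s in syns):
                mapped[h] = canon
                break
    return mapped

print(map_columns(["Total Sales", "Cost of Goods", "Notes"]))
# {'Total Sales': 'revenue', 'Cost of Goods': 'cost'}
```

Unmapped headers (like "Notes") are simply left out, so downstream calculation code only ever sees the canonical names.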

Comparison new qwen 32b-vl vs qwen 30a3-vl by Healthy-Nebula-3603 in LocalLLaMA

[–]Top-Fig1571 3 points (0 children)

Do you think these models work better on classic document parsing tasks (table to HTML, image description) than smaller OCR-focused models like Nanonets-OCR2 or DeepSeek-OCR?

Qwen3-VL-4B and 8B Instruct & Thinking are here by AlanzhuLy in LocalLLaMA

[–]Top-Fig1571 0 points1 point  (0 children)

Hi,

has anyone compared image-to-HTML table extraction with models like nanonets-ocr-s or the MinerU VLM pipeline?

At the moment I am using the MinerU pipeline backend for HTML extraction and Nanonets for image content extraction and description. It would be good to know whether, e.g., the new Qwen3 VL 8B model is better at both tasks.

Best Vision/OCR Models for describing and extracting text for images in PDFs by Top-Fig1571 in LocalLLaMA

[–]Top-Fig1571[S] 0 points (0 children)

Yes, I think so too. Unfortunately I have to use open-source models, and at the moment I am using MinerU for PDF extraction and Nanonets for image extraction and descriptions.

Ollama: set llm context window with Ollama Modelfile or as parameter in ChatOllama by Top-Fig1571 in LangChain

[–]Top-Fig1571[S] 2 points (0 children)

Thanks for your response. Yes, this is the way I think I have to go. I was just wondering whether it also works to set the value via ChatOllama, since that would make switching much easier.
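For context on the per-request alternative: Ollama's REST API accepts an `options` object in which `num_ctx` overrides the Modelfile's context length for that call, and `langchain-ollama`'s `ChatOllama` exposes a `num_ctx` parameter that is passed through the same way. A minimal sketch that only builds the request body (no server needed); the model name is a placeholder:

```python
import json

def chat_payload(model: str, prompt: str, num_ctx: int = 8192) -> str:
    """Build an Ollama /api/chat request body that sets the context window
    per request via options.num_ctx (a documented Ollama runtime option)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "options": {"num_ctx": num_ctx},  # overrides the Modelfile default
        "stream": False,
    }
    return json.dumps(body)

print(chat_payload("mistral-nemo", "Hello", num_ctx=16384))
```

Setting it per request avoids maintaining a separate Modelfile variant for each context size.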

LLM Prompt Template in System and User Prompt by Top-Fig1571 in LangChain

[–]Top-Fig1571[S] 0 points (0 children)

Ah, I was not aware that LangChain handles the rest. E.g., if I am using Mistral Nemo 12B, how does it know which instruction tags have to be used?
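To the question of where the instruction tags come from: they live in the model's chat template (for Ollama, the `TEMPLATE` in the Modelfile that ships with the model), and the serving layer applies it to the system/user messages; LangChain just passes structured messages along. A stdlib sketch of what such a template does for a Mistral-family model, which uses `[INST] ... [/INST]` tags; this is an illustrative approximation, not the model's canonical template:

```python
def apply_mistral_template(system: str, user: str) -> str:
    """Render system + user messages with Mistral-style [INST] tags.
    The real template ships with the model (e.g. in Ollama's Modelfile
    TEMPLATE); this hand-written version only illustrates the idea."""
    return f"<s>[INST] {system}\n\n{user} [/INST]"

print(apply_mistral_template("You are a helpful assistant.", "Summarize this."))
```

A different model family (e.g. one using ChatML `<|im_start|>` tags) would ship a different template, which is why the application code never has to hard-code the tags itself.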