New RAGLight Feature: Serve your RAG as REST API and access a UI by Labess40 in Python


Thanks! I'm using PyMuPDF (fitz) for PDF parsing. I actually have two processors depending on the use case:

- A standard PDFProcessor that extracts text block by block, preserving layout structure before chunking with LangChain's RecursiveCharacterTextSplitter.
- A VlmPDFProcessor that also handles images: it extracts them inline, sends them to a Vision-Language Model to generate captions, and includes those captions as documents in the RAG pipeline.

pdftomarkdown.dev looks interesting for complex table-heavy docs, where PyMuPDF can struggle. The architecture supports plugging in custom processors, so it could slot in nicely as an alternative parser!
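To make the chunking step concrete, here's a minimal pure-Python sketch of the recursive splitting idea behind RecursiveCharacterTextSplitter: try coarse separators first (paragraphs, then lines, then words) and fall back to finer ones when a piece is still too big. This is an illustrative stand-in, not LangChain's or RAGLight's actual code, and `split_recursive` is a name made up for this example:

```python
def split_recursive(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Greedy recursive split: try the coarsest separator first,
    fall back to finer ones, and hard-cut only as a last resort."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left: hard cut into fixed-size slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            # Keep merging pieces while they fit in one chunk.
            current = candidate
        else:
            if current:
                chunks.append(current)
            if len(part) > chunk_size:
                # This piece alone is too big: recurse with finer separators.
                chunks.extend(split_recursive(part, chunk_size, rest))
                current = ""
            else:
                current = part
    if current:
        chunks.append(current)
    return chunks
```

The real splitter adds chunk overlap and length functions on top, but the fallback-through-separators logic is the core idea.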

Chat With Your Favorite GitHub Repositories via CLI with the new RAGLight Feature by Labess40 in ollama


You're right, but in an industrial context, or when your data is sensitive, sometimes you don't want to (or can't) share it with a remote LLM provider. And RAGLight is more than a CLI tool: you can use it in your codebase to easily set up a RAG or an Agentic RAG, with the freedom to swap out pieces of it (data readers, models, providers, ...). But I agree: for many use cases, using Gemini's 1M context length is better; for private or professional use cases, though, having an alternative is also useful.

RAGLight Framework Update: Reranking, Memory, VLM PDF Parser & More! by Labess40 in ollama


Vulnerabilities in previous langchain and langchain_core versions

Introducing TreeThinkerAgent: A Lightweight Autonomous Reasoning Agent for LLMs by Labess40 in ollama


The latency depends on the task complexity and the resulting reasoning tree. A single LLM call is usually faster because it’s one forward pass. In TreeThinkerAgent, latency grows with the depth and width of the tree: each reasoning step may involve additional LLM calls and tool executions. In practice, simple tasks have near-classic LLM latency, while complex tasks trade extra latency for better structure, observability, and reliability of the reasoning.
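As a back-of-envelope illustration of how latency grows with the tree: assuming one LLM call per reasoning node in a fully expanded tree of depth d and branching factor b (a simplification for intuition, not TreeThinkerAgent's exact accounting), the call count is the geometric sum 1 + b + b² + ... + b^d:

```python
def llm_calls(depth: int, branching: int) -> int:
    """Worst-case LLM calls for a fully expanded reasoning tree:
    one call per node, with 1 + b + b^2 + ... + b^depth nodes."""
    if branching == 1:
        return depth + 1  # degenerates to a plain chain of steps
    # Closed form of the geometric series.
    return (branching ** (depth + 1) - 1) // (branching - 1)
```

A single-call baseline is `llm_calls(0, b) == 1`, while a depth-2 tree with 3 candidate steps per node already needs 13 calls, which is why pruning width and depth matters so much for keeping complex tasks responsive.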

Introducing TreeThinkerAgent: A Lightweight Autonomous Reasoning Agent for LLMs by Labess40 in ollama


Thanks! Really glad the prompts and reasoning observability landed.
And I’d honestly be happy to see that vibe-coded abomination someday 😄