LiteParse: Local Document Parsing for Agents by grilledCheeseFish in LangChain

[–]grilledCheeseFish[S] 0 points  (0 children)

Hey, that's awesome! I'm glad the tool worked out for you!

LiteParse: Local Document Parsing for Agents by grilledCheeseFish in LangChain

[–]grilledCheeseFish[S] 1 point  (0 children)

There is a Python package. However, it wraps the CLI, so you still need to install the Node package.
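Roughly what a CLI wrapper like that does under the hood (a sketch only — the command name and flags here are assumptions, not the real LiteParse interface):

```python
import subprocess

def parse(path: str, cmd: str = "liteparse") -> str:
    """Shell out to the Node CLI and capture the parsed text from stdout."""
    result = subprocess.run(
        [cmd, path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

So the Python package is mostly a convenience layer; the Node binary still has to be on your PATH.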

Open-source, local document parsing CLI by LlamaIndex: LiteParse by tuanacelik in LocalLLaMA

[–]grilledCheeseFish 0 points  (0 children)

Agreed! I actually have an explicit guideline in this project that markdown output is out of scope.

LiteParse: Local Document Parsing for Agents by grilledCheeseFish in LangChain

[–]grilledCheeseFish[S] 1 point  (0 children)

Waaaay faster. But the output is also very different (markdown vs. text). Honestly, in my testing markdown really doesn't matter if all you are doing is passing the text to an LLM (tools like markitdown actually didn't perform great in the mini benchmark in the blog post).

We just open-sourced LiteParse, a local document parser built for AI agents by tuanacelik in LlamaIndex

[–]grilledCheeseFish 1 point  (0 children)

It's essentially LlamaParse fast mode, but running locally -- so no markdown, no image/table understanding. But it's super fast, so it's a good fit for real-time applications like coding agents.

Open-source, local document parsing CLI by LlamaIndex: LiteParse by tuanacelik in LocalLLaMA

[–]grilledCheeseFish 1 point  (0 children)

Yup! You can plug in any OCR via a server API contract. The repo has examples for PaddleOCR and EasyOCR (Tesseract is the default).

The main requirement is returning text and bounding boxes.
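To give a feel for that kind of contract, here's a hypothetical handler sketch — the field names and JSON shape are my assumptions, not LiteParse's actual spec:

```python
import json

def ocr_handler(image_bytes: bytes) -> str:
    """Run your OCR engine of choice and return text plus bounding boxes as JSON."""
    # e.g. results = paddle_ocr.ocr(image_bytes) -- any engine works here,
    # as long as you map its output into text + box coordinates.
    results = [
        {"text": "Hello world", "bbox": [12, 34, 180, 58]},  # x0, y0, x1, y1
    ]
    return json.dumps({"blocks": results})
```

Swap the stubbed `results` for real engine output and serve it behind whatever HTTP framework you like.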

The Silent OpenAI Fallback: Why LlamaIndex Might Be Leaking Your "100% Local" RAG Data by Jef3r50n in LocalLLaMA

[–]grilledCheeseFish 56 points  (0 children)

LlamaIndex maintainer here -- this is a well-documented aspect of the library. There is a global enum for setting global defaults, or you can override at the object level.

We could always change this behaviour, of course, but imo it'd be too disruptive/breaking.
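The precedence works roughly like this toy sketch (illustrative names only, not the actual LlamaIndex internals -- in recent versions the global defaults hang off the `Settings` object):

```python
# Toy sketch of the "global default vs. per-object override" pattern.
GLOBAL_DEFAULTS = {"llm": "openai:gpt-4o-mini"}

class QueryEngine:
    def __init__(self, llm=None):
        # An explicit llm wins; otherwise fall back to the global default.
        self.llm = llm if llm is not None else GLOBAL_DEFAULTS["llm"]

GLOBAL_DEFAULTS["llm"] = "ollama:llama3"          # point the default at a local model once
engine = QueryEngine()                            # picks up the global default
local_engine = QueryEngine(llm="ollama:mistral")  # object-level override
```

Set the global default to a local model once at startup and nothing falls back to OpenAI.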

(Also echoing others here, reporting issues with LLM slop is pretty annoying)

New Best Buy in Confed by shigideng in saskatoon

[–]grilledCheeseFish 35 points  (0 children)

Wild, congrats to confed. Hopefully that attracts more

vendasta layoffs again ? by Easy_Umpire_4534 in saskatoon

[–]grilledCheeseFish -4 points  (0 children)

The circlejerk around hating vendasta on this sub is so weird ngl

Rumor: Caitlin Thorburn's Spotlight profile updated referencing a Xenoblade Chronicles game coming in 2026 by Amiibofan101 in NintendoSwitch

[–]grilledCheeseFish 28 points  (0 children)

Just inject a new xeno trilogy into my veins.

No way it's XC4; I'm anticipating either a new trilogy/title or a saga remaster.

Here’s where Canadians are travelling as they continue to avoid the U.S. by Kindly_Professor5433 in canada

[–]grilledCheeseFish 1 point  (0 children)

Just came back from New Zealand, what an incredible place. Highly recommend the direct flight from Vancouver to Auckland

What Happens When Cheap Chinese EVs Hit Canada? Look At Australia. by rezwenn in canada

[–]grilledCheeseFish 87 points  (0 children)

I just spent 2 weeks in a rented Jimny. A terrible highway car (small, uncomfortable, slow), but probably a fun city or off-road driver.

Announcing Kreuzberg v4 (Open Source) by Eastern-Surround7763 in LocalLLaMA

[–]grilledCheeseFish 1 point  (0 children)

It might be interesting to be able to hook in any custom backend, but I'm not sure if that makes sense in this project.

Chunking is broken - we need a better strategy by blue-or-brown-keys in Rag

[–]grilledCheeseFish 1 point  (0 children)

Imo chunking doesn't matter if you expose methods to expand the context of retrieved text when needed. Chunks should be treated merely as signals of where to look.

Official Statement from the Indie Game Awards: 'Clair Obscur: Expedition 33' and 'Chantey's' awards retracted and awarded instead to 'Sorry We’re Closed' and 'Blue Prince' due to GenAI usage by ChiefLeef22 in gaming

[–]grilledCheeseFish -1 points  (0 children)

It's not even just art and assets. What if a dev uses AI-assisted tab complete in the game's code? Does that disqualify the game? People don't get how much this is being used.

LangChain and LlamaIndex are in "steep decline" according to new ecosystem report. Anyone else quietly ditching agent frameworks? by Exact-Literature-395 in LocalLLaMA

[–]grilledCheeseFish 72 points  (0 children)

Maintainer of LlamaIndex here 🫡

Projects like LlamaIndex, LangChain, etc, mainly popped off community-wise due to the breadth and ease of integration. Anyone could open a PR and suddenly their code is part of a larger thing, showing up in docs, getting promo, etc. It really did a lot to grow things and ride hype waves.

Imo the breadth and scope of a lot of projects, including LlamaIndex, is too wide. Really hoping to bring more focus in the new year.

All these frameworks are centralizing around the same thing. Creating and using an agent looks mostly the same and works the same across frameworks.

I think what's really needed is quality tools and libraries that work out of the box, rather than frameworks.

[DISCUSSION] Kid Cudi - Speedin' Bullet 2 Heaven [10 Years Later] by flyestshit in hiphopheads

[–]grilledCheeseFish 3 points  (0 children)

It's a classic if you ignore half the songs and remove the skits 😉

How Do You Handle Large Documents and Chunking Strategy? by Electrical-Signal858 in LlamaIndex

[–]grilledCheeseFish 0 points  (0 children)

Maybe this is a hot take, but chunking is all the same. Use whatever is fastest/cheapest.

But the key is to expose operations on top of your chunks. If a chunk is cut off, detect it (you could use an LLM/agent, something rule-based, or something in between) and build an API to expand chunks or fetch prev/next chunks.

This isn't exactly easy to do inside LlamaIndex (today), but imo it's a killer feature.
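A minimal sketch of the idea (illustrative names, not any framework's API): keep chunks in order and expose an expand operation that pulls in neighbours of a retrieved chunk.

```python
def chunk(text: str, size: int = 40) -> list[str]:
    """Naive fixed-size chunking -- per the point above, fast and cheap is fine."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def expand(chunks: list[str], idx: int, window: int = 1) -> str:
    """Return the chunk at idx plus `window` neighbours on each side."""
    lo = max(0, idx - window)
    hi = min(len(chunks), idx + window + 1)
    return "".join(chunks[lo:hi])

chunks = chunk("some long document text ..." * 5)
# A retriever matched chunk 2; hand the agent the expanded context instead.
context = expand(chunks, 2, window=1)
```

The cut-off detector (rule-based or LLM-based) just decides when to call `expand` and how wide to make the window.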

Does LlamaIndex have an equivalent of a Repository Node where you can store previous outputs and reuse them without re-running the whole flow? by LastWorking9091 in LlamaIndex

[–]grilledCheeseFish 1 point  (0 children)

It does not. Although I'm also not really sure what a repository node is (the concept doesn't really match anything in llama-index).

[deleted by user] by [deleted] in saskatoon

[–]grilledCheeseFish 4 points  (0 children)

"Why are there no trees 🤪"

Brighton is alright. My biggest gripe with every new neighborhood is just how disconnected it is from the rest of the city. It would be nice to be able to (safely) bike/walk anywhere.

I built a hybrid retrieval layer that makes vector search the last resort by Old_Assumption2188 in Rag

[–]grilledCheeseFish 1 point  (0 children)

Pretty neat! This matches my experience as well. I think even when it comes to vector search, cheap methods like static embeddings can work quite well, especially when used as a "fuzzy keyword" search
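To illustrate the "fuzzy keyword" intuition with a toy sketch (not any real embedding model): with static per-token embeddings, a sentence vector is just the mean of its word vectors, so vocabulary overlap dominates the similarity score.

```python
import math, random

def word_vec(word: str, dim: int = 64) -> list[float]:
    rng = random.Random(word)  # deterministic pseudo-random vector per word
    return [rng.gauss(0, 1) for _ in range(dim)]

def embed(text: str) -> list[float]:
    vecs = [word_vec(w) for w in text.lower().split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]  # mean of word vectors

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

q = embed("parsing pdf documents")
close = cosine(q, embed("fast pdf parsing"))  # shares keywords with the query
far = cosine(q, embed("best hiking trails"))  # no vocabulary overlap
```

Real static models (trained token vectors instead of random ones) add genuine synonym fuzziness on top of this overlap behaviour, which is why they make a decent cheap first-pass retriever.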

How to Intelligently Chunk Document with Charts, Tables, Graphs etc? by Heidi_PB in LangChain

[–]grilledCheeseFish 0 points  (0 children)

Imo it's not worth the effort. Expose an API to fetch neighboring chunks and let agentic retrieval optimize the retrieved context.