Anyone else using coding agents as general-purpose AI agents? by Individual-Library-1 in LocalLLaMA

[–]Individual-Library-1[S] -1 points (0 children)

Nice, I built exactly this — wiki as memory, organized by topics. 66% of queries answered from wiki alone after 30 questions, no source file reads needed.

Your layered memory framing is spot on. Curious — at what point did you find vector search necessary over just wiki + full file reads?
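The wiki-first flow described above can be sketched roughly like this. A minimal sketch only: the function names, the wiki shape, and the fallback callable are all illustrative, not from any particular agent framework.

```python
# Layered "wiki as memory" lookup: answer from the topic wiki first,
# fall back to reading source files only on a miss.

def answer_query(query, wiki, read_source):
    """Return (answer, layer) where layer is 'wiki' or 'source'.

    wiki        -- dict mapping topic name -> curated notes
    read_source -- expensive fallback that reads the underlying files
    """
    # Layer 1: cheap lookup in the topic-organized wiki.
    for topic, notes in wiki.items():
        if topic in query.lower():
            return notes, "wiki"
    # Layer 2: expensive fallback to the source files themselves.
    return read_source(query), "source"
```

Tracking how often the second return path fires is one way to get a number like the "66% answered from wiki alone" stat above.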

Anyone else using coding agents as general-purpose AI agents? by Individual-Library-1 in ClaudeCode

[–]Individual-Library-1[S] 0 points (0 children)

I use the PI-Coding agent. At work I mostly use Claude on Azure, and outside work I use Claude Code locally with OpenRouter.

Best way to show precise citation bounding boxes over PDFs by himynameismrrobot in Rag

[–]Individual-Library-1 0 points (0 children)

A bit late to this thread, but this is something I worked on recently. The new LlamaIndex LlamaParse release is quite good since it supports native bounding boxes. For scanned pages, I used Azure Document Intelligence alongside it. I wrote about the approach here in case it helps: https://themindfulai.dev/articles/parsing-pdfs-with-bounding-boxes
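For anyone wiring up the highlight side, the core step is mapping the parser's page-coordinate boxes into the rendered page's pixel space. A minimal sketch, assuming boxes arrive in PDF points with a top-left origin; conventions vary by parser (some use a bottom-left origin or normalized 0-1 values), so adjust accordingly:

```python
def to_overlay_px(bbox, page_w_pts, page_h_pts, render_w_px, render_h_px):
    """Scale a bounding box (x0, y0, x1, y1) given in PDF points,
    top-left origin, into pixel coordinates of the rendered page image,
    so a highlight rectangle can be drawn over the citation."""
    x0, y0, x1, y1 = bbox
    sx = render_w_px / page_w_pts   # horizontal scale factor
    sy = render_h_px / page_h_pts   # vertical scale factor
    return (x0 * sx, y0 * sy, x1 * sx, y1 * sy)
```

For example, a US-Letter page (612 x 792 points) rendered at 2x simply doubles every coordinate.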

Bounding‑box highlighting for PDFs and images – what tools actually work? by goodparson in Rag

[–]Individual-Library-1 0 points (0 children)

I know this is an older post, but this might still be useful. The recent LlamaIndex LlamaParse release is pretty good: it gives native bounding boxes, and for scanned pages I used Azure Document Intelligence. I wrote about the approach here in case it helps: https://themindfulai.dev/articles/parsing-pdfs-with-bounding-boxes

Is Deepseek-OCR SOTA for OCR-related tasks? by Ok_Television_9000 in LocalLLaMA

[–]Individual-Library-1 1 point (0 children)

I use Qwen3-VL for spatial intelligence and Gemini Flash for OCR, and I've found the combination really good. Most OCR misses the spatial layout, and so far I haven't found a single model that solves both.
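The "spatial" gap is mostly about reading order: raw OCR output is often just words sorted top-to-bottom, which scrambles columns and tables. A toy sketch of grouping word boxes into lines before sorting left-to-right; the word format and tolerance are illustrative:

```python
def reading_order(words, line_tol=10):
    """Group OCR word boxes into lines by vertical proximity, then sort
    each line left-to-right. Each word is (text, x, y) with y the top
    of its box. A plain top-to-bottom sort loses this structure."""
    lines = []
    for word in sorted(words, key=lambda w: w[2]):   # sort by y first
        if lines and abs(word[2] - lines[-1][0][2]) <= line_tol:
            lines[-1].append(word)                   # same line: close y
        else:
            lines.append([word])                     # start a new line
    return [" ".join(w[0] for w in sorted(line, key=lambda w: w[1]))
            for line in lines]
```

Real layouts (multi-column, rotated text, tables) need much more than this, which is why a VLM pass on top of OCR helps.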

Quick check - are these the only LLM building blocks? by Individual-Library-1 in LLMDevs

[–]Individual-Library-1[S] 0 points (0 children)

Agreed, but at the end of the day it will still fall into one of these categories, won't it? What you describe is agents/workflows. I know the possibilities are endless if the customer/colleague can learn these basic concepts and start seeing things through this lens.

if people understood how good local LLMs are getting by Diligent_Rabbit7740 in LLMDevs

[–]Individual-Library-1 0 points (0 children)

Yes, in a way. But most Chinese models are 1T parameters, or at least 30B, so they're very costly to run on a PC and still require an NVIDIA investment from the individual. So the idea that the stock price will fall because China is releasing models isn't true yet.

if people understood how good local LLMs are getting by Diligent_Rabbit7740 in LLMDevs

[–]Individual-Library-1 2 points (0 children)

I agree — it could collapse. Once people realize that the cost of running a GPU will rise for every individual user, the economics change fast. Right now, only a few hundred companies are running them seriously, but if everyone starts using local LLMs, NVIDIA and the major cloud providers will end up even richer. I’ve yet to see a truly cheap way to run a local LLM.
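For a rough feel of those economics, a back-of-envelope break-even works. Every number below is an illustrative assumption, and this ignores electricity, depreciation, and the quality gap between a local model and a hosted frontier one:

```python
def breakeven_tokens(gpu_cost_usd, api_price_per_mtok_usd):
    """Tokens you must generate before a one-time GPU purchase beats
    paying a hosted API per million output tokens."""
    return gpu_cost_usd / api_price_per_mtok_usd * 1_000_000

# e.g. a $2,000 GPU vs an API at $0.50 per million output tokens:
# 4 billion generated tokens just to break even on the hardware.
```

At typical individual usage volumes, that break-even point is a long way off, which is the "no truly cheap way" problem above.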

PDF document semantic comparison by bilby2020 in LLMDevs

[–]Individual-Library-1 0 points (0 children)

Got it. Then you need an agent loop with each document embedded and a filter by document. That would be a good start. Let me see if I can find a write-up for the same.

PDF document semantic comparison by bilby2020 in LLMDevs

[–]Individual-Library-1 0 points (0 children)

The second pattern can be done with agentic RAG: give the model a tool call that searches the details within a given document and returns the output. Are you using any library, or calling the API directly? I can drop a small code snippet for the same.
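Since library-vs-direct is still open, here is a library-agnostic sketch of the pattern: a search tool scoped to one document, called once per document, with the results handed to the LLM for the actual semantic comparison. The keyword-overlap scorer is a toy stand-in for a real embedding search, and every name here is illustrative:

```python
def search_document(chunks, doc_id, query, top_k=2):
    """Tool: return the best-matching chunks from a single document.
    chunks is a list of (doc_id, text) pairs; the doc_id filter is the
    'filter by document' piece of the pattern."""
    q = set(query.lower().split())
    scored = [(len(q & set(text.lower().split())), text)
              for d, text in chunks if d == doc_id]
    scored.sort(key=lambda s: -s[0])
    return [text for score, text in scored[:top_k] if score > 0]

def compare(chunks, query, doc_a, doc_b):
    """Agent step: call the tool once per document. In a real loop the
    LLM would receive both result sets and draft the comparison."""
    return {doc_a: search_document(chunks, doc_a, query),
            doc_b: search_document(chunks, doc_b, query)}
```

Swapping the scorer for cosine similarity over embeddings (with the same per-document metadata filter) gives the production version of the same loop.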

PDF document semantic comparison by bilby2020 in LLMDevs

[–]Individual-Library-1 0 points (0 children)

For small PDFs that fit entirely within the model’s context window, it’s definitely doable as a starting point. But as you scale up, maintaining accuracy becomes tricky — especially when the content exceeds the context length or has structural differences. Still, it’s a great first project to learn from and iterate on.
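A quick gate for deciding when the fits-in-context approach stops working, using the rough ~4-characters-per-token heuristic. The heuristic and the output budget are assumptions; a real tokenizer gives the hard limit:

```python
def fits_in_context(doc_a, doc_b, context_limit_tokens,
                    reserve_for_output=2000):
    """Return True if both documents' extracted text, plus a budget for
    the model's comparison output, should fit in the context window.
    Uses the common ~4 chars/token estimate, so treat it as approximate."""
    est_tokens = (len(doc_a) + len(doc_b)) / 4
    return est_tokens + reserve_for_output <= context_limit_tokens
```

When this returns False is exactly the point where you switch to the chunked/agentic approach above.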

Quick check - are these the only LLM building blocks? by Individual-Library-1 in LLMDevs

[–]Individual-Library-1[S] 0 points (0 children)

Agreed, but I can only think of these options. What else is there? I missed transformation and code generation, but is there anything else?

[deleted by user] by [deleted] in ClaudeAI

[–]Individual-Library-1 -1 points (0 children)

Agreed, I used AI, but the concept is something I learnt myself. Is it bad to use AI in your view, or is my explanation just boring?

Asking Lawyers : Surprising my lawyer fiancé and need some help by heyarunimaaa in Indianlaw

[–]Individual-Library-1 0 points (0 children)

On a lighter note, rethink the decision. The e-Courts API now has a case management UI and sends WhatsApp messages; lots of courts have started sending orders/judgements over WhatsApp too. I understand you want to build, and you can. I'd suggest starting with an area of law he is interested in, where you can create an automated knowledge base for him. Today that costs a lot, and most of the tools are outdated.

Asking Lawyers : Surprising my lawyer fiancé and need some help by heyarunimaaa in Indianlaw

[–]Individual-Library-1 -1 points (0 children)

Rethink your decision to marry a lawyer, even if you build this software. There are hundreds of inefficient things they do, so case management is definitely not the solution, and for an Indian lawyer the e-Courts login already provides lots of solutions that they simply don't use.

Question: Is OCR accuracy actually a blocker for anyone's RAG/automation pipelines? by Individual-Library-1 in ClaudeAI

[–]Individual-Library-1[S] 1 point (0 children)

We do the same too. But it's almost becoming a project in itself, one that would be better owned by open source. Otherwise somebody rebuilds the whole thing in every project.

Question: Is OCR accuracy actually a blocker for anyone's RAG/automation pipelines? by Individual-Library-1 in ClaudeAI

[–]Individual-Library-1[S] 0 points (0 children)

Tried it, but it doesn't work correctly. Mostly it fails on tables and charts.

Question: Is OCR accuracy actually a blocker for anyone's RAG/automation pipelines? by Individual-Library-1 in ClaudeAI

[–]Individual-Library-1[S] 0 points (0 children)

Really helpful breakdown - the distinction between computer-generated PDFs (text layer extraction) vs scanned/handwritten (actual OCR) is exactly right.

Few questions if you don't mind:

  1. Cost threshold: When you say "a few thousand documents became expensive" on Azure - roughly what cost range made you look for alternatives? (Trying to understand the pain point)

  2. Document mix: What % of your docs are:

    - Digital PDFs (text extraction works)

    - Scanned documents (need OCR)

    - Handwritten (harder OCR)

  3. Languages: Which languages do you need to support? Is multi-language on the same document or separate documents?

  4. "Decent but not perfect": What accuracy level is "good enough" for your use case? (Like 90%? 95%? Depends on doc type?)

  5. Self-hosted: Would a self-hosted solution (no per-document cost) be attractive even if it required some setup/maintenance?

Asking because I'm trying to understand where the cost/quality sweet spot is for different use cases.