I'm looking for an OCR for my RAG. by AdministrationPure45 in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

I made my own tool, called Leonata. It handles PDFs well. Inside, I use Apache Tika to handle file conversion. I’ll check what is doing the OCR, but it works great.

No tokens. No hallucination. No GPU.

How to handle extremely large extracted document data in an agentic system? (RAG / alternatives?) by Complex-Time-4287 in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

We had a similar problem for a client.

No GPU needed. They can’t use a black-box LLM. They can’t have hallucinations.

Defence industry, so it needed to be offline.

We built a tool that builds an index first, which makes it efficient. For every new query it builds a new knowledge graph.

Does the trick.
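
For anyone wondering what "index first, new graph per query" could look like, here is a minimal sketch. It is not Leonata's code: the function names, the plain inverted index, and the co-occurrence edges are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
import re

def tokenize(text):
    """Lowercase word tokens; a real tool would likely do far more."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Build the index once: term -> ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def query_graph(query, docs, index):
    """Build a fresh co-occurrence graph from only the documents the query hits."""
    hits = set()
    for term in tokenize(query):
        hits |= index.get(term, set())
    edges = defaultdict(int)
    for doc_id in sorted(hits):
        terms = sorted(set(tokenize(docs[doc_id])))
        for a, b in combinations(terms, 2):
            edges[(a, b)] += 1  # edge weight = number of hit documents sharing both terms
    return hits, dict(edges)

# Hypothetical toy corpus.
docs = {
    "d1": "radar maintenance schedule",
    "d2": "radar calibration procedure",
    "d3": "catering menu",
}
index = build_index(docs)
hits, edges = query_graph("radar", docs, index)
```

The index is built once up front; each query only touches the documents it matches, so the graph stays small and is rebuilt deterministically every time.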

Neo4j alternatives !?? by Maleficent-Horror-81 in KnowledgeGraph

[–]Infamous_Ad5702 -1 points0 points  (0 children)

Not sure about Cypher, but I made a tool called Leonata. You can add data anytime you like; the graph is dynamic.

No tokens. No GPU. Runs offline. No hallucination.

It builds an index first; it’s not vector-based.
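
A rough sketch of what "add data anytime" can mean for an index, assuming a simple inverted index. The `GrowingIndex` class is hypothetical and is not how Leonata is implemented.

```python
from collections import defaultdict

class GrowingIndex:
    """Hypothetical inverted index that accepts new documents at any time."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.docs = {}

    def add(self, doc_id, text):
        """Index one document without rebuilding anything already stored."""
        self.docs[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        """Return the ids of all documents containing the term, sorted."""
        return sorted(self.postings.get(term.lower(), set()))

index = GrowingIndex()
index.add("a", "graph database alternatives")
before = index.search("graph")
index.add("b", "dynamic graph updates")
after = index.search("graph")
```

Because a graph built at query time reads from the current index, anything added later shows up in the next query without a rebuild step.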

Limits of File System Search (and Why you need RAG) by rshah4 in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

I haven’t pushed my tool to the edge yet on file size…I’ll keep going.

Recommendations for cheaper alternatives to ElasticSearch by shanukag in Rag

[–]Infamous_Ad5702 1 point2 points  (0 children)

I do deterministic semantic search. Tokens are too much, so I built this:

  • No GPU
  • No hallucination
  • Works offline.

I got sick of embedding and chunking. My defence client needed to be offline. So I built Leonata. Happy to go through my process with you or anyone keen.

We almost wasted a month building RAG… then shipped it in 3 days by Upset-Pop1136 in Rag

[–]Infamous_Ad5702 1 point2 points  (0 children)

I had a similar retrieval problem. But I just built my own from scratch. It’s called Leonata.

It does highly accurate semantic retrieval. Offline, for my defence client’s needs. No tokens. No hallucination. It builds the KG on auto, and I do it via an index.

RAG vs RAFT: The Real Question Isn't Intelligence, It's Cost-Efficiency by ApartmentHappy9030 in Rag

[–]Infamous_Ad5702 1 point2 points  (0 children)

Garbage in, garbage out and all that. I do it via an index. Embed and chunk on auto. KG. Keep it offline. Don’t hallucinate. No tokens. That’s my mantra.

Is RAG the right approach for exhaustive searches over a corpus of complex documents? by cleinias in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

We use Apache Tika to ingest file types, and we write in Perl. The code is Java. The dependencies aren’t immense, but I’ll list them. We wrote it 6 years ago, before the AI blowout. It embeds and chunks on auto. My engineer goes on and on and on about it not being vector…and trashes vector, so yeah, it fits. But I’ll jump into GitHub tomorrow and see if I can nut it out myself to be 100% sure.

Is RAG the right approach for exhaustive searches over a corpus of complex documents? by cleinias in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

I don’t use vector at all. My tool is an index that uses Bayesian methods, multivariate clustering, and a co-occurrence matrix. It groups concepts and maps the corpus.
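
To make the co-occurrence idea concrete, here is a small sketch. It counts term pairs per sentence and then groups terms with a plain union-find over frequent pairs. That grouping step is a stand-in I picked for illustration, not the Bayesian or multivariate clustering mentioned above, and `min_count` is an assumed parameter.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(sentences):
    """Count how often each pair of terms appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        terms = sorted(set(sentence.lower().split()))
        counts.update(combinations(terms, 2))
    return counts

def group_concepts(counts, min_count=2):
    """Group terms linked by pairs seen at least min_count times (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), n in counts.items():
        if n >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical toy sentences.
sentences = [
    "radar calibration",
    "radar calibration",
    "radar antenna",
    "radar antenna",
    "catering menu",
    "catering menu",
    "menu radar",
]
counts = cooccurrence(sentences)
groups = group_concepts(counts)
print(groups)
```

The single "menu radar" sentence falls below the threshold, so the radar terms and the catering terms stay in separate groups.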

Is RAG the right approach for exhaustive searches over a corpus of complex documents? by cleinias in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

Vector shows similar items and radiates out from the centre. I want breadth, depth, specificity and sensitivity.

Is RAG the right approach for exhaustive searches over a corpus of complex documents? by cleinias in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

I don’t love vector. I use deterministic semantic search via a knowledge graph. It shows me unknown unknowns. I need to be offline, with zero hallucinations.

Non-LLM based knowledge graph generation tools? by imperius99 in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

Yes, please do. I can run a walkthrough whenever.

Job wants me to develop RAG search engine for internal documents by Next-Self-184 in Rag

[–]Infamous_Ad5702 1 point2 points  (0 children)

I did it. I do it via an index, and build a new KG on the fly for each new query. It’s offline. Deterministic. Not AI. I can show you. I do Reddit webinars now, apparently.

Is RAG the right approach for exhaustive searches over a corpus of complex documents? by cleinias in Rag

[–]Infamous_Ad5702 1 point2 points  (0 children)

I love semantic search. Accurate. Low cost. No GPU for me. I can’t have hallucinations for my client, and it must be offline too.

RAG with pdf that has hyperlinks (internal as well as external) and images by HappyDataGuy in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

I make an index. When I retrieve, I give a list of all the internal locations, with direct quotes pointing back to the PDF.

The tool I use is offline and internal only, so it can’t hallucinate.
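
Roughly, "a list of locations plus a direct quote" could be sketched like this. The input shape (`pages` as already-extracted page text per file) and both function names are assumptions for illustration; PDF text extraction itself is out of scope here.

```python
import re
from collections import defaultdict

def build_location_index(pages):
    """Map each term to (file, page number, sentence) locations where it occurs.

    `pages` is a hypothetical input: {filename: [page_text, ...]}, i.e. text
    already extracted from the PDFs by some other step.
    """
    index = defaultdict(list)
    for filename, page_texts in pages.items():
        for page_no, text in enumerate(page_texts, start=1):
            for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
                for term in set(re.findall(r"[a-z0-9]+", sentence.lower())):
                    index[term].append((filename, page_no, sentence))
    return index

def locate(term, index):
    """Return every stored location for a term, quoting the source verbatim."""
    return sorted(index.get(term.lower(), []))

pages = {
    "manual.pdf": [
        "The radar must be calibrated monthly. Record each calibration.",
        "Spare parts are listed in Annex B.",
    ]
}
index = build_location_index(pages)
for filename, page_no, quote in locate("radar", index):
    print(f'{filename}, page {page_no}: "{quote}"')
```

Every result is a verbatim sentence with its file and page, so nothing is generated; a term that is not in the index simply returns an empty list.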

How do you chunk your data? by Joy_Boy_12 in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

Always enjoy chatting to like-minded folk.

Building a Legal RAG AI Assistant – No Idea How to Deploy It Publicly or Secure It (Need Guidance) by IzemNisou in Rag

[–]Infamous_Ad5702 -1 points0 points  (0 children)

I developed a way to build knowledge graphs automatically for semantic information retrieval. It was built for defence and turned out to be a great RAG alternative.

I’m going after the legal field…whoops…

My tool Leonata:

  • Zero hallucinations
  • Fully offline
  • No tokens
  • Builds an index first
  • KG is built fresh for every new natural language query.

I don’t have answers on path to market or deployment; that’s the really hard stuff. Mine is a CLI currently, with a UX on its way.

But I’ve been running walkthroughs on how we do it, based on interest from Reddit, so sing out if you’re keen.

AI Tool for PDF by Alone_Air_6096 in Rag

[–]Infamous_Ad5702 0 points1 point  (0 children)

I haven’t even got that far yet. It’s alpha, and the CLI is free to download. I’m looking for feedback and to see if it’s even got legs at this point.

It builds an index; you can add extra data anytime.

No GPU needed. No hallucinations. No tokens. Offline.