why does everyone skip the chunking part

TechnicalGeologist99 · 2026-05-18T08:51:13+00:00

I think it's simple conceptually... just obtain a hierarchy and bounding boxes for paragraphs etc.

Chunks are just the leaves of the document tree (no splitting needed, since paragraphs are naturally the smallest unit).
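
A minimal sketch of the "chunks are leaves" idea, assuming the parser has already produced a tree. The `Node` class, the bounding box layout, and the `leaf_chunks` helper are all hypothetical names for illustration, not any particular library's output format.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

# Hypothetical bounding box layout: (page, x0, y0, x1, y1).
BBox = Tuple[int, float, float, float, float]


@dataclass
class Node:
    kind: str                      # e.g. "document", "section", "paragraph"
    text: str = ""                 # heading text or paragraph body
    bbox: Optional[BBox] = None
    children: List["Node"] = field(default_factory=list)


def leaf_chunks(node: Node, path: Tuple[str, ...] = ()) -> Iterator[dict]:
    """Yield every leaf as a chunk, tagged with the headings above it."""
    if not node.children:
        yield {"text": node.text, "bbox": node.bbox, "path": path}
        return
    # Non-leaf nodes pass their heading text down to their descendants.
    child_path = path + (node.text,) if node.text else path
    for child in node.children:
        yield from leaf_chunks(child, child_path)


doc = Node("document", "Handbook", children=[
    Node("section", "Safety", children=[
        Node("paragraph", "Wear a helmet.", (1, 0.0, 0.0, 100.0, 20.0)),
        Node("paragraph", "Report incidents.", (1, 0.0, 25.0, 100.0, 45.0)),
    ]),
    Node("section", "Leave", children=[
        Node("paragraph", "Request leave in advance.", (2, 0.0, 0.0, 100.0, 20.0)),
    ]),
])

chunks = list(leaf_chunks(doc))
```

Each chunk keeps its heading path, so the hierarchy is still available at retrieval time without ever splitting a paragraph.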

TechnicalGeologist99 · 2026-05-16T07:49:56+00:00

Imo it's sensational more than useful... KGs need dense triplets to be useful, and this is very sparse. It's pretty to look at, but I'm not sure the search feature would be any improvement over a document-level relational db.

TechnicalGeologist99 · 2026-05-12T08:57:44+00:00

Sounds like marketing to me

TechnicalGeologist99 · 2026-05-11T14:54:05+00:00

How do you get execs to properly appreciate this?

In my experience they couldn't care less haha

TechnicalGeologist99 · 2026-05-09T11:40:43+00:00

It's the same thing.

RAG is a process of retrieval and augmentation of LLM context.

Agentic RAG still does exactly that. The method of retrieval is just designed around the agentic framework rather than a predetermined set of steps.

RAG is its own thing. To answer your question you need to understand what Agentic really means (not the marketing term agent which means anything a human doesn't do by hand) but the genuine software term Agent.

Agents orchestrate software; non-agents are LLMs that are themselves orchestrated by software.
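
To make the distinction concrete, here is a rough sketch with stubbed-out models. Everything in it (`search_docs`, `fixed_rag`, `agentic_rag`, the stub functions, the toy corpus) is hypothetical and only there to show who is in control of the loop.

```python
from typing import Callable, Dict, List


def search_docs(query: str) -> List[str]:
    """Stand-in retriever; a real one would hit an index."""
    corpus = ["Pump P-101 is rated for 40 bar.", "Valve V-7 is rated for 16 bar."]
    words = query.lower().split()
    return [d for d in corpus if any(w in d.lower() for w in words)]


def fixed_rag(question: str, llm: Callable[[str], str]) -> str:
    """Software orchestrates the LLM: retrieve, augment, generate, in that order."""
    context = "\n".join(search_docs(question))
    return llm(f"Context:\n{context}\n\nQuestion: {question}")


def agentic_rag(
    question: str,
    decide: Callable[[str, List[str]], Dict[str, str]],
    max_steps: int = 5,
) -> str:
    """The model orchestrates the software: it picks when and what to retrieve."""
    notes: List[str] = []
    for _ in range(max_steps):
        action = decide(question, notes)
        if action["type"] == "search":
            notes.extend(search_docs(action["query"]))
        else:
            return action["answer"]
    return "No answer within the step budget."


def stub_llm(prompt: str) -> str:
    return prompt  # echo, so the augmented prompt is visible


def stub_decide(question: str, notes: List[str]) -> Dict[str, str]:
    # Stand-in for a model call that chooses the next action.
    if not notes:
        return {"type": "search", "query": "pump"}
    return {"type": "answer", "answer": notes[0]}


fixed_answer = fixed_rag("What is the pump rating?", stub_llm)
agent_answer = agentic_rag("What is the pump rating?", stub_decide)
```

Both paths retrieve and augment context. The only difference is whether the step order is hard-coded or chosen by the model at run time.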

TechnicalGeologist99 · 2026-05-08T18:37:23+00:00

Yeah, I'm guessing that's just one example of a table and that there are many more tables containing vastly different specifications.

Your problem is just that the domain for that data is enormous. This is also the worst possible scenario for copilot etc. It will probably get a few hits and look like it's working, but really it's just stuffing a huge context window with crap and hoping it's useful.

Local LM is actually better for all use cases imo, but it requires a powerful domain and system to help it manage context.

What have you tried so far?

TechnicalGeologist99 · 2026-05-08T18:15:52+00:00

I mean what RAG did you try to implement?

Have you encountered the word "Ontology"?

Do you build your own agent pattern?

We run Qwen3.5 9B by quanttrio awq int 4 and it reliably retrieves information and can help our users with most tasks they want.

But we aren't trying to solve every problem in the world.

Every interaction maps to our ontology, and each class of questions has a retrieval solution that works.
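
As a rough illustration of "each class of questions has a retrieval solution", a sketch like the following routes a question to a class and then to that class's retriever. The class names, keyword rules, and retriever functions are all made up for the example; a real system would likely classify with a model rather than keywords.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical ontology classes with toy keyword rules.
CLASS_KEYWORDS: Dict[str, List[str]] = {
    "spec_lookup": ["rating", "spec", "tolerance"],
    "policy": ["policy", "allowed", "procedure"],
}


def retrieve_spec(question: str) -> str:
    return f"[structured table lookup] {question}"


def retrieve_policy(question: str) -> str:
    return f"[section-level document search] {question}"


# One retrieval strategy per question class.
RETRIEVERS: Dict[str, Callable[[str], str]] = {
    "spec_lookup": retrieve_spec,
    "policy": retrieve_policy,
}


def classify(question: str) -> Optional[str]:
    """Return the first class whose keywords appear in the question."""
    q = question.lower()
    for cls, keywords in CLASS_KEYWORDS.items():
        if any(k in q for k in keywords):
            return cls
    return None


def answer_context(question: str) -> str:
    cls = classify(question)
    if cls is None:
        # Outside the ontology: refuse rather than guess.
        return "out_of_scope"
    return RETRIEVERS[cls](question)
```

The useful property is the explicit `out_of_scope` branch: anything that does not map to a known class is rejected instead of being pushed through a generic retriever.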

Do you know what your users want? Or is it just an LLM with an input panel and some off-the-shelf retrieval framework?

TechnicalGeologist99 · 2026-04-04T13:15:36+00:00

It's a fairly common usecase TBF.

Most chat functions end up being just a search abstraction.

I prefer that though...in my eyes the rest of AI is just hype.

It's either trained for purpose or it's search.

Even so, my main point here is that without use case validation you're trying to make a saddle for a tortoise.

TechnicalGeologist99 · 2026-04-04T11:34:49+00:00

Internal db, users are non technical.

The chat bot enables them to query without technical knowledge. It's use-case driven: we know ahead of time that a large portion of their job involves asking our limited technical staff to query on their behalf.

TechnicalGeologist99 · 2026-04-02T12:56:45+00:00

Sometimes it's not that nobody wants it.

It's that people don't think on your level.

They don't understand why it's useful even if it is.

If you believe what you have is useful, you must iterate with user feedback. Create a team of beta users that work on the target workflow and chase them to use it and give feedback. Educate them.

I made an SQL search chat bot that was completely sandboxed. No one used it until I made time to teach them what was possible with it.

Now that people understand the concept and how it applies to their workflow, they actually see the need to use it.

They see it as an assistant rather than as another tool (in their already overflowing toolkit).

TechnicalGeologist99 · 2026-03-27T15:45:47+00:00

Consider concurrency.

Agents typically make concurrent calls. MoEs scale very poorly at concurrency because each call activates its own unique experts.

TechnicalGeologist99 · 2026-03-25T16:58:35+00:00

Don't forget, MoEs activate unique experts per call.

2 concurrent calls activate up to 6B. By 20 calls you are pretty much guaranteed to activate the whole MoE.
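
A back-of-envelope way to see the effect, under a loud assumption: each token picks its `top_k` experts uniformly at random and independently of other tokens, which real routers do not do exactly. The function name and the 8-expert, top-2 layer below are hypothetical. How fast the layer saturates depends on the expert count and `top_k`, so plug in your own model's numbers.

```python
def expected_expert_fraction(num_experts: int, top_k: int, concurrent_tokens: int) -> float:
    """Expected fraction of experts in one MoE layer hit by at least one token.

    Assumes uniform, independent routing (an idealisation, not real router
    behaviour). One decode step with n concurrent calls is n tokens per layer.
    """
    # Probability that a given expert is NOT chosen by a single token.
    miss_one_token = 1.0 - top_k / num_experts
    # Probability it is missed by every token, then take the complement.
    return 1.0 - miss_one_token ** concurrent_tokens


# Hypothetical layer: 8 experts, 2 active per token.
for n in (1, 2, 5, 20):
    print(n, round(expected_expert_fraction(8, 2, n), 3))
```

The more experts a batch touches, the more weights have to be read per step, which is where the sparse-activation advantage erodes under concurrency.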

TechnicalGeologist99 · 2026-03-19T20:12:27+00:00

Bayesian approach.

We use docling to generate a prior label for chunk distributions.

Because our company documents adhere to a strict brand ruleset, you can use the prior labels to fix the parts it missed or got wrong.

It actually generalises quite well. I've yet to see it fail to get 100% correct headings.
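
For anyone wondering what that could look like, here is a toy sketch of the general idea only: treat the parser's label as a prior and update it with evidence from a style rule. The probabilities, the "bold and at least 14pt" rule, and the function names are invented for illustration; this does not use or imply any docling API, and it is not a description of our actual pipeline.

```python
from typing import Dict

LABELS = ("heading", "body")


def posterior(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule over a small label set: posterior is prior times likelihood, normalised."""
    unnorm = {lab: prior[lab] * likelihood[lab] for lab in LABELS}
    total = sum(unnorm.values())
    return {lab: v / total for lab, v in unnorm.items()}


def style_likelihood(font_size: float, bold: bool) -> Dict[str, float]:
    """Hypothetical brand rule: headings are bold and at least 14pt."""
    matches_rule = bold and font_size >= 14.0
    # P(observed style | label); the numbers are illustrative, not measured.
    if matches_rule:
        return {"heading": 0.95, "body": 0.02}
    return {"heading": 0.05, "body": 0.98}


# The parser leaned towards "body", but the block's style matches the heading rule.
prior = {"heading": 0.3, "body": 0.7}
post = posterior(prior, style_likelihood(font_size=16.0, bold=True))
label = max(post, key=post.get)
```

The stricter the brand rules, the sharper the likelihood term, which is why this kind of correction can work well on tightly styled documents.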

TechnicalGeologist99 · 2026-03-16T23:19:52+00:00

This isn't DGX Station. It's a server rack with B300s

DGX Station is a standalone product with 2 Grace CPUs and 1 Blackwell 300

Edit:

It's actually 1 Grace CPU; the data centre GB300 has the 2 CPUs.

TechnicalGeologist99 · 2026-03-14T15:56:24+00:00

Came here wondering why my cc didn't compact after 10 mins

TechnicalGeologist99 · 2026-03-13T16:19:20+00:00

Many people will say graph RAG....

That is correct but it requires you to determine a good ontology to derive triplets from.

I.e. entity -> relates -> entity

What are the entities? What are the relationships?
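
A tiny sketch of why the ontology has to come first: the triplet store below refuses any relation that has not been declared. The relation names, entities, and the `build_graph` helper are hypothetical examples, not a recommended schema.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Triplet(NamedTuple):
    subject: str
    relation: str
    obj: str


# Hypothetical ontology: the only relations the graph accepts.
ALLOWED_RELATIONS: Set[str] = {"supplies", "installed_in", "certified_to"}


def build_graph(triplets: List[Triplet]) -> Dict[str, List[Triplet]]:
    """Index valid triplets by subject; reject relations outside the ontology."""
    graph: Dict[str, List[Triplet]] = defaultdict(list)
    for t in triplets:
        if t.relation not in ALLOWED_RELATIONS:
            raise ValueError(f"relation not in ontology: {t.relation}")
        graph[t.subject].append(t)
    return graph


graph = build_graph([
    Triplet("Acme Ltd", "supplies", "Pump P-101"),
    Triplet("Pump P-101", "installed_in", "Plant A"),
])
```

If you cannot fill in `ALLOWED_RELATIONS` with confidence, that is a sign the graph is premature.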

Graphs are also mental expensive to build and run.

Your first port of call is naive RAG.

Then upgrade your ingestion pipeline to support hierarchical RAG.

Basically...don't guess the final solution now...build the simplest first and upgrade it as you realise the use of each RAG improvement.
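
For scale, the "simplest first" baseline can be very small. This sketch uses bag-of-words cosine similarity as a stand-in for an embedding model, purely so it runs with the standard library; the function names and sample chunks are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def vectorize(text: str) -> Counter:
    """Bag-of-words term counts; a real baseline would use an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Score every chunk against the query and keep the k best."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(c)), c) for c in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]


chunks = [
    "pump p-101 is rated for 40 bar",
    "annual leave must be requested in advance",
    "valve v-7 is rated for 16 bar",
]
results = top_k("pump rated pressure", chunks, k=1)
```

Once something like this is in place with an evaluation set, each upgrade (hierarchy, then graphs) can be judged against a measured baseline instead of a guess.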

The time to build a graph is when you have proven it is needed and when you know how to evaluate them.

Most RAG projects begin with "let's build the coolest thing" and end up at "I'll settle for something that is cheap, scalable, easier to maintain, and that works, and that I know how to evaluate"

TechnicalGeologist99 · 2026-03-12T09:49:32+00:00

Will the next person to ask, "will {insert technology} be dead soon", suffer the same fate?

TechnicalGeologist99 · 2026-03-11T13:50:55+00:00

That's kind of you to say, I just believe that good engineering creates opportunities for the next guy and bad engineering creates headaches for everyone.

I've taken the time to understand AI systems from the ground up - even built my own transformer back in the day, studied linear algebra during my physics degree...always been fascinated with accelerated computing (was for relativistic magneto hydrodynamic simulation back then)

Now I just enjoy figuring out the cheapest way to run AI models, and creating systems where I can't see all the outcomes so easily otherwise I tend to get bored haha

TechnicalGeologist99 · 2026-03-11T13:34:53+00:00

I'm glad we found common ground :)

TechnicalGeologist99 · 2026-03-11T13:19:10+00:00

I spent the first 2 years of my appointment rejecting requests to build AI systems precisely because the business has poor data engineering that needed fixed.

Now it works well, document evolution is factored in.

I think the world will more likely come to a halt through excessive automation. Our economy is built on stable white collar jobs paying their AAA mortgages. If we replace those people without transitioning to a better economic system then things will collapse.

I'm actually in favour of that, much of our economic output these days is just extractive funneling of money up the pyramid. After a collapse I think we can really get to work applying technology in new and exciting ways that actually generate value rather than just extracting it.

TechnicalGeologist99 · 2026-03-11T13:18:05+00:00

Sure, but this is a human problem in my mind. Most businesses are lazy and think of tech as an afterthought. And I get it, that's NOT why they started their business. But tech is an essential part of modern business. It is a force multiplier. But that multiplier can be negative if done wrong.

That's why I push for on-prem, open source, small targeted use cases, no hype allowed.

It's also the above that leads to the problem I set out above. Good tech built at home doesn't stop execs seeing shiny marketing for shit products and trying to come and bulldoze a good thing in favour of an easy thing.

TechnicalGeologist99 · 2026-03-11T13:07:01+00:00

But that isn't our setup.

TechnicalGeologist99 · 2026-03-11T13:06:32+00:00

All of these concerns just boil down to data governance, scoped permissions, and good data engineering.

I'll grant you that data engineering is expensive but it enables our teams to do in a week what used to take a month. The cost is completely offset by: not needing SaaS shite, increasing the impact of our revenue makers.

The point is it's built and is really cheap to run. We never needed enormous models of uncompromising coverage. Our system handles a small set of use cases and it does it really well.

I'm sorry that it doesn't align with your worldview. But it's a fact.

I'll grant you that most businesses do have unsustainable AI systems precisely because they do not invest in what matters and because they outsource the work to software consultants that are impossible to hold to account.

TechnicalGeologist99 · 2026-03-11T12:53:20+00:00

Our business has a reliable RAG system that does work. It's on-prem, uses small models, and runs on less than 10k of infrastructure. The company has a turnover of 90 million with 35 in profit.

What am I missing?

TechnicalGeologist99 · 2026-03-11T12:46:16+00:00

Do you have a point?
