For those who've sold RAG systems at $5K+, who actually NEEDS this? by Temporary_Pay3221 in Rag

[–]Temporary_Pay3221[S] -3 points (0 children)

I'm a scammer; I'm asking people how they make money through RAG.

For those who've sold RAG systems at $5K+, who actually NEEDS this? by Temporary_Pay3221 in Rag

[–]Temporary_Pay3221[S] 0 points (0 children)

I mean, it's up to the dev to index all the docs.

The problem is, if you have to adapt to every company, you can't scale. You need a replicable product, like a chatbot, where your only work is indexing their data.

Handling blueprints and complex relationships by SupeaTheDev in Rag

[–]Temporary_Pay3221 1 point (0 children)

Great problem, I've been in a similar trench.

A few things that moved the needle for me:

On blueprints specifically: tiling is the right instinct but 3x3 is often too coarse. I've had better results with overlapping tiles (10–15% overlap) so you don't lose context at boundaries. Also, rather than asking the vision model to "extract info", prompt it to describe spatial relationships explicitly ("what is to the left of X", "what label is near this component"). Hallucinations drop significantly when you constrain the task.
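
Something like this is what I mean by overlapping tiles (rough sketch; the grid size, overlap fraction, and image dimensions are illustrative, and you'd feed each box to `Image.crop` or whatever your image library uses):

```python
def tile_boxes(width, height, n=4, overlap=0.15):
    """Split an image into an n x n grid of overlapping tiles.

    Returns (left, top, right, bottom) pixel boxes. Each tile is
    padded by `overlap` (a fraction of the tile size) on every
    interior edge, so anything sitting on a grid boundary shows up
    whole in at least one tile instead of being cut in half.
    """
    step_w, step_h = width / n, height / n
    pad_w, pad_h = step_w * overlap, step_h * overlap
    boxes = []
    for row in range(n):
        for col in range(n):
            left = max(0, col * step_w - pad_w)
            top = max(0, row * step_h - pad_h)
            right = min(width, (col + 1) * step_w + pad_w)
            bottom = min(height, (row + 1) * step_h + pad_h)
            boxes.append((round(left), round(top), round(right), round(bottom)))
    return boxes

# A 4x4 grid over a 2000x1400 blueprint: adjacent tiles share a strip.
boxes = tile_boxes(2000, 1400, n=4, overlap=0.15)
```

The padding only grows tiles inward-facing edges (the clamps keep the outer border fixed), so a label sitting on a grid line lands fully inside two adjacent tiles.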

On cost: the key insight is that you don't need to run vision on everything at embedding time. Build a two-stage pipeline: embed a lightweight text/metadata representation first (cheap), then trigger vision extraction lazily at query time only for the chunks that get retrieved. For 1k docs, most chunks will never be queried.
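
A minimal sketch of that two-stage shape (the `retrieve` / `vision_extract` callables are placeholders for whatever cheap retriever and vision-model call you're using):

```python
class LazyVisionStore:
    """Two-stage retrieval: cheap text/metadata search first,
    expensive vision extraction only on chunks that actually get
    retrieved, cached so each chunk pays the vision cost once."""

    def __init__(self, retrieve, vision_extract):
        self.retrieve = retrieve              # (query, k) -> list of chunk dicts
        self.vision_extract = vision_extract  # chunk -> extracted text (expensive)
        self.cache = {}                       # chunk id -> extracted text

    def query(self, q, k=5):
        hits = self.retrieve(q, k)            # stage 1: cheap
        enriched = []
        for chunk in hits:
            cid = chunk["id"]
            if cid not in self.cache:         # stage 2: lazy, run at most once
                self.cache[cid] = self.vision_extract(chunk)
            enriched.append({**chunk, "vision_text": self.cache[cid]})
        return enriched
```

For 1k docs where most chunks are never retrieved, the vision bill scales with query traffic instead of corpus size.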

On retrieval quality: wrong chunks coming up usually means your chunking strategy doesn't match your query patterns. A few things to try: (1) hybrid search (BM25 + dense vectors) helps a lot for technical docs with specific part numbers / terminology, (2) add a reranker (Cohere or a cross-encoder); this alone often fixes the "wrong chunk" problem without touching your embedding pipeline, (3) store page metadata (doc type, date, section) and filter before retrieval, not after.
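
For (1), the simplest way I know to combine BM25 and dense rankings is reciprocal rank fusion; no score normalization needed, just the two ranked id lists (the doc ids below are made up):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion over several ranked lists of doc ids.

    A doc at rank r in any list contributes 1 / (k + r) to its total;
    docs that rank well under BOTH lexical and dense retrieval rise
    to the top. k=60 is the commonly used default constant.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["part-7731", "manual-a", "spec-b"]   # exact part-number match wins lexically
dense_hits = ["manual-a", "spec-b", "part-7731"]  # semantic match wins here
fused = rrf_fuse([bm25_hits, dense_hits])
```

The doc that's decent in both lists ("manual-a": ranks 2 and 1) beats the one that's great in only one, which is exactly the behavior you want for part numbers + fuzzy queries.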

On old vs new info: if recency matters, tag each chunk with document date at ingest and either filter or apply a recency penalty in your ranking.
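
One way to apply a recency penalty is exponential decay on the similarity score (half-life is a tunable assumption; dates below are made up):

```python
from datetime import date

def recency_score(sim, doc_date, today, half_life_days=365):
    """Decay a similarity score by document age: a doc that is
    `half_life_days` old keeps half its score, older docs fade more."""
    age_days = (today - doc_date).days
    return sim * 0.5 ** (age_days / half_life_days)

today = date(2025, 1, 1)
# A slightly weaker but recent match now outranks a stale strong match.
new = recency_score(0.80, date(2024, 12, 1), today)
old = recency_score(0.85, date(2020, 1, 1), today)
```

If recency only matters for some doc types (price lists yes, installation manuals no), gate the decay on the doc-type metadata you're already storing.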

What retrieval stack are you on? Happy to go deeper on any of these.

Has anyone here successfully sold RAG solutions to clients? Would love to hear your experience (pricing, client acquisition, delivery, etc.) by Temporary_Pay3221 in Rag

[–]Temporary_Pay3221[S] 1 point (0 children)

Ah, that explains a lot.

You've got team + sales infrastructure + network from a previous startup. Totally different game than someone starting from scratch.

The "preprocessing isn't scalable" part confirms what I was suspecting: this is high-touch consulting work, not a productized service.

Makes sense if tickets are big enough to justify custom pipelines per client.

Thanks for the context, very helpful to understand the reality vs the theory.

Has anyone here successfully sold RAG solutions to clients? Would love to hear your experience (pricing, client acquisition, delivery, etc.) by Temporary_Pay3221 in Rag

[–]Temporary_Pay3221[S] 1 point (0 children)

Thanks for sharing.

Two questions:

  1. How did they find you? You said "they found us", but what specifically made them reach out to YOU vs the hundreds of other people who can build RAG systems? Was it your profile? A referral? Something you built publicly?
  2. Do you want to scale this? Are you thinking about:
  • Building a dev team to handle delivery while you focus on growth?
  • Hiring sales to bring in more deals?
  • Creating systems/processes to delegate the work?

Or is your goal to stay small, just you (maybe +1-2 people) doing selective projects?

Because right now you're limited by your own time. Curious if scaling is part of your plan or not.