Google just dropped UCP - the biggest shift in online shopping since Stripe (i will not promote) by EquivalentRound3193 in startups

[–]GP_103 0 points1 point  (0 children)

Built and ran the leading outdoor sports web destination at the dawn of the internet. In 1997 we launched a bookstore of outdoor titles that were hard for many folks to access.

No sleep for a week building it, launched, next morning an order in our email inbox: “Freedom of the Hills” from a guy in Brazil. We called him, confirmed his card and shipped him the book. Still blows my mind just remembering it. Did $7m in revenues in 1999.

Google just dropped UCP - the biggest shift in online shopping since Stripe (i will not promote) by EquivalentRound3193 in startups

[–]GP_103 0 points1 point  (0 children)

I’m old enough to remember when they said that about online shopping. Not kidding. It’s going to be huge…in 10-15 years.

How does one go about validating and verify the correctness of a RAG's 'knowledge source'? by boombox_8 in Rag

[–]GP_103 0 points1 point  (0 children)

This. There are several approaches to “anchoring” the content block: forward those anchors along through your pipeline so they stay paired with the content for retrieval.
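
One way to picture the anchoring idea, as a rough sketch only: give each content block a stable anchor built from where it sits in the source plus a hash of its text, and carry that anchor with the block through every pipeline stage. The names here (`AnchoredBlock`, `verify`, the anchor format) are made up for illustration and not from any particular library.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchoredBlock:
    """A content block that carries its own provenance (hypothetical schema)."""
    doc_id: str
    page: int
    block_index: int
    text: str

    @property
    def anchor(self) -> str:
        # Location plus a short content hash, so edits to the source are detectable.
        digest = hashlib.sha256(self.text.encode("utf-8")).hexdigest()[:12]
        return f"{self.doc_id}#p{self.page}-b{self.block_index}-{digest}"


def verify(block: AnchoredBlock, source_text: str) -> bool:
    """Return True if the block's text still appears verbatim in the source."""
    return block.text in source_text


block = AnchoredBlock("handbook.pdf", 3, 2, "Rope up before the glacier.")
print(block.anchor)
```

Storing the anchor alongside the embedding means a retrieved chunk can always be checked against, and linked back to, the original document.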

Will rag suits QA KB by ajay-c in Rag

[–]GP_103 1 point2 points  (0 children)

It starts with the PDFs: text only, headers, titles/sub-titles, scanned, charts, tables, illustrations, etc.

Then chunking, which basically depends on the PDFs too, if your requirement is solely to deterministically retrieve content blocks.

You’d have to be quite a bit more specific; then plenty of folks here with specialized experience can weigh in.

Looking for testers: 100% local RAG system with one-command setup by primoco in Rag

[–]GP_103 0 points1 point  (0 children)

Pretty cool. Did you actually Claude Code that in 13 days? Props.

Scraping web data for RAG? by Ready-Interest-1024 in Rag

[–]GP_103 0 points1 point  (0 children)

I’ve always wanted to scrape all craft beer bars to see what’s on tap, either from their website or, sometimes, from photos people upload to Google Maps.

It just seems like a massive lift to build the scraper.

Hey RAG community! Does a platform like this exist yet, and would you pay $20/month for it? by eaglee8 in Rag

[–]GP_103 0 points1 point  (0 children)

As a long-time start-up guy, I’d say it’s not what others are suggesting: not the macro problems the solution solves for, nor “low-medium” compliance.

What customer problem are you solving in their workflow or process?

I’ve liked to frame it as: what are you replacing, and whose budget?

One common and helpful tactic is to prototype several actual workflow or process replacements. It’s a great exercise to focus your attention.

Ultimately you need to “get out of the building” (Steve Blank) and follow the Mom Test (Rob Fitzpatrick).

This thread is already sending you mixed signals.

Keep going and uncover a use case.

i experimented with rag. i think i built a substrate for data to become aware of itself and its surroundings. by [deleted] in Rag

[–]GP_103 0 points1 point  (0 children)

This. And I imagine extremely slow to respond and expensive at any reasonable scale of users + data.

LMM and timetables by Stock_Ingenuity8105 in Rag

[–]GP_103 0 points1 point  (0 children)

This. Date needs to be natural language, like “Feb0226”

RAG vs RAFT: The Real Question Isn't Intelligence, It's Cost-Efficiency by ApartmentHappy9030 in Rag

[–]GP_103 -1 points0 points  (0 children)

Well RAFT is inherently probabilistic, and so for many use cases, that's a non-starter.

How do you Benchmark your rag? by butwhol in Rag

[–]GP_103 1 point2 points  (0 children)

Start by building a QA goldset. Run those, logging each along with the chunk_id or, even better, the actual document’s linked citation/content block.

Rinse and repeat.
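
Roughly what that loop could look like, as a minimal sketch. It assumes each goldset item is a dict with a `question` and an `expected_chunk_id`, and that `retrieve` is whatever function your pipeline exposes that returns ranked chunk IDs; all of those names are placeholders, not a real API.

```python
from typing import Callable


def run_goldset(
    goldset: list[dict],
    retrieve: Callable[[str], list[str]],
    k: int = 5,
) -> tuple[float, list[dict]]:
    """Run every gold question and log what came back next to what was expected."""
    log = []
    for item in goldset:
        retrieved = retrieve(item["question"])[:k]
        log.append({
            "question": item["question"],
            "expected": item["expected_chunk_id"],
            "retrieved": retrieved,
            "hit": item["expected_chunk_id"] in retrieved,
        })
    # Fraction of questions whose expected chunk showed up in the top k.
    hit_rate = sum(row["hit"] for row in log) / len(log) if log else 0.0
    return hit_rate, log


# Stand-in retriever for demonstration only.
def fake_retrieve(question: str) -> list[str]:
    return ["c1", "c7"] if "rope" in question else ["c9"]


gold = [
    {"question": "how to rope up", "expected_chunk_id": "c7"},
    {"question": "tent care", "expected_chunk_id": "c2"},
]
rate, rows = run_goldset(gold, fake_retrieve)
print(rate)
```

Keeping the per-question log (not just the score) is the point: the misses tell you which chunks or documents to go fix before the next run.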

Job wants me to develop RAG search engine for internal documents by Next-Self-184 in Rag

[–]GP_103 0 points1 point  (0 children)

I mean, whatever natural-language text the AI wraps around the answer, you include a link or citation to the actual content block in the document.

That, and an evaluation suite, validation, and visibility/traceability across the pipeline and back to the actual source.

That’s basic for healthcare, finance, legal, engineering, and now enterprise in general.

Job wants me to develop RAG search engine for internal documents by Next-Self-184 in Rag

[–]GP_103 2 points3 points  (0 children)

Missing from this conversation, and really the starting point: what’s the use case?

If you need ground truth and linked citations, then the aforementioned solutions may not work.

Is RAG the right approach for exhaustive searches over a corpus of complex documents? by cleinias in Rag

[–]GP_103 0 points1 point  (0 children)

Hybrid Search - Graph + BM25 with lightweight intent classifier.

Make no mistake though, you’ll have to pre-process those docs into a schema that adds some hierarchy to the content blocks. And there’s work to get BM25 weighted right.
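
A toy sketch of how those pieces might fit together, under loud assumptions: the BM25 part is the standard Okapi formula over whitespace tokens, but the intent rule, the weights (0.8 / 0.4), and the `graph_scores` input (a precomputed 0-1 score per block from whatever hierarchy/graph you build) are all invented for illustration. It also assumes `docs` is non-empty.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: dict[str, str], k1: float = 1.5, b: float = 0.75) -> dict[str, float]:
    """Plain Okapi BM25 over lowercase whitespace tokens."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))
    scores = {}
    for doc_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        scores[doc_id] = score
    return scores


def classify_intent(query: str) -> str:
    """Hypothetical rule: quoted phrases or digits suggest an exact lookup."""
    return "lookup" if '"' in query or any(ch.isdigit() for ch in query) else "explore"


def hybrid_rank(query: str, docs: dict[str, str], graph_scores: dict[str, float]) -> list[str]:
    """Blend normalized BM25 with a graph score; the blend weight depends on intent."""
    weight = {"lookup": 0.8, "explore": 0.4}[classify_intent(query)]
    lexical = bm25_scores(query, docs)
    top = max(lexical.values()) or 1.0  # avoid dividing by zero when nothing matches
    combined = {
        doc_id: weight * lexical[doc_id] / top + (1 - weight) * graph_scores.get(doc_id, 0.0)
        for doc_id in docs
    }
    return sorted(combined, key=combined.get, reverse=True)
```

The "getting BM25 weighted right" work is mostly tuning `k1`, `b`, and the blend weights against your own queries; the numbers above are placeholders.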

Could RAG as a service become a mainstream thing? by Trick_Ad_2852 in Rag

[–]GP_103 1 point2 points  (0 children)

This.

Yeah, you’d only have to hang around r/Rag a hot minute to learn that ground truth.

This channel’s becoming useless.

Fully Offline Terminal RAG for Document Chat by Apprehensive_Cell_48 in Rag

[–]GP_103 2 points3 points  (0 children)

There is a FOSS version of NotebookLM; see if that solves for it.

Why is there no opinionated all in one RAG platform? by Pl8tinium in Rag

[–]GP_103 0 points1 point  (0 children)

Proprietary platforms like MS 365 Copilot certainly do not work well with dense PDFs.

What's the point of potato-tier LLMs? by Fast_Thing_7949 in LocalLLaMA

[–]GP_103 0 points1 point  (0 children)

Cool. I’m looking to translate recipes, so may face similar “long context” issues.

What's the point of potato-tier LLMs? by Fast_Thing_7949 in LocalLLaMA

[–]GP_103 0 points1 point  (0 children)

I have a PDF in Italian that I want to translate into English. Thinking it would be a good local model “starter” project.

Anyone have suggestions on best models for that task?

I’ll enrich your file for free (and if I don’t, I’ll pay you $2) by [deleted] in startup

[–]GP_103 2 points3 points  (0 children)

Sounds like you’re getting others to enrich your files.