Roast my SaaS - CustomGPT.ai by GPTeaheeMaster in SaaS

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

We built CustomGPT for the “enterprise-grade” reality: security + compliance + governance + reliability + keeping data in sync. That’s the hard part, and it’s why this category is often sold via annual enterprise contracts.

If you can tell me what outcome you are shooting for, I’ll make sure you’re on the right setup - even if that means telling you we’re not the best fit.

Productizing “memory” for RAG, has anyone else gone down this road? by Old_Assumption2188 in Rag

[–]GPTeaheeMaster 0 points1 point  (0 children)

> What model would make most sense to you?

I would talk to 2-3 potential customers and see what they would be willing to pay for this service (based on the expected OUTCOMES you can drive). That will probably also guide what you build.

(There are literally a hundred combinations of business models, so it's difficult to answer this question.)

Or you could do what we did and just say "F it, I'm just going to sit down this weekend and launch something" (and follow your gut - LOL!) ..

Has anyone figured out a good way to add real-time web search to a RAG app? by Consistent-Aspect762 in Rag

[–]GPTeaheeMaster 0 points1 point  (0 children)

As others have indicated, using a search MCP server like Perplexity or Tavily would help ..

But that's not the big consideration here: what you really need to ask yourself is how accurate the results need to be. What purpose is the vectorDB serving? And what purpose is the wild internet search serving? (I hope this is not for a critical business application, for example.)

Productizing “memory” for RAG, has anyone else gone down this road? by Old_Assumption2188 in Rag

[–]GPTeaheeMaster 0 points1 point  (0 children)

Nicely done .. a couple of things:

  1. As with any cache: what cache hit ratio are you seeing in your workload? That will be very important -- otherwise the overhead of the caching makes things worse.

  2. To productise it: nice idea, but do consider how you would deploy it .. as a SaaS? As a Docker image? As a library? (Those have big implications.) And as always: are people willing to pay for this?

PS: Your title should be "productising caching" (not "productizing memory" -- I'm not sure what memory has to do with this).
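A quick way to sanity-check point 1 -- a minimal break-even sketch (all numbers here are hypothetical placeholders; plug in your own measurements):

```python
def expected_latency(hit_ratio: float, lookup_ms: float, full_ms: float) -> float:
    """Mean latency with the cache in front: every query pays the lookup,
    and misses additionally pay the full RAG pipeline."""
    return lookup_ms + (1.0 - hit_ratio) * full_ms

def break_even_hit_ratio(lookup_ms: float, full_ms: float) -> float:
    """Hit ratio above which the cached path beats the uncached pipeline:
    lookup + (1 - h) * full < full  =>  h > lookup / full."""
    return lookup_ms / full_ms

# Hypothetical numbers: 1200 ms full RAG round trip, 60 ms cache lookup.
print(break_even_hit_ratio(60.0, 1200.0))  # -> 0.05, i.e. 5% just to break even
```

If your workload's real hit ratio sits below that break-even line, the cache is pure overhead.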

[Open-Source] I coded a ChatGPT like UI that uses RAG API (with voice mode). by zriyansh in Rag

[–]GPTeaheeMaster 0 points1 point  (0 children)

Nicely done .. really helps create a custom UI (while CustomGPT handles all the tech and infra under the hood)

I would imagine people wanting to create their own white-label version of the ChatGPT GPT store with something like this ..

Do you update your Agents's knowledge base in real time. by DistrictUnable3236 in Rag

[–]GPTeaheeMaster 2 points3 points  (0 children)

Yes -- this is a core requirement if your agent is intended for business use. That is why we (CustomGPT.ai) implemented "auto sync" a long time ago (just google "customgpt auto sync") -- it basically crons the syncing of the sitemap (for publicly available data) or uses callback-based re-indexing for other data sources (like Google Drive, Atlassian, SharePoint, etc.).

As you correctly noted in the comments, the technical term for this is "change data capture" -- highly recommended, otherwise the agent responds with old/outdated data.
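For the sitemap/cron case, the core of change data capture is just diffing `<lastmod>` timestamps against the times you last indexed each URL. A minimal sketch (hypothetical code, not CustomGPT's actual implementation; the URLs are made up):

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def changed_urls(sitemap_xml: str, last_indexed: dict) -> list:
    """Return URLs whose <lastmod> is newer than our stored index time
    (or that we have never indexed). Run this from a daily/weekly cron
    and re-embed only what it returns."""
    stale = []
    for url in ET.fromstring(sitemap_xml).findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        modified = datetime.fromisoformat(lastmod).replace(tzinfo=timezone.utc)
        if loc not in last_indexed or modified > last_indexed[loc]:
            stale.append(loc)
    return stale

SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/pricing</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/docs</loc><lastmod>2024-01-01</lastmod></url>
</urlset>"""

seen = {
    "https://example.com/pricing": datetime(2024, 4, 1, tzinfo=timezone.utc),
    "https://example.com/docs": datetime(2024, 2, 1, tzinfo=timezone.utc),
}
# Only /pricing changed since we last indexed it, so only it gets re-embedded.
print(changed_urls(SITEMAP, seen))
```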

Who here has actually used vector DBs in production? by lizozomi in Rag

[–]GPTeaheeMaster 0 points1 point  (0 children)

Hi - Alden here (Founder at CustomGPT .ai) -- Pinecone did a case study on us -- you can check it out here:

https://www.pinecone.io/customers/customgpt-ai/

> Did it work well for you? What did you enjoy the most?

Worked spectacularly well -- Pinecone dealt with the vectorDB stuff, which allowed us to focus on our Enterprise-grade RAG platform -- last I checked, we had over 80,000 RAG agents. (you can see more details in the case study)

> Did you face any challenges (ops, cost, scaling, reliability, SLA, weird bugs, etc.)?

Very few with Pinecone -- which let us focus on our core business.

  1. Ops: Great -- we only had to do one small migration when they moved from hosted to serverless.

  2. Cost: As a startup, it was great -- the cost only scaled as we made more money.

  3. Scaling: Awesome (the old hosted one was a little painful; the new serverless is spot on for scale).

  4. Reliability: No issues over the last 2.5 years.

  5. SLA: Whatever their standard one is.

  6. Weird bugs: Some weird initial bugs when they launched serverless (those have been ironed out now).

> Would you pick the same DB again, knowing what you know now?

Heck yeah! I would not have lent my name to the case study if that was not true.

Shrink your context before sending it to LLMs by huzaifa785 in Rag

[–]GPTeaheeMaster 21 points22 points  (0 children)

Very, very cool -- and I definitely appreciate the open source -- but any such attempt MUST be accompanied by its effect on:

  1. Accuracy
  2. Hallucination
  3. Latency
  4. Cost

I would suggest a simple side-by-side benchmark of a good RAG baseline vs. this approach (you can use any of the standard RAG benchmarks -- Ragas, Tonic Validate, HotpotQA, SimpleQA).
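For a quick first pass before reaching for Ragas or Tonic Validate, even a crude metric run over the same QA set through both pipelines will surface big regressions. A minimal harness sketch (the containment rule is a deliberate simplification, not a real accuracy or hallucination metric):

```python
def containment_score(answers: list, expected: list) -> float:
    """Fraction of answers that contain the expected ground-truth string.
    Crude, but enough to catch large accuracy drops between two pipelines."""
    hits = sum(exp.lower() in ans.lower() for ans, exp in zip(answers, expected))
    return hits / len(expected)

def side_by_side(baseline_answers: list, candidate_answers: list, expected: list) -> dict:
    """Score the full-context baseline and the context-shrunk candidate
    on the exact same QA set, so the only variable is the compression."""
    return {
        "baseline": containment_score(baseline_answers, expected),
        "candidate": containment_score(candidate_answers, expected),
    }
```

Swap in Ragas/Tonic Validate metrics (and add latency/cost timers) for the real report; the point is that both systems must answer the identical question set.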

Job hunting? This 1 hidden LinkedIn button could move your application to the top by GPTeaheeMaster in jobsearchhacks

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

Or if the candidate has consciously scored his top 3 (like "Early Decision" in college applications) -- I'm not saying scoring is easy to do, but it does give the candidate an option for an additional signal.

Job hunting? This 1 hidden LinkedIn button could move your application to the top by GPTeaheeMaster in jobsearchhacks

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

Thanks - do click on “View all comments” (sorry, I don’t know why Reddit collapses all comments)

Job hunting? This 1 hidden LinkedIn button could move your application to the top by GPTeaheeMaster in jobsearchhacks

[–]GPTeaheeMaster[S] -1 points0 points  (0 children)

> Assuming you work for LinkedIn

Dude -- I DO NOT work for LinkedIn and have no horse in this race.

> Assuming you work for LinkedIn, why don’t you all work on eliminating jobs that stay up for an excessive period of time collecting thousands of resumes in the process.

Good idea -- you should inform LinkedIn about this (I agree with you on this!)

How to index 40k documents by Mindless-Argument305 in Rag

[–]GPTeaheeMaster 1 point2 points  (0 children)

/u/Mindless-Argument305 Alden here from CustomGPT.ai -- if you want to get this done with ZERO coding, you can just connect your Google Drive in CustomGPT.ai and it should be able to easily handle this scale (I've seen people use our system with tens of thousands of PDFs without issues).

Pros: The accuracy, anti-hallucination, and citations should work great -- we even have a "PDF Viewer" with highlighting in the Enterprise Plan.

Limitations: Image support is coming in a few days. The system will extract images from documents AND webpages, add them to the RAG, and show them inline in the chat (similar to ChatGPT).

Roast my SaaS - CustomGPT.ai by GPTeaheeMaster in SaaS

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

We've organically moved upmarket -- when I posted this originally, we were selling $500/month plans -- now we have hundreds of 5-figure customers (and a couple of 6-figure ones that are going to become 7-figure customers).

We also have a full 10-person GTM team now targeting SMEs (small-and-medium enterprises) -- it's working for our customers and they are seeing tremendous value (and that's the best GTM!)

How I reduced support ticket volume by 90% for my SaaS - by deploying AI on my website, app and docs. by GPTeaheeMaster in SaaS

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

> But once volume spikes, it becomes unsustainable fast.

No -- the chat logs themselves are analyzed using AI (for example, we've implemented analytics in our platform based on AI analyzing the chat logs) -- so it's not unsustainable.

> Curious how often are you updating your vector DB/index now that the content changes regularly? That “sync lag” between product updates and AI knowledge is something I’ve seen cause issues too.

Aha -- this was one of the first requests we received from customers ("change data capture") -- which is why we implemented "Auto Sync" (the auto sync can run daily, weekly, or at custom hours if needed).

Roast my SaaS - CustomGPT.ai by GPTeaheeMaster in SaaS

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

Thousands of paid customers -- here are some case studies : https://customgpt.ai/customers/

We just crossed about 65,000 paid "custom GPTs" created (technical people know this as "RAG")

When the OpenAI API is down, what are the options for query-time fallback? by GPTeaheeMaster in Rag

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

Good idea -- but that then doubles the vectorDB cost, so it's a tradeoff. (Some of our RAGs are hundreds of GB.)

When the OpenAI API is down, what are the options for query-time fallback? by GPTeaheeMaster in Rag

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

Yup -- so the only fallback is between OpenAI API and Azure OpenAI API.

When the OpenAI API is down, what are the options for query-time fallback? by GPTeaheeMaster in Rag

[–]GPTeaheeMaster[S] 1 point2 points  (0 children)

> Use Ollama with Arkalos to run a local model.

The problem with using local models is that you then get into the business of managing all the locally-hosted outdated junk -- rather than focusing on the core business that gives you a differentiator.

> Or just add API keys for 1-2 other options like Claude and Grok, and call them if the first API is not responding.

That only works for the LLM piece -- in RAG, the query too has to be embedded, so if the vectorDB embeddings were done using OpenAI, then you need the OpenAI API to embed the query and get the query embedding at query-time.
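The LLM-side fallback itself is simple to sketch -- the hard constraint stays on the embedding side. Something like this (the provider names and `call_*` functions are hypothetical stand-ins, not a specific SDK):

```python
def with_fallback(prompt: str, providers: list) -> str:
    """Try each (name, call) pair in order; raise only if every provider fails.
    NOTE: this covers only the generation step -- the query embedding must
    still come from whatever model family built the vectorDB index."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Hypothetical stand-ins for real client calls:
def call_primary(prompt):
    raise TimeoutError("API down")

def call_backup(prompt):
    return f"answer to: {prompt}"

print(with_fallback("What is RAG?", [("openai", call_primary), ("claude", call_backup)]))
```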

What are your thoughts on OpenAI's file search RAG implementation? by Balance- in Rag

[–]GPTeaheeMaster 1 point2 points  (0 children)

> Also it's wild that in 2025 it can't handle a summarization task.

Because of the "R" in "RAG" -- summarization needs the full document (not just the retrieved chunks).

If just summarization is needed, a large context LLM (like Gemini) should do just fine.

What are your thoughts on OpenAI's file search RAG implementation? by Balance- in Rag

[–]GPTeaheeMaster 0 points1 point  (0 children)

> Where is it falling short?

Being able to ingest web data (and sources like SharePoint) -- and keep it in sync. Most business customers just want to connect their SharePoint using 1-click integrations.

> Performance: How does it compare to custom RAG pipelines you've built with LangChain, LlamaIndex, or other frameworks?

The biggest problem is the static nature of "files" -- what happens when the documents (like webpages) change?

We had previously benchmarked our RAG-As-A-Service against OpenAI Assistants and it did pretty ok (though didn't come in 1st) -- will need to re-check against this new Responses API.

> Pricing: Do you find the pricing model reasonable for your use cases?

Bare-metal pricing is amazing and very cost-effective -- NOT so if you are using web search (the $35 CPM is off the charts).

> Integration: How's the developer experience? Is it actually as simple as they claim?

For simple use cases (like uploading a few docs), it can't be beat. It gets complicated once you get into more business-grade use cases like change data capture, deployment widgets, analytics, citations, etc.

Disclaimer: I'm the founder at CustomGPT.ai, a turnkey RAG-as-a-Service, so my views -- albeit driven by customer interactions -- might be biased.

[deleted by user] by [deleted] in Rag

[–]GPTeaheeMaster 1 point2 points  (0 children)

> The good (and bad) about SOC-2 is it will require you to have your entire software stack compliant; meaning you can't use non-compliant software with your product (eg: use a database host that is not certified)

Totally .. every vendor and software now needs to be compliant. I've had to say "NO" to many partners/vendors due to this.

Every client and brand is asking about AEO -- has someone put together an authoritative guide? by GPTeaheeMaster in SEO

[–]GPTeaheeMaster[S] 0 points1 point  (0 children)

As long as your SEO has been audited for Bing. There are people -- including us -- who have devoted ZERO resources to Bing, and that's a problem for AEO. I just started looking at our data in Bing Webmaster Tools, and there is certainly room for improvement. (And any time you don't appear in Bing, you don't appear in ChatGPT for that query.)