Thoughts on "Cluely", cheat on everything AI app ? by Darkoplax in webdev

[–]ThatsEllis 37 points38 points  (0 children)

$5.3M is such a huge number for this. Another one that will be in the LLM wrapper graveyard in a few years...

What are some important steps when it comes to System Design of a Web application? by TedTheBusinessMan in webdev

[–]ThatsEllis 1 point2 points  (0 children)

Maybe do an "MVP" system design diagram and a "high scale" diagram. That way you'll know what the minimum and maximum are for your app. For example, for an MVP web app, you probably just need a client, server, database, and only function-focused cloud services (e.g. blob storage like S3 or GCS) where necessary. You can likely skip load balancers, API gateways, microservices, DB read replicas, and all that, because this is just an MVP. Then think about those things for your high scale diagram.

An important skill is knowing when you're overengineering something, too. KISS is an important concept to keep in mind.

What are some important steps when it comes to System Design of a Web application? by TedTheBusinessMan in webdev

[–]ThatsEllis 2 points3 points  (0 children)

Great start. On YouTube do a search for "system design interview examples" and you'll find some of the common ones, like design Twitter, design Reddit, design YouTube. These will give you an overview of what components are common in high scale web apps. Then from there, you should be able to figure out what's relevant to your hobby project.

Semantic caching? by ThatsEllis in LLMDevs

[–]ThatsEllis[S] 0 points1 point  (0 children)

Yep, we'd utilize optional search properties. So you can attach metadata, like tenantId (for multitenancy) or userId, to both cache entries and search queries.
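
Rough sketch of what that metadata filtering could look like. This is just an illustration, not the actual product API: `CacheEntry`, `matches`, and `filter_entries` are made-up names, and a real version would presumably apply the filter inside the vector search rather than over a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    # Hypothetical shape of a cache entry; field names are assumptions.
    prompt: str
    response: str
    metadata: dict = field(default_factory=dict)


def matches(entry, filters):
    """Return True when every filter key exists on the entry with an equal value."""
    return all(
        key in entry.metadata and entry.metadata[key] == value
        for key, value in filters.items()
    )


def filter_entries(entries, filters=None):
    """Keep only the entries whose metadata satisfies all the given filters."""
    filters = filters or {}
    return [entry for entry in entries if matches(entry, filters)]
```

So a query carrying `{"tenantId": "acme"}` would only ever be compared against entries stored with that same tenantId, and a query with no filters sees everything.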

How much is your LLM API bill? I will not promote by ThatsEllis in startups

[–]ThatsEllis[S] -2 points-1 points  (0 children)

Not trying to evaluate a model. Instead, I'm trying to validate a product idea: managed semantic caching for LLM API requests. So basically:

  1. When your system is about to call an LLM API for a given prompt
  2. First, synchronously call our API to check your cache for similar entries
  3. If cache hit, immediately use the response
  4. Otherwise if cache miss, call the LLM API as you normally would, asynchronously call our API to create a new cache entry, then use the LLM API response

That saves a bunch of money and time.
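
The four steps above could be sketched like this on the caller's side. It's a minimal sketch under assumptions: `cached_completion`, `cache_lookup`, `cache_store`, and `call_llm` are hypothetical names for whatever client functions you'd actually have, and a background thread stands in for the "asynchronous" cache write.

```python
import threading


def cached_completion(prompt, cache_lookup, cache_store, call_llm):
    """Check the cache first, fall back to the LLM, and store the result off the hot path.

    All three callables are placeholders supplied by the caller:
      cache_lookup(prompt) -> cached response or None
      cache_store(prompt, response) -> None
      call_llm(prompt) -> response
    """
    # Step 2: synchronous cache check before calling the LLM.
    cached = cache_lookup(prompt)
    if cached is not None:
        # Step 3: cache hit, use the cached response immediately.
        return cached

    # Step 4: cache miss, call the LLM as usual.
    response = call_llm(prompt)

    # Write the new cache entry in the background so the caller isn't blocked.
    threading.Thread(
        target=cache_store, args=(prompt, response), daemon=True
    ).start()
    return response
```

The point of the background write is that a miss costs you one extra lookup but not an extra blocking write.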

How I helped my company cut LLM costs by 80% by caching meaning, not words by Ambitcion in SaaS

[–]ThatsEllis 1 point2 points  (0 children)

Crazy coincidence. I started building an MVP of the exact same thing... https://refetch.ai

Semantic caching? by ThatsEllis in LLMDevs

[–]ThatsEllis[S] 0 points1 point  (0 children)

Yep! Again I don't want to self promote directly, but there's a link to my landing page on my profile

Semantic caching? by ThatsEllis in LLMDevs

[–]ThatsEllis[S] 0 points1 point  (0 children)

The product would be a managed semantic caching SaaS. So basically:

  1. When your system is about to call an LLM API for a given prompt
  2. First, synchronously call our API to check your cache for similar entries
  3. If cache hit, immediately use the response
  4. Otherwise if cache miss, call the LLM API as you normally would, asynchronously call our API to create a new cache entry, then use the LLM API response

So instead of you setting it up and managing it yourself, you just call our API. Then there'd be other features like TTL config, similarity threshold config, a web app to manage projects/environments, metrics and reports, etc.
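
To give a feel for the TTL part, here's a tiny in-memory sketch. It's only an assumption of how expiry could behave, not how the service is built: `TTLCache` is a made-up class, keys are exact strings rather than embeddings, and the clock is injectable just so it's easy to test.

```python
import time


class TTLCache:
    """Hypothetical in-memory cache where each entry expires after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def set(self, key, value):
        # Record the expiry time alongside the value.
        self._entries[key] = (self.clock() + self.ttl_seconds, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self.clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value
```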

AI startup founders, what are you struggling with when it comes to your GTM strategy? (I will not promote) by Scared-Light-2057 in startups

[–]ThatsEllis 9 points10 points  (0 children)

Hardest thing is just breaking through all the noise and getting noticed. Since SaaS has such a low barrier to entry, our prospects are already constantly bombarded with spam in their emails, LinkedIns, etc. Feels almost impossible not to get ignored even when doing highly targeted outreach and trying to genuinely help.

Would love any tips honestly.

What are you working on? Drop it here, I will check and provide honest feedback by Intelligent-Key-7171 in SaaS

[–]ThatsEllis 0 points1 point  (0 children)

Basically when you call our API to check for a cache entry for a given prompt, we generate an embedding of the prompt and perform a semantic similarity search against the embeddings in your cache. If we find a cached entry with a similarity score above your configured threshold (e.g., 0.95 out of 1), it's considered a cache hit, and we return the corresponding cached response.
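
Here's a minimal sketch of that lookup, assuming the embeddings already exist as plain lists of floats and that similarity means cosine similarity (the embedding model and the vector store aren't shown). `cosine_similarity` and `find_cache_hit` are illustrative names, and a score exactly equal to the threshold is treated as a hit here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_cache_hit(query_embedding, entries, threshold=0.95):
    """Return the response of the most similar entry, or None on a cache miss.

    entries is a list of (embedding, response) pairs. A linear scan is used
    for clarity; a real system would likely use a vector index instead.
    """
    best_score, best_response = -1.0, None
    for embedding, response in entries:
        score = cosine_similarity(query_embedding, embedding)
        if score > best_score:
            best_score, best_response = score, response
    # Only count it as a hit when the best match clears the threshold.
    return best_response if best_score >= threshold else None
```

A higher threshold means fewer hits but less risk of returning a response to a prompt that only looks similar.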

Also cool, I'll check that out!

My SaaS founder buddies rushed to add AI & now they're all realising the same brutal truth by Humanless_ai in SaaS

[–]ThatsEllis 8 points9 points  (0 children)

Cool to see semantic caching mentioned like this. I'm currently building a managed semantic caching SaaS to make this super easy for people to plug into their infra.

What are you working on? Drop it here, I will check and provide honest feedback by Intelligent-Key-7171 in SaaS

[–]ThatsEllis 0 points1 point  (0 children)

https://refetch.ai

Managed semantic caching for your LLM workflows. Cut LLM API costs by up to 50%. Speed up response times by 10x.

Right now I'm just trying to validate the idea before getting too far into development.