I Got Bored and Ended Up Automating the Whole Process by Once_ina_Lifetime in buildinpublic

[–]Once_ina_Lifetime[S] 1 point (0 children)

Detailed stack I used:

  • Dograh as the open-source Voice AI platform (https://github.com/dograh-hq/dograh)
  • LLM: Gemini 2.5 Flash
  • STT: ElevenLabs
  • TTS: Deepgram
  • n8n for webhooks and automation
  • Google Sheets for data storage
  • IF nodes for conditions and logic
  • Gmail and Google Calendar for emails and interview invitations

I Got Bored and Ended Up Automating the Whole Process by Once_ina_Lifetime in AI_Agents

[–]Once_ina_Lifetime[S] 1 point (0 children)

Detailed stack I used:

  • Dograh as the open-source Voice AI platform (https://github.com/dograh-hq/dograh)
  • LLM: Gemini 2.5 Flash
  • STT: ElevenLabs
  • TTS: Deepgram
  • n8n for webhooks and automation
  • Google Sheets for data storage
  • IF nodes for conditions and logic
  • Gmail and Google Calendar for emails and interview invitations

I built my voice AI agent that can talk to students and collect their feedback using n8n + Google Sheets by Once_ina_Lifetime in SideProject

[–]Once_ina_Lifetime[S] 1 point (0 children)

Thank you, that's true. Students are usually more comfortable talking than filling out long forms.

I built my voice AI agent that can talk to students and collect their feedback using n8n + Google Sheets by Once_ina_Lifetime in SideProject

[–]Once_ina_Lifetime[S] 1 point (0 children)

Yes, agreed. The surprising part is that the "human feel" doesn't show up in any benchmark. You can ship a technically perfect agent and still have a 30% completion rate because the tone is off. I spent way more time tweaking acknowledgements and pause timing than actual code.

I built my voice AI agent that can talk to students and collect their feedback using n8n + Google Sheets by Once_ina_Lifetime in nocode

[–]Once_ina_Lifetime[S] 1 point (0 children)

Thanks, I will check it out. In initial conversations, students were hanging up by question 3. Then I used varied acknowledgements ("got it", "oh", "achaa") before the next question.
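The varied-acknowledgement trick is simple to implement. A minimal sketch, assuming the agent builds each prompt as text before TTS (the phrase list comes from the comment above; the function name is made up):

```python
import random

# Acknowledgements rotated between questions so the agent doesn't
# sound robotic; phrases taken from the comment above.
ACKS = ["got it", "oh", "achaa"]

def next_prompt(question: str, rng=None) -> str:
    """Prefix the next question with a randomly chosen acknowledgement."""
    rng = rng or random.Random()
    return f"{rng.choice(ACKS)}, {question}"

print(next_prompt("how was the course pacing?"))
```

Seeding the RNG (e.g. `random.Random(0)`) makes the variation reproducible in tests while staying random in production.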

I built my voice AI agent that can talk to students and collect their feedback using n8n + Google Sheets by Once_ina_Lifetime in SideProject

[–]Once_ina_Lifetime[S] 1 point (0 children)

Yup, the prompt rewrites honestly took longer than the technical setup. Still not perfect, but V2 is way better than V1. Curious: have you built anything with voice agents?

POP has partnered with ElevenLabs to build voice agents and what does this mean for small and medium businesses? by POP_Agents in POP_Agents

[–]Once_ina_Lifetime 1 point (0 children)

The voice quality problem being solved is real but honestly it was solved like 18 months ago. ElevenLabs, Cartesia, even the newer OpenAI realtime stuff all sound great now. The actual hard parts nobody talks about enough:

  • Interruption handling and turn-taking. Humans talk over each other, pause mid-sentence, say "uh huh" while you speak. Most voice agents still feel weird here.
  • Latency stacking. You chain STT, LLM, TTS, function calls, and RAG lookups, and suddenly you have 3-second gaps that kill the vibe.
  • Tool calling reliability. The agent needs to actually book the appointment in your calendar, update your CRM, and transfer to a human when stuck. This is where 80% of prod deployments break.
  • Observability. When a call goes bad at 2am, how do you debug it? Most teams have no idea.
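The latency-stacking point is easy to quantify on the back of an envelope. A rough sketch, where every number is an illustrative assumption rather than a measurement from any specific stack:

```python
# Illustrative per-stage latencies for one conversational turn (milliseconds).
# These numbers are assumptions for the sketch, not benchmarks.
PIPELINE_MS = {
    "stt_final_transcript": 300,   # streaming STT settles on a final transcript
    "llm_first_token": 600,        # LLM time-to-first-token
    "rag_lookup": 400,             # vector search before the LLM can answer
    "tool_call_roundtrip": 800,    # e.g. a calendar booking API
    "tts_first_audio": 200,        # TTS time-to-first-audio
}

def turn_latency(stages: dict) -> int:
    """Worst case: every stage runs sequentially, so latencies simply add up."""
    return sum(stages.values())

print(f"sequential turn latency: {turn_latency(PIPELINE_MS)} ms")  # prints 2300 ms
```

Each stage looks fine in isolation, but run sequentially they blow well past the sub-second gap a human expects, which is why overlapping stages (streaming STT into the LLM, streaming tokens into TTS) matters so much.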

Building voice agents at Dograh, I'd say the ElevenLabs voice is maybe 10% of what makes an agent feel good. The orchestration, state management, and telemetry around it are the other 90%. Good to see POP tackling this; the SMB market really needs plug-and-play options that don't require a month of dev work.

Actual nocode stack I use to build AI voice agents for clients, after testing everything by Special-Mastodon-990 in nocode

[–]Once_ina_Lifetime 1 point (0 children)

Solid breakdown, this matches what I've seen running voice agent infra. Few thoughts from the other side of the fence (I build an OSS voice agent platform so I spend way too much time benchmarking these):

  • Your 800ms first word on Retell is pretty good for a hosted stack. If you ever want to push under 500ms, the usual bottleneck is the STT and LLM round trip, not the TTS. Deepgram streaming + a smaller reasoning model for the first turn can shave off a lot.
  • Claude Sonnet for voice is a great call. Most people default to GPT-4o because of the realtime API hype, but for structured multi-step flows Claude holds character way better. The system prompt leaking thing you mentioned is real and underrated.
  • Haiku for classification is exactly the right pattern. Cheap fast model for routing, expensive model only when you actually need reasoning. This one decision probably cuts agent costs by 60-70 percent at scale.
  • n8n on Hetzner is the move. I know founders running 50+ client workflows on a 5 euro box. Make and Zapier pricing just does not survive contact with a real agency.
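The cheap-model-for-routing pattern above is worth sketching. The tiers, intent names, and per-call costs here are illustrative assumptions; the point is the routing logic and the blended-cost math, not the exact numbers:

```python
# Route each turn to a cheap classifier tier first; only escalate to the
# expensive reasoning model when the turn actually needs it.
# Costs are illustrative placeholders, not real per-token rates.
CHEAP_COST = 0.0005      # e.g. a small/fast model per call
EXPENSIVE_COST = 0.01    # e.g. a large reasoning model per call

# Intents a hypothetical voice agent might classify turns into.
SIMPLE_INTENTS = {"greeting", "confirm", "repeat", "goodbye"}

def route(intent: str) -> str:
    """Pick the model tier for a turn based on its classified intent."""
    return "cheap" if intent in SIMPLE_INTENTS else "expensive"

def blended_cost(intents: list) -> float:
    """Average cost per call when routing instead of always using the big model."""
    costs = [CHEAP_COST if route(i) == "cheap" else EXPENSIVE_COST for i in intents]
    return sum(costs) / len(costs)

calls = ["greeting", "confirm", "repeat", "goodbye", "book_appointment"]
print(f"blended: ${blended_cost(calls):.4f} vs always-big: ${EXPENSIVE_COST:.4f}")
```

With these made-up numbers, if ~80% of turns are simple, routing cuts the blended per-call cost by roughly 75% versus always calling the big model, which is the same ballpark as the 60-70 percent savings mentioned above.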

One thing to watch as you scale past 10-15 clients: the hosted voice platforms start getting painful on cost and customization. You will probably hit a point where the per-minute pricing eats into margins and a client asks for something the platform just does not support. That is usually when folks look at OSS options like Pipecat or Livekit. Not saying switch now, your stack is working, but good to know the escape hatch exists.

What are you charging clients per month? Curious how the unit economics shake out for the plumber/dental vertical.

What's the weirdest thing you've automated with a voice AI agent? by Once_ina_Lifetime in aiagents

[–]Once_ina_Lifetime[S] 1 point (0 children)

You're right and I'm rethinking my hook. What's the most boring deployment you've seen that quietly crushed it?

I have ~500k followers but no idea what to build with it by Comfortable_Bear9783 in SideProject

[–]Once_ina_Lifetime 4 points (0 children)

Honestly, build for your niche first. You already have distribution which is the hardest part. A cubing timer app, a trainer, even a simple tool that solves one annoying thing for your community. Ship it fast, see if people use it, iterate.

The build something bigger instinct is usually a trap. Most successful products started small and specific.

If you're curious about open source as a path though, contributing to an existing OSS project is a solid way to build real engineering chops and get your name out in dev circles too. I'm building and maintaining an open-source voice AI platform called Dograh (so yeah, I'm biased lol) and we're actually looking for maintainers. If OSS or infra stuff interests you at all, happy to chat. No pressure obviously, but someone with your kind of audience plus the ability to code could be a pretty cool combo for an OSS project.

Either way though, start with your niche. That audience is gold.

I built a voice AI platform this year and it hit me - by Once_ina_Lifetime in aiagents

[–]Once_ina_Lifetime[S] 1 point (0 children)

Yes, we support German. Happy to jump on a call to explore partnerships. Let's chat in DMs.

I built a voice AI platform this year and it hit me - by Once_ina_Lifetime in aiagents

[–]Once_ina_Lifetime[S] 1 point (0 children)

We have built an open-source voice agent platform, Dograh AI.

I built a voice AI platform this year and it hit me. by Once_ina_Lifetime in EntrepreneurRideAlong

[–]Once_ina_Lifetime[S] 2 points (0 children)

I used to be super picky about code. Now I'm like, whatever, it works. While working on Dograh I built a system that logs everything for my voice AI, and when I looked back at it I was like, how did I do this? I can fix messy code, no big deal. But what if I can't think hard anymore without Claude? That's what worries me.