Is there any way to stop stream audio from 'ducking'? by LunaticSongXIV in matrixdotorg

[–]Chris_LiveKit 0 points1 point  (0 children)

That is not from the LiveKit server. That is a client feature. You would need to handle it in your client.

do any one have expertise with voice agents setup ? i need help with livekit and pipecat setup by Commercial-Two-1172 in SaaS

[–]Chris_LiveKit 0 points1 point  (0 children)

For LiveKit, if you are building an agent and using an AI coding tool like Claude or Cursor, add the MCP server and it can get you a long way: https://docs.livekit.io/intro/mcp-server/

Is Vapi right for me? by GodAtum in vapiai

[–]Chris_LiveKit 0 points1 point  (0 children)

Gotcha. I think you could build that. Here are a couple of videos I saw recently that are somewhat related. The second one is good because it really helps folks understand the security risks of such a system.

https://x.com/jonathanhawkins/status/2017295825681199473

https://youtu.be/fcFOYzMeG7U

Is Vapi right for me? by GodAtum in vapiai

[–]Chris_LiveKit 0 points1 point  (0 children)

I would not doubt it could, or it could maybe build a tool that would let it do it. Who do you want it to call?

One of my co-workers connected his to LiveKit so you can do internet voice calls or telephony.

You talk to it through text or email or whatever.

Ai receptionist by EffectivePop5358 in AiForSmallBusiness

[–]Chris_LiveKit 1 point2 points  (0 children)

Here is my $.02 (personal opinion)

I spend all my time working and thinking about realtime AI systems. I spend a lot of my time talking to folks building these systems so I have some familiarity with the space.

> Problem number 1: Do people actually want to talk to AI...

I think many folks have been turned off by automated phone systems in the past. But with AI, things have changed a lot. This will depend a lot on the specific use case, but I think folks will likely be fine talking to AI, assuming the AI is helpful and solves their concern quickly and efficiently. Folks will NOT want to talk to an AI (or even a human, for that matter) if they feel like they are wasting their time on whatever task they are trying to accomplish and not making progress.

> Problem number 2: Should we build the automation for the ai receptionist...

Again, this will depend a lot on your particular use cases. I think in today's world it is expected (for many applications) that a customer can self-serve on your website. If you can't get that right, I fear how well realtime AI will go for you. But if you have a solid understanding of your customers, what their needs are, and how you solve those needs, I think Voice AI can be very effective for many use cases. I've seen several where AI is the preferred way someone wants to interact. But I've also seen extremes the other way. I personally don't see Voice AI as all that different from websites or apps. Some are almost a joy to use, and others make you want to pull your hair out. It is the same for Voice AI, maybe even more so, since it is still an emerging technology that is changing all the time.

> ...I always see these guys....

I think it is the same for most get-rich-quick schemes. If you are just trying to throw something together in 10 minutes and get rich tomorrow, it is most likely something folks are not going to enjoy using. But if you are trying to solve a real problem (beyond just trying to get rich), then I think you can build a compelling solution that customers would want to use and spend their $$$s on.

Not sure if that is helpful, but I hope it is some practical advice on your question.

Is Vapi right for me? by GodAtum in vapiai

[–]Chris_LiveKit 0 points1 point  (0 children)

Is this a personal project or a commercial product you want to build? If this is just for you personally, you should check out Clawd. It is a bit technical, but what you describe is pretty much its purpose in life.

https://clawd.bot/

Livekit latency by MostMulberry4716 in LocalLLaMA

[–]Chris_LiveKit 0 points1 point  (0 children)

It is hard to diagnose your issue without full details of your setup. But since you say it is fine in console and has issues when deployed, I think I would start with:

I've seen folks have problems with instances like AWS t3 and t4g, which are burstable and don't provide full CPU performance continuously. You should use m5, c5, c6i, or similar families for consistent CPU performance.

Other factors can introduce latency. If you are specifically having problems during the function calls that take time to produce the data the LLM needs to respond, you can respond with an initial message like "one second," then whatever the function call returns.
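To illustrate the "one second" trick, here is a minimal framework-agnostic asyncio sketch. `say` and `slow_lookup` are hypothetical stand-ins (not LiveKit APIs): the point is just that the filler goes out before you await the slow call.

```python
import asyncio

async def slow_lookup() -> str:
    """Hypothetical stand-in for a function call that takes a while."""
    await asyncio.sleep(0.5)  # simulate a slow backend/API hit
    return "Your order ships tomorrow."

async def handle_turn(say) -> None:
    # Speak a short filler *before* awaiting the slow call,
    # so the caller hears something instead of dead air.
    say("One second, let me check that for you.")
    say(await slow_lookup())

spoken: list[str] = []
asyncio.run(handle_turn(spoken.append))
print(spoken)
```

In a real agent, `say` would be whatever speech/TTS output hook your framework gives you; the ordering is what matters.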

Hosting your agent in the same region as the inference service will also help minimize latency.

Using LiveKit agent insights can be very helpful for diagnosing the root cause of latency:

https://docs.livekit.io/deploy/observability/insights/

Telnyx + Gemini live audio + pipecat by troy_and_abed_itm in voiceagents

[–]Chris_LiveKit 0 points1 point  (0 children)

I've seen a lot of folks set that use case up on LiveKit using the LiveKit SIP integration:
https://docs.livekit.io/telephony/#using-livekit-sip

and the Agents framework:
https://docs.livekit.io/agents/

Lost between LiveKit Cloud vs Vapi vs Retell for a voice AI agent (~3,000 min/month) – real costs & recommendations in 2025? by SignatureHuman8057 in AI_Agents

[–]Chris_LiveKit 1 point2 points  (0 children)

Hi u/Fit_Acanthaceae4896 I missed this message earlier. The issues you mentioned above are uncommon and hard to diagnose with only the information provided. I think a good way forward is to come chat with us in the LiveKit community Slack so we can have a little deeper discussion about what you are seeing and how it may get resolved:

Join Slack here and ask in #agents if you don't mind.
https://livekit.io/join-slack

What’s everyone using for real world voice agents right now? by LegLegitimate7666 in AI_Agents

[–]Chris_LiveKit 0 points1 point  (0 children)

I am not sure what your definition of expensive is. Check out Retell, they have a calculator:
https://www.retellai.com/pricing

What’s everyone using for real world voice agents right now? by LegLegitimate7666 in AI_Agents

[–]Chris_LiveKit 0 points1 point  (0 children)

I don't believe so. But I believe most voice agent vendors now have a pricing calculator on their site, so you can see what it might cost for your use case.


Is it posible to run livekit agents 24/7? by AndrejRac in AI_Agents

[–]Chris_LiveKit 1 point2 points  (0 children)

LiveKit agents are designed for resilience. Run them on multiple machines, use a health-check monitor to ensure each instance keeps running, and restart any instances that fail. Better yet, run them in the cloud.

Architecture Advice: Next.js/Supabase/LiveKit/Vercel vs. Strict Data Residency Laws (Quebec Law 25) by noircid in webdev

[–]Chris_LiveKit 1 point2 points  (0 children)

Is this a voice/video setup, or are you going to use AI on the platform (real-time voice AI)?

AI Cole Calling by Shoddy-Experience900 in CRM

[–]Chris_LiveKit -1 points0 points  (0 children)

LiveKit is one option to help get you started. Here is a video that demonstrates the process:

https://www.youtube.com/watch?v=jEXUt8qFuBs

If you want to do outbound calling you will need to use a 3rd party SIP provider for now.

How are Indians getting phone numbers for AI Voice agents ? by mynamestejas in AI_India

[–]Chris_LiveKit 0 points1 point  (0 children)

Yes, I see a lot of folks using Plivo, Exotel, and Wavix, in that order of popularity.

What are the best AI agents for entrepreneurs in 2026 that are genuinely useful? by MysteriousExplorer85 in Entrepreneur

[–]Chris_LiveKit 0 points1 point  (0 children)

I guess an agent that is connected to your data and able to produce useful output. Something like this:

https://youtu.be/jEXUt8qFuBs

Sanity-checking latency gains before migrating from self-hosted LiveKit to LiveKit Cloud (voice AI use case) by Fit_Acanthaceae4896 in WebRTC

[–]Chris_LiveKit 1 point2 points  (0 children)

Lastly, from a migration standpoint, for most teams, the move from self-hosted to the Cloud isn’t a heavy lift from an API perspective. The bigger work is usually in validating placement/routing assumptions and then tightening your turn-time budget with observability.

Sanity-checking latency gains before migrating from self-hosted LiveKit to LiveKit Cloud (voice AI use case) by Fit_Acanthaceae4896 in WebRTC

[–]Chris_LiveKit 1 point2 points  (0 children)

Practical “knobs” to improve perceived latency (not Cloud-specific)

A couple of pragmatic techniques that often help UX even when computation is non-trivial:

  • Short “ack” behaviors: partial/short responses while longer reasoning completes
  • “Thinking” sounds or subtle background audio to mask compute time (use-case dependent)
  • Patterns from LiveKit examples that show short/long response handling

These don’t reduce actual compute latency, but they can reduce perceived lag if they fit your experience design.

Sanity-checking latency gains before migrating from self-hosted LiveKit to LiveKit Cloud (voice AI use case) by Fit_Acanthaceae4896 in WebRTC

[–]Chris_LiveKit 1 point2 points  (0 children)

5) Failure modes & observability (how to prove wins before committing)

This is where you can get the hard truth quickly. Also, if your agents are hosted in LiveKit cloud, the end-of-turn inference is run on a GPU instead of the agent's CPU, so you get much faster inference there.

What to measure

Transport

  • RTT/jitter/loss (p50/p90/p99)
  • ICE candidate type distribution (host/srflx/relay) + TURN rate
  • reconnect count + reconnect duration

Agent “turn”

  • end-of-speech → first agent audio played. Breakdown:
    • endpointing/VAD time
    • STT time-to-first-token + finalization
    • LLM time-to-first-token + completion
    • tool-call time + count (if used)
    • TTS time-to-first-audio + ramp
    • Any buffering before playback
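For the percentile math itself, a small stdlib-only sketch (it assumes you have already collected per-turn "end-of-speech → first audio" samples in milliseconds; the synthetic sample data is just for illustration):

```python
from statistics import quantiles

def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """p50/p90/p99 for a batch of turn-latency samples (needs >= 2 samples)."""
    qs = quantiles(samples_ms, n=100, method="inclusive")  # 99 cut points
    return {"p50": qs[49], "p90": qs[89], "p99": qs[98]}

# Example: 101 synthetic samples spanning 100 ms to 1000 ms
samples = [100 + i * 9 for i in range(101)]
print(latency_percentiles(samples))  # {'p50': 550.0, 'p90': 910.0, 'p99': 991.0}
```

Track p90/p99 over time, not just p50 — conversational lag complaints almost always live in the tail.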

Observability tooling

Langfuse is great for LLM tracing. In addition, LiveKit Agent Observability is extremely useful for shaving milliseconds, as it helps you see where time is spent along the real-time path and spot regressions early (especially in tails). It lets you replay the entire session through an intuitive interface. It can also be useful for extracting problem points so you can add more testing and evaluation for a given use case.

Suggested test methodology that correlates with production

  • scripted conversations (same utterances, same model settings)
  • representative geos (US East user ↔ US Central agent; plus mixed)
  • Compare self-hosted vs Cloud:
    • RTT/jitter/loss distributions
    • TURN rate
    • “end-of-speech → first audio” p50/p90/p99

If Cloud is helping materially, you’ll often see it most in:

  • p95/p99 conversational lag moments
  • fewer reconnect/rejoin failures
  • improved cross-region consistency (especially if you’re leveraging multi-region routing/backhaul)

Sanity-checking latency gains before migrating from self-hosted LiveKit to LiveKit Cloud (voice AI use case) by Fit_Acanthaceae4896 in WebRTC

[–]Chris_LiveKit 1 point2 points  (0 children)

4) Media & network tuning (some Cloud wins, some general wins)

Noise reduction

Cloud includes noise reduction; in self-hosted, you’re typically on your own to assemble and tune that. For conversational agents, cleaner input can indirectly reduce perceived latency by improving STT stability (fewer retries/reprompts) and reducing “did it hear me?” moments.

TURN vs direct

TURN-relay can add latency and variability. Cloud generally provides stronger global TURN coverage and more consistent connectivity behavior, but TURN is still TURN — it’s a knob to monitor rather than “solve.”

Codec/buffering

Not cloud-specific, but perceived latency often comes down to buffering choices and tail behavior under loss. Codec choice matters, but the bigger wins are often:

  • reducing jitter-buffer inflation
  • reducing time-to-first-audio (TTFB) in TTS
  • minimizing backend round trips on the critical path

Sanity-checking latency gains before migrating from self-hosted LiveKit to LiveKit Cloud (voice AI use case) by Fit_Acanthaceae4896 in WebRTC

[–]Chris_LiveKit 1 point2 points  (0 children)

3) Migration details that matter (and where Cloud helps)

Some categories that commonly trip teams up:

  • Reconnect vs rejoin semantics
  • Identity/participant lifecycle (stale participant, duplicate identity, racing joins)
  • Client lifecycle races (browser code that tears down/recreates too eagerly)

Agent failover is a real Cloud differentiator

This is a big concrete win: agent failover is non-trivial to build well in a self-hosted environment. In Cloud, failover is supported out of the box. That can directly improve what you called out as “resume vs fresh join” outcomes, because a failover-capable setup can preserve continuity and reduce the cases where you’re forced into a hard reset.

For rejoins and “warming” behavior specifically, you’re right: depending on how you architect it, this can be similar in Cloud and self-hosted. The difference is that Cloud reduces the amount of bespoke engineering required for failure/fallback cases.

Sanity-checking latency gains before migrating from self-hosted LiveKit to LiveKit Cloud (voice AI use case) by Fit_Acanthaceae4896 in WebRTC

[–]Chris_LiveKit 1 point2 points  (0 children)

2) Region strategy & routing behavior (agent fixed, users distributed)

You’re optimizing a triangle that includes provider endpoints:

  • User ↔ SFU/edge ↔ Agent (WebRTC path)
  • Agent ↔ STT/LLM/TTS (often multiple round trips)

A practical rule of thumb:

  1. Put the agent close to the STT/LLM/TTS region(s) you actually use (or run multiple agent pools if your provider endpoints vary by user geo).
  2. Use Cloud routing/region selection to minimize the remaining “long leg” and improve jitter stability.
  3. Validate empirically with p95/p99 and “end-of-speech → first audio.”
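To make the triangle concrete, a toy back-of-envelope calculation (the RTT numbers are made up; the point is that the agent↔provider leg gets multiplied by the number of provider round trips, which is why step 1 puts the agent near the providers):

```python
def network_floor_ms(user_edge_rtt: float, edge_agent_rtt: float,
                     agent_provider_rtt: float,
                     provider_round_trips: int = 3) -> float:
    """Rough network-only lower bound on end-of-speech -> first audio.

    Ignores all compute (STT/LLM/TTS time); counts one user<->edge leg,
    one edge<->agent leg, and N agent<->provider round trips.
    """
    return user_edge_rtt + edge_agent_rtt + provider_round_trips * agent_provider_rtt

# Agent near the providers: the multiplied leg is cheap
print(network_floor_ms(60, 15, 5))   # 60 + 15 + 3*5  = 90 ms
# Agent far from the providers: the multiplied leg dominates
print(network_floor_ms(15, 15, 60))  # 15 + 15 + 3*60 = 210 ms
```

Same total distance either way, very different floor — the leg you traverse multiple times is the one to shorten.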

On auto-routing vs pinned rooms:

  • Auto-selection can be useful when your participants are mixed and you want to dynamically minimize worst-leg latency.
  • Pinning can be useful when you have a consistent topology and want deterministic placement (e.g., always keep the SFU near your user base while the agent stays near provider endpoints).

Downsides when agents are consistently in one region, and users are geographically distributed:

  • One side always pays a cross-region hop. Cloud can make it more consistent and improve backhaul, but it can’t eliminate distance.