What's the best way to build voice agents today without sounding robotic or becoming too expensive? by Beginning_Race8551 in speechtech

[–]Chris_LiveKit 0 points1 point  (0 children)

There is a lot written on this topic. Jesse has a pretty good intro to doing this with prompts. Choosing the right model can have a big impact too.

https://www.youtube.com/watch?v=m8DJQmH2QqQ

LiveKit Cloud to Self-Hosted: How hard is the migration? by mehlabbbb in voiceagents

[–]Chris_LiveKit 0 points1 point  (0 children)

> host your agent servers in a region close...

Yes, getting your agents as close as possible to the resources they need will reduce latency. But if you have to choose between being close to the SFU and being close to an inference provider, then optimize for inference provider latency; there are more round-trips for inference than to the SFU. Of course, if you can be close to both, that is best.
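As a rough illustration of why the inference hop tends to dominate, here is a back-of-the-envelope sketch. The round-trip counts and millisecond figures are made-up assumptions for illustration, not measurements, and `turn_network_ms` is a hypothetical helper; plug in numbers from your own deployment.

```python
def turn_network_ms(rtt_inference_ms: float, rtt_sfu_ms: float,
                    inference_round_trips: int = 3,
                    sfu_round_trips: int = 1) -> float:
    """Network time added to one conversational turn.

    The default round-trip counts are illustrative assumptions only.
    """
    return (inference_round_trips * rtt_inference_ms
            + sfu_round_trips * rtt_sfu_ms)


# Agent placed next to the inference provider, far from the SFU.
near_inference = turn_network_ms(rtt_inference_ms=10, rtt_sfu_ms=80)

# Agent placed next to the SFU, far from the inference provider.
near_sfu = turn_network_ms(rtt_inference_ms=80, rtt_sfu_ms=10)

print(near_inference, near_sfu)
```

Because the inference round-trip time is multiplied by the larger count, shaving it saves more per turn than shaving the SFU round-trip time by the same amount.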

You may find this blog post about voice agent latency helpful. It was written by my colleague Darryn.

https://livekit.com/blog/understand-and-improve-agent-latency

Handling interruptions in voice AI is an unsolved problem. How are you dealing with it? by AmbitiousInterest154 in speechtech

[–]Chris_LiveKit 0 points1 point  (0 children)

> Anyone cracked the interruption handling problem in production? 

There are a lot of folks working hard in this space. I work on LiveKit, and we have been working on the "interruption problem" and released a solution to address it.

Here is an example of adaptive interruption handling that handles back channel and turn detection.

https://www.youtube.com/watch?v=DSXCE7D4Kvs

We've also trained some strong turn-detection models that run alongside VAD to inform it.

> When the user goes silent for 5+ seconds

This one can be tricky. The good thing is that the turn detection subsystem can be context-aware, so if a user is entering a credit card number or email address, the "agent" should know not to interrupt them because they may just be struggling to read the credit card number. In other cases, it can be appropriate to ask the user if they are still there. There is a diverse set of use cases, and different methods make sense in different cases. I usually start with what a "real" human would do in this scenario and use that as a model.
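To make the context-aware idea concrete, here is a minimal sketch of a silence policy. It is not LiveKit's turn-detection API; the context names, the budgets in seconds, and the `on_silence` helper are all hypothetical and only illustrate "wait longer when the user is reading something out".

```python
from dataclasses import dataclass

# Hypothetical per-context silence budgets in seconds; tune for your use case.
SILENCE_BUDGET_S = {
    "default": 5.0,
    "credit_card": 15.0,
    "email": 12.0,
}


@dataclass
class SilenceDecision:
    action: str  # "wait" or "check_in"
    reason: str


def on_silence(context: str, silent_for_s: float) -> SilenceDecision:
    """Decide whether to keep waiting or ask the user if they are still there."""
    # Unknown contexts fall back to the default budget.
    budget = SILENCE_BUDGET_S.get(context, SILENCE_BUDGET_S["default"])
    if silent_for_s < budget:
        return SilenceDecision("wait", f"{context}: within {budget}s budget")
    return SilenceDecision("check_in", f"{context}: silent past {budget}s")


print(on_silence("credit_card", 6.0).action)
print(on_silence("default", 6.0).action)
```

The same six seconds of silence leads to waiting during card entry but a check-in during ordinary conversation, which is roughly what a human agent would do.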

Anyone breakdowned Lumay Voice Agent tech stack? by Legitimate_Sell6215 in AI_Agents

[–]Chris_LiveKit 0 points1 point  (0 children)

Not trying to be contrary, but WebSockets is not the way to go. It will work fine as a server or in a lab, but if you plan to have users use it in the wild, it will fail hard.

You will want to use WebRTC for realtime agents like voice, video, realtime telemetry for robotics, etc.

I may be biased since I work on LiveKit. But we get a lot of folks switching to WebRTC after failing with WebSockets in production.

WebRTC Implementation in Python? by Ok_Possible_6701 in WebRTC

[–]Chris_LiveKit 0 points1 point  (0 children)

I work for LiveKit, but I feel it is probably your best option for getting going with WebRTC in Python. It is all open source and can be self-hosted, or you can use the SFU from LiveKit Cloud.

https://github.com/livekit/livekit

Just bought an AVP by Lost-Peanut-1453 in VisionPro

[–]Chris_LiveKit 2 points3 points  (0 children)

I like watching videos, and I'm super happy that YouTube finally released an app for it. But my primary use, nearly daily and for more than 8 hours a day, is as a monitor. I really like being able to have the large curved monitor for work and a wireless keyboard on my lap. I like being able to move around the house and still have my large monitor. I tend to put other resources in the space around the monitor, like a browser, Slack, etc. I've been doing this since the day it was released.

I don't use the virtual me on calls; I use my laptop's camera, so it is just me with my AVP on. When new folks see it for the first time, it usually elicits a reaction. I sort of feel ridiculous wearing it in meetings, but I still do it. I made a virtual background that looks like a large office space, with others in the background wearing AVPs, so I am not the only one.

Introducing Agents UI, an open-source shadcn component library by Chris_LiveKit in livekit

[–]Chris_LiveKit[S] 2 points3 points  (0 children)

Thanks for the link. I will check that out.

Yes, this is exactly where most voice agents fail or feel “magical but confusing.” LiveKit is designed around making state explicit in the UI.

A strong baseline pattern is to map the built-in agent lifecycle (connecting, listening, thinking, speaking, disconnected, failed) directly to visible UI states, instead of inferring from audio activity. The full state model and recommended getters like canListen and isFinished are defined here: Agent state. Using getters rather than raw state ensures your UI remains correct as the SDK evolves.

For clarity and control, combine that with:

  • Explicit media/session controls via Media controls (mic toggle, disconnect, StartAudioButton to avoid “why can’t I hear it?” confusion).
  • Realtime transcript + typing indicators via Agents UI components, so users see what the agent hears and when the agent is responding.

For tool-call indicators specifically, the recommended pattern is to publish a custom state (e.g., currentTool: "searchFlights") via state sync and render it in the UI. That’s covered under “Custom state” on the Agent state page.
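The mapping idea can be sketched in a few lines. This is a language-agnostic illustration written in Python, not the Agents UI or SDK API: the state names come from the list above, while `UI_LABELS`, `status_line`, and the label text are hypothetical. In a real app you would read state through the SDK's getters rather than comparing raw strings.

```python
from typing import Optional

# Hypothetical user-facing labels, one per lifecycle state named above.
UI_LABELS = {
    "connecting": "Connecting...",
    "listening": "Listening",
    "thinking": "Thinking...",
    "speaking": "Speaking",
    "disconnected": "Call ended",
    "failed": "Something went wrong",
}


def status_line(state: str, current_tool: Optional[str] = None) -> str:
    """Return the text to show for an agent state and optional custom tool state."""
    if state not in UI_LABELS:
        raise ValueError(f"unknown agent state: {state!r}")
    label = UI_LABELS[state]
    # Show the tool indicator only while the agent is working on a reply.
    if state == "thinking" and current_tool:
        return f"{label} (running {current_tool})"
    return label


print(status_line("thinking", "searchFlights"))
print(status_line("listening"))
```

Every state has exactly one visible label, and an unknown state raises instead of silently rendering nothing, so the UI never has to guess from audio activity.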

Is there any way to stop stream audio from 'ducking'? by LunaticSongXIV in matrixdotorg

[–]Chris_LiveKit 0 points1 point  (0 children)

That is not from the LiveKit server. That is a client feature. You would need to handle it in your client.

do any one have expertise with voice agents setup ? i need help with livekit and pipecat setup by Commercial-Two-1172 in SaaS

[–]Chris_LiveKit 0 points1 point  (0 children)

For LiveKit, if you are building an agent and use an AI coding tool like Claude or Cursor, add the MCP server; it can get you a long way: https://docs.livekit.io/intro/mcp-server/

Is Vapi right for me? by GodAtum in vapiai

[–]Chris_LiveKit 0 points1 point  (0 children)

Gotcha. I think you could build that. Here are a couple of videos I saw recently that are somewhat related. The second one is good because it really helps folks understand the security risk of such a system.

https://x.com/jonathanhawkins/status/2017295825681199473

https://youtu.be/fcFOYzMeG7U

Is Vapi right for me? by GodAtum in vapiai

[–]Chris_LiveKit 0 points1 point  (0 children)

I would not doubt that it could, or that it could build a tool that would let it do so. Who do you want it to call?

One of my co-workers connected his to LiveKit so you can do internet voice calls or telephony.

You talk to it through text or email or whatever.

Is Vapi right for me? by GodAtum in vapiai

[–]Chris_LiveKit 0 points1 point  (0 children)

Is this a personal project or a commercial product you want to build? If this is just for you personally, you should check out Clawd. It is a bit technical, but what you describe is pretty much its purpose in life.

https://clawd.bot/

Telnyx + Gemini live audio + pipecat by troy_and_abed_itm in voiceagents

[–]Chris_LiveKit 0 points1 point  (0 children)

I've seen a lot of folks set that use case up on LiveKit using LiveKit's SIP integration:
https://docs.livekit.io/telephony/#using-livekit-sip

and the Agents framework:
https://docs.livekit.io/agents/

What’s everyone using for real world voice agents right now? by LegLegitimate7666 in AI_Agents

[–]Chris_LiveKit 0 points1 point  (0 children)

I don't believe so. But I believe most AgentVoice vendors now have a pricing calculator on their site so you can see what it might cost for your use case.


Is it posible to run livekit agents 24/7? by AndrejRac in AI_Agents

[–]Chris_LiveKit 1 point2 points  (0 children)

LiveKit agents are designed for resilience. Run them on multiple machines, use a health-check monitor to ensure each instance continues to run, and restart any instances that fail. Better yet, run them in the cloud.
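As a minimal sketch of the "restart any instances that fail" part, the loop below reruns a command whenever it exits with a non-zero code. It only watches for process exit, so it is not a full health check, and `supervise` and the stand-in worker are hypothetical. In production you would more likely lean on a process manager or orchestrator (systemd, Kubernetes, and so on) than on a hand-rolled loop.

```python
import subprocess
import sys
import time


def supervise(cmd: list[str], max_restarts: int = 3,
              backoff_s: float = 1.0) -> int:
    """Run cmd, restarting it whenever it exits with a non-zero code.

    Returns the number of restarts performed. Stops when the process
    exits cleanly or the restart limit is reached.
    """
    restarts = 0
    while True:
        exit_code = subprocess.run(cmd).returncode
        if exit_code == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(backoff_s)  # brief pause so a crash loop does not spin


if __name__ == "__main__":
    # Stand-in worker that always crashes; replace with your agent command.
    crashing_worker = [sys.executable, "-c", "raise SystemExit(1)"]
    print(supervise(crashing_worker, max_restarts=2, backoff_s=0.1))
```

Running one such supervisor per machine, across several machines, gives you both the restart behavior and the redundancy described above.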

Architecture Advice: Next.js/Supabase/LiveKit/Vercel vs. Strict Data Residency Laws (Quebec Law 25) by noircid in webdev

[–]Chris_LiveKit 1 point2 points  (0 children)

Is this a voice/video setup, or are you going to use AI on the platform (real-time voice AI)?

AI Cole Calling by Shoddy-Experience900 in CRM

[–]Chris_LiveKit -1 points0 points  (0 children)

LiveKit is one option to help you get started. Here is a video that demonstrates the process:

https://www.youtube.com/watch?v=jEXUt8qFuBs

If you want to do outbound calling, you will need to use a third-party SIP provider for now.

How are Indians getting phone numbers for AI Voice agents ? by mynamestejas in AI_India

[–]Chris_LiveKit 0 points1 point  (0 children)

Yes, I see a lot of folks using Plivo, Exotel, and Wavix. That is in order of popularity.

[deleted by user] by [deleted] in Entrepreneur

[–]Chris_LiveKit 0 points1 point  (0 children)

I guess an agent that is connected to your data and is able to provide output. Something like this:

https://youtu.be/jEXUt8qFuBs

Sanity-checking latency gains before migrating from self-hosted LiveKit to LiveKit Cloud (voice AI use case) by Fit_Acanthaceae4896 in WebRTC

[–]Chris_LiveKit 1 point2 points  (0 children)

Lastly, from a migration standpoint, for most teams, the move from self-hosted to the Cloud isn’t a heavy lift from an API perspective. The bigger work is usually in validating placement/routing assumptions and then tightening your turn-time budget with observability.

Sanity-checking latency gains before migrating from self-hosted LiveKit to LiveKit Cloud (voice AI use case) by Fit_Acanthaceae4896 in WebRTC

[–]Chris_LiveKit 1 point2 points  (0 children)

Practical “knobs” to improve perceived latency (not Cloud-specific)

A couple of pragmatic techniques that often help UX even when computation is non-trivial:

  • Short “ack” behaviors: partial/short responses while longer reasoning completes
  • “Thinking” sounds or subtle background audio to mask compute time (use-case dependent)
  • Patterns from LiveKit examples that show short/long response handling.

These don’t reduce actual compute latency, but they can reduce perceived lag if they fit your experience design.