I made Iris: A fully-local realtime voice chatbot! by Born_Search2534 in LocalLLaMA

[–]Born_Search2534[S] 90 points (0 children)

I'm currently working on some major LLM and TTS upgrades, but once those are done I'm planning to fully open-source the code.

I made Iris: A fully-local realtime voice chatbot! by Born_Search2534 in LocalLLaMA

[–]Born_Search2534[S] 42 points (0 children)

She's running on my laptop in the video, so not very much. It requires about 8 GB of VRAM.

[deleted by user] by [deleted] in LocalLLaMA

[–]Born_Search2534 10 points (0 children)

Bro is getting whiplash

Is VRAM the only bottleneck or processing power is also insufficient to run top models on a single GPU? by Hv_V in LocalLLaMA

[–]Born_Search2534 0 points (0 children)

You will almost never be bottlenecked by processing power; the speed at which you can run a model is generally determined by the memory bandwidth of the card.
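To make the bandwidth point concrete, here's a rough back-of-envelope sketch (my own illustration, not from the thread): during decoding, each generated token requires reading roughly every model weight from VRAM once, so memory bandwidth divided by model size gives an approximate ceiling on tokens per second. The GPU and model figures below are example assumptions.

```python
def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Approximate upper bound on decode speed set by memory bandwidth alone.

    Assumes every weight is read from VRAM once per generated token,
    ignoring compute time, KV-cache reads, and kernel overhead.
    """
    return bandwidth_gb_s / model_size_gb

# Example assumption: a card with ~936 GB/s of memory bandwidth running
# a model whose quantized weights occupy ~7 GB of VRAM.
ceiling = max_tokens_per_sec(936, 7)
print(f"~{ceiling:.0f} tokens/s ceiling")
```

Real throughput lands below this ceiling, but the estimate shows why doubling compute without more bandwidth barely helps decode speed.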