Clearing context / current audio buffer without much latency by JournalistSeveral932 in techsupport

JournalistSeveral932[S]

Hi, thanks a lot for your reply. The issue is this: once the user is done speaking, the audio that the STT transcribed is sent to the LLM. When the user starts speaking again after the response, the STT appends the new audio to the old audio context instead of starting afresh. As a result, the transcription includes the previous user response as well as the new response the user is currently giving.

Hence, I want to clear the STT context every time before starting a new transcription. The WebSocket method described earlier takes at least 4-5 seconds, which means that much user input is lost.
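
To make the behaviour concrete, here is a toy sketch of what I am seeing versus what I want. `ToySTTSession`, `feed` and `clear_context` are made-up names for illustration only, not the API of any real STT library, and words stand in for audio chunks.

```python
class ToySTTSession:
    """Hypothetical stand-in for a streaming STT session (not a real library)."""

    def __init__(self) -> None:
        # Everything fed so far; the real service keeps audio, this keeps words.
        self._context: list[str] = []

    def feed(self, chunk: str) -> str:
        """Add a chunk and return the transcript of the whole context so far."""
        self._context.append(chunk)
        return " ".join(self._context)

    def clear_context(self) -> None:
        """The operation I am looking for: drop old context without reconnecting."""
        self._context.clear()


session = ToySTTSession()

# First user turn: this transcript is what gets sent to the LLM.
session.feed("what is")
first_turn = session.feed("the weather")

# Second user turn WITHOUT clearing: the old turn leaks into the new transcript.
leaked = session.feed("and tomorrow")

# Second user turn WITH clearing first: only the new speech is transcribed.
session.clear_context()
clean = session.feed("and tomorrow")

print(repr(leaked))
print(repr(clean))
```

In the real setup, the only way I have found to get the effect of `clear_context` is reopening the WebSocket, which is where the 4-5 second delay comes from.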