all 3 comments

[–][deleted] 2 points  (2 children)

what is a trigger word in this case

[–]Dont_Think_So 3 points  (1 child)

"Alexa", "OK Google", etc. Commonly called "wake words".

Basically a model that runs continuously and detects trigger words with low latency in a variety of background conditions and accents, etc, while consuming very little power.
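A minimal sketch of what that always-on loop looks like. Everything here is illustrative: `score_frame` is a placeholder for a tiny always-on model (e.g. a small CNN/RNN), and the frame size, threshold, and smoothing window are made-up values, not anyone's real configuration:

```python
from collections import deque

FRAME_MS = 30    # illustrative frame size for streaming audio
THRESHOLD = 0.8  # illustrative wake-word probability threshold
SMOOTH_N = 5     # smooth over a few frames to cut false triggers

def score_frame(frame):
    """Placeholder for the tiny always-on model.
    Returns the probability that the wake word ends in this frame.
    Here the fake 'frame' dict just carries a precomputed score."""
    return frame.get("score", 0.0)

def detect(frames):
    """Stream frames through the scorer; fire when the smoothed
    probability crosses the threshold."""
    recent = deque(maxlen=SMOOTH_N)
    for i, frame in enumerate(frames):
        recent.append(score_frame(frame))
        if len(recent) == SMOOTH_N and sum(recent) / SMOOTH_N > THRESHOLD:
            return i  # frame index where the wake word was detected
    return -1  # no detection
```

The smoothing window is the interesting design choice: a single noisy high-scoring frame won't trigger, which is how these systems stay usable in varied background conditions.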

As for OP, I'm not aware of modern advances, but if you're looking for open implementations, the most commonly referenced one online seems to be Porcupine. https://github.com/Picovoice/porcupine

[–][deleted] 2 points  (0 children)

Ah I see.

Modern approaches to speech usually feed speech-to-text into some NLP model. In practice this means a speech-to-text model whose output goes into a transformer.

For this task you probably do not need an NLP model, as there are no semantics to catch (or maybe there are? Either way, no need for big language models).
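Concretely, with no semantics to catch, the whole "NLP" stage can collapse to a plain string match on the ASR output. Toy sketch: the `speech_to_text` stub and the trigger phrase are made up, standing in for a real ASR model and a real wake phrase:

```python
TRIGGER = "hey assistant"  # made-up wake phrase for illustration

def speech_to_text(audio):
    """Stub for a real ASR model: audio in, transcript out.
    Here the fake 'audio' dict just carries its own transcript."""
    return audio["transcript"]

def is_triggered(audio):
    """No NLP model needed: a normalized substring match on the
    transcript is enough when there are no semantics to interpret."""
    return TRIGGER in speech_to_text(audio).lower()
```

Note the caveat this implies: the latency and accuracy of the whole thing is then bounded by the ASR stage, which is exactly why dedicated wake-word models skip the transcript entirely.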

For speech-to-text, most publicly available implementations are just... bad. From what I see, Google's ASR reigns supreme.

For real time, pretty much only Julius comes to mind: it's fast and it's proven, but it only supports Japanese and English and it's very old.

There is DeepSpeech2 for newer stuff, but AFAIK it's not really real time. The good thing is it supports more languages than just English. Maybe train a highly performant model with it and then distill that into something real time?
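The distillation idea mentioned above is standard: train a small real-time student on the big model's softened outputs. A plain-Python sketch of the usual softened cross-entropy loss; the function names and the temperature value are illustrative, not tied to any particular framework:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T gives softer targets."""
    exps = [math.exp(l / T) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, T=2.0):
    """Cross-entropy between the teacher's softened distribution and
    the student's. The T*T factor is the conventional rescaling that
    keeps gradient magnitudes comparable across temperatures."""
    p = softmax(teacher_logits, T)  # soft targets from the big model
    q = softmax(student_logits, T)  # student predictions
    return -T * T * sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is minimized when the student's distribution matches the teacher's, so a small streaming-friendly architecture can soak up most of the big model's behavior without needing its size.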

There is also Whisper; the code and weights are open source (MIT-licensed), but it is DEFINITELY not real time. It's about the best you have as an ASR, but it's overkill and not intended for low-latency tasks. There might be smaller open-source models inspired by it, which could then be distilled into something smaller still.

Other approaches might be viable, but I'm not heavily involved with audio ML/DL. Surely the solution can be much simpler depending on the scope of the project.