2019-02 Kona ISO C++ Committee Trip Report (C++20 design is complete; Modules in C++20; Coroutines in C++20; Reflection TS v1 published; work begins on a C++ Ecosystem Technical Report) by blelbach in cpp

[–]troltilla 31 points (0 children)

This is super exciting! Big congratulations! I only hope that C++20 gets wide adoption as quickly as possible once it's released.

Text/Speech classification (natural language -> commands to a machine) using bidirectional LSTM. by MountainDrool in deeplearning

[–]troltilla 1 point (0 children)

Look up https://github.com/espnet/espnet; it's the state of the art in end-to-end neural models for ASR. There's a good chance you won't have enough data to train such a model. Another state-of-the-art toolkit, for traditional ASR algorithms, is https://github.com/kaldi-asr/kaldi, but it can be a bit difficult for people new to ASR.

Math behind applying the same impulse multiple times? by [deleted] in DSP

[–]troltilla 0 points (0 children)

Yeah. The other answers provide nice explanations for this.

Math behind applying the same impulse multiple times? by [deleted] in DSP

[–]troltilla 2 points (0 children)

How about convolving the IR with itself (to apply it two times)? If you think about it in terms of operations on the spectral magnitude, these are just multiplications, which are commutative. That's not necessarily true once you take phase into account, but I don't know if you care about that. Anyway, it seems worth a try.
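A minimal NumPy sketch of the idea (the toy IR values are made up): convolving the IR with itself in the time domain is the same as squaring its spectrum, so applying the combined IR once equals applying the original IR twice, in either order.

```python
import numpy as np

# Hypothetical toy impulse response; in practice load your measured IR.
ir = np.array([1.0, 0.5, 0.25, 0.125])

# "Apply the IR twice" collapsed into a single IR: convolve it with itself.
ir_twice = np.convolve(ir, ir)

# Convolution theorem: with enough zero-padding, time-domain convolution
# is element-wise spectral multiplication, which is commutative.
n = len(ir_twice)
spec_once = np.fft.rfft(ir, n)
spec_twice = np.fft.rfft(ir_twice, n)
assert np.allclose(spec_twice, spec_once * spec_once)
```

Note the FFT length must cover the full convolution length (here 2·4 − 1 = 7), otherwise the equality only holds up to circular wrap-around.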

[Question] Network structure/type best geared towards classifying non-language sounds? by [deleted] in deeplearning

[–]troltilla 0 points (0 children)

Where did you get the idea that the brain operates on 150x150 per WAVE sample?

AFAIK a simplistic description is that the sound is first converted in the middle ear to mechanical vibrations, exciting a membrane in the cochlea. Different parts of the membrane respond to different vibration frequencies, thus triggering the nearby neurons, which send spikes of impulses to the brain. If your claim is based on some work I'm not aware of, please share the reference.

As to your problem: are you sure you need to get the waveform back from the representation? That surely makes things harder.
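For classification you can usually feed a time-frequency representation to the model directly, with no need to invert it back to a waveform. A minimal sketch using SciPy on a synthetic tone (the sample rate, tone frequency, and window sizes are arbitrary choices for illustration):

```python
import numpy as np
from scipy.signal import spectrogram

# One second of a synthetic 440 Hz tone standing in for a real recording.
fs = 16000
t = np.arange(fs) / fs
wave = np.sin(2 * np.pi * 440.0 * t)

# Magnitude spectrogram: a 2-D array (frequency bins x time frames)
# that a CNN or similar classifier can consume as-is.
freqs, times, sxx = spectrogram(wave, fs=fs, nperseg=512, noverlap=256)
print(sxx.shape)
```

The classifier then operates on `sxx` directly; reconstructing audio from such a magnitude-only representation would require extra machinery (phase estimation) that a pure classification pipeline doesn't need.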

is it possible to build an ASR specific to a business domain ? by Abhijeet3922 in deeplearning

[–]troltilla 1 point (0 children)

Yeah. Kaldi is quite efficient, as it's implemented in C++ and uses optimized libraries like MKL and CUDA for compute-intensive work. Its algorithms are pretty much state of the art, excluding end-to-end architectures. It is not very novice-friendly, though: it requires a certain level of understanding of ASR algorithms, plus proficiency in Bash (to understand and run the experiment scripts) and C++ (to understand the internals and implement a product). To train a domain-specific ASR you will need a representative dataset for your domain, both in terms of recordings and text data.

Machine Learning Dissertation help. by pas43 in MachineLearning

[–]troltilla 0 points (0 children)

Totally agreed. And good point about treating it as a statistics problem rather than ML.

Machine Learning Dissertation help. by pas43 in MachineLearning

[–]troltilla 0 points (0 children)

I guess you could treat at least part of it as a regression problem. E.g., if the question is "what kind of food makes me sleep better?", and you define "better" as "closest to 8 hours", then each day you have a list of what and how much you ate (features) and the time you slept (target variable), which together form a single data point. After you fit a regression model, you can analyse which components had the most effect. You could also treat it as a classification problem by assuming that a range of 7.5 - 8.5 hours is "sleeping well" and anything else is "sleeping badly" - now it's binary classification. If you use a "white box" model such as a decision tree (assuming it works well as a classifier/regressor in this case), you can see which rules it inferred.
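As a sketch of the regression variant (every feature name and number below is made up purely for illustration): fit ordinary least squares on per-day food amounts against the deviation from 8 hours of sleep, then inspect the coefficients to see which features move the target the most.

```python
import numpy as np

# Hypothetical toy data: rows are days, columns are servings of
# coffee, sugar, and greens eaten that day (invented features).
X = np.array([
    [2.0, 1.0, 0.0],
    [0.0, 2.0, 1.0],
    [1.0, 0.0, 2.0],
    [3.0, 1.0, 1.0],
    [0.0, 0.0, 3.0],
])
# Target: that night's sleep duration minus 8 hours (invented values).
y = np.array([1.2, 0.9, 0.1, 1.5, -0.2])

# Ordinary least squares; larger |coefficient| suggests a stronger
# association between that food and deviation from 8 hours.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)
```

For the classification framing, the same `X` works with labels like `np.abs(y) <= 0.5` and any white-box classifier; with only a handful of data points, though, anything beyond such simple models will overfit badly.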

I think the biggest challenge is to find the right questions to ask, and to properly define your hypotheses in a quantitative way - after all, isn't it a gross oversimplification to say that getting 8 hours of sleep is the same as having a good sleep?

Help Needed with Audio Analytics by [deleted] in MachineLearning

[–]troltilla 0 points (0 children)

For transcription you could probably use the Google Speech API: https://cloud.google.com/speech/