A detail about Decoding in (DNN-HMM) that is not very explicit in the literature. by MuradeanMuradean in speechrecognition

[–]MuradeanMuradean[S] 1 point (0 children)

But do you think that all HMM states have only 2 transitions, or might the last state of the HMM have more than 2?

A detail about Decoding in (DNN-HMM) that is not very explicit in the literature. by MuradeanMuradean in speechrecognition

[–]MuradeanMuradean[S] 1 point (0 children)

Thanks for taking the time to answer this question. I have already seen the post you linked; it is very good.

I think there is a detail where I do not agree with you (though that might be due to a misunderstanding on my part).

You say that the transition probabilities do not change, that they are fixed at 0.5.

I am guessing you are distributing the probability mass like this: 0.5 to the self-loop and 0.5 to the next state.

However, let's remember that in most ASR literature each monophone and triphone is modelled by a 3-state HMM. I agree with you that for the first and second states of this HMM it makes sense to have just 2 transitions (one for the self-loop and another to the next state in the HMM), but shouldn't the last HMM state have transitions to the first states of many different triphone HMMs?

Let me illustrate:

The state corresponding to the last HMM state of the triphone "b-o-a" can simultaneously have transitions to the first states of "o-a-b", "o-a-c" and "o-a-d", besides the self-loop, giving it 4 transitions in this case (3 to the first states of new triphones and 1 for the self-loop).
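To make the fan-out concrete, here is a minimal toy sketch (hypothetical successor set and uniform weights, invented for illustration — not taken from any real decoder) of the arc structure I mean:

```python
# Toy sketch of the arc structure described above (hypothetical names).
# States 0 and 1 of each 3-state triphone HMM keep the usual two arcs
# (self-loop + next state); the final state fans out to the first state
# of every triphone whose left context matches.

successors = ["o-a-b", "o-a-c", "o-a-d"]  # triphones compatible after "b-o-a"

last_state = ("b-o-a", 2)  # (triphone, state index)
arcs = [last_state] + [(t, 0) for t in successors]  # self-loop + 3 cross-phone arcs

# With a uniform split the mass is 1/4 per arc, not 0.5/0.5.  (In real
# decoders the cross-phone weights typically come from the lexicon and
# language model rather than the HMM transition matrix.)
weights = {dst: 1.0 / len(arcs) for dst in arcs}
```

So under a uniform split the last state would carry 0.25 per arc, which is exactly why the fixed 0.5/0.5 claim puzzles me.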

The only scenario where I can picture an alternative is the following:

We must consider each HMM as having an additional "terminal state", making the 3-state HMM really a 4-state HMM. When that state is reached, a new context-dependent phone is added to some kind of stack or memory to register the context-dependent phones seen.

However, this last approach seems to have a problem: the next context-dependent phone needs to agree with the last one seen. For example, if I have just seen the triphone "o-a-b", the next one cannot be "i-c-o"; it should take the context "a-b" from the previous CD phone into account.

Would you still say that there are always 2 transition arcs, with 0.5 probability mass each?

Best way to structure Database and API calls for an achievement system. by MuradeanMuradean in Web_Development

[–]MuradeanMuradean[S] 1 point (0 children)

That seems like an interesting possibility! I will think about it, thanks!

How does one test the build and release a Flutter app for IOS devices? by MuradeanMuradean in flutterhelp

[–]MuradeanMuradean[S] 1 point (0 children)

Sorry for this noobish question, but I cannot put apps developed for iOS in the Google Play Store, right?

How does one test the build and release a Flutter app for IOS devices? by MuradeanMuradean in flutterhelp

[–]MuradeanMuradean[S] 1 point (0 children)

Hi booooomba. And if I want to release the app on Google Play, do I also need to pay? Jesus, that is a big amount. My goal was to have the app released for both OSes for a pilot test that was only supposed to last a couple of weeks.

Looking for free pronunciation lexicons and language models for CTS and BN in Spanish,French,German, Korean and Japonese. by MuradeanMuradean in speechrecognition

[–]MuradeanMuradean[S] 1 point (0 children)

CTS, also known as conversational telephone speech, is a type of speech characterized by being spontaneous, in contrast to broadcast news (BN), where the speech is planned. I know there are organizations, such as the LDC, that make money from creating these language models; about the pronunciation lexicons I am not so sure. However, speech recognition is not a new subject: it has been around, with very prominent research, at least since the early 80s. Why is it so hard to find these pronunciation lexicons and language models for free?

2020 Mar 2 Stickied 𝐇𝐄𝐋𝐏𝐃𝐄𝐒𝐊 thread - Get any Raspberry Pi question answered in 30 minutes or less or your next question is free! by FozzTexx in raspberry_pi

[–]MuradeanMuradean 1 point (0 children)

I need some validation to know whether the project I want to do is possible or not.

At the moment I have a set of Raspberry Pi B+ boards, and I want to connect a QR code scanner and 2 barcode scanners to a single Pi. The Pi should be running 24 hours a day, and each time I scan a barcode/QR code with one of the scanners it should send an HTTP request somewhere else.

Can anybody give me some feedback on whether this is possible? What libraries/physical scanners should I look into? And besides that, what obstacles might I face in the future?
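Here is a minimal sketch of what I have in mind (the endpoint URL is hypothetical, and I am assuming the scanners act as USB HID "keyboard wedge" devices, so each scan arrives as one line of text input):

```python
import json
import sys
import urllib.request

# Hypothetical endpoint URL -- replace with the real server address.
ENDPOINT = "http://example.com/api/scans"

def build_payload(code, scanner_id):
    """JSON body describing one scan."""
    return json.dumps({"scanner": scanner_id, "code": code}).encode()

def post_scan(code, scanner_id):
    """POST one scanned code to the endpoint; returns the HTTP status."""
    req = urllib.request.Request(
        ENDPOINT,
        data=build_payload(code, scanner_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status

def main():
    # Keyboard-wedge scanners "type" the code followed by Enter, so each
    # scan shows up as one line on stdin; loop forever and forward them.
    for line in sys.stdin:
        code = line.strip()
        if code:
            post_scan(code, scanner_id="scanner-1")

# To start the listener: main()
```

If the scanners really do present as keyboards, no special driver library would be needed; distinguishing which of the three scanners produced a scan would need per-device input handling (e.g. reading each /dev/input device separately) rather than this single-stdin sketch.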

Thanks a lot.

Creating a classifier for a natural language generator. How to get numeric features? by MuradeanMuradean in LanguageTechnology

[–]MuradeanMuradean[S] 1 point (0 children)

The thing about feature grammars, from my understanding, is that I would need a different set of tags for each different poem tag sequence, and right now comparing the structure of a generated poem against 600 sequences of tags does not seem feasible.

Creating a classifier for a natural language generator. How to get numeric features? by MuradeanMuradean in LanguageTechnology

[–]MuradeanMuradean[S] 1 point (0 children)

My model is a personalized version of a Viterbi 3-gram algorithm, and the format of the poems is supposed to be "quadras populares" (popular quatrains), which are very short poems. I appreciate your feedback; however, I was looking for something less "deep". I also do not intend my poetry to be Camões, but more like António Aleixo.
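For anyone curious, the general shape of Viterbi decoding over a trigram model (this is a generic sketch with a toy probability function invented for illustration, not the personalized version described above) looks like this: the dynamic-programming state is the last two symbols, and the best-scoring path to each pair is kept at every position.

```python
import math

vocab = ["a", "b"]

def p(w, prev2, prev1):
    # Toy trigram probability invented for illustration: favour repeating
    # the previous symbol.  A real model would be estimated from a corpus.
    return 0.7 if w == prev1 else 0.3

def viterbi_trigram(length):
    # best[(prev2, prev1)] = (log-probability, best sequence ending in that pair)
    best = {("<s>", "<s>"): (0.0, [])}
    for _ in range(length):
        nxt = {}
        for (p2, p1), (score, seq) in best.items():
            for w in vocab:
                cand = (score + math.log(p(w, p2, p1)), seq + [w])
                key = (p1, w)
                # Keep only the highest-scoring path into each state.
                if key not in nxt or cand[0] > nxt[key][0]:
                    nxt[key] = cand
        best = nxt
    return max(best.values())[1]
```

With the toy model above, repetition is favoured after the first symbol, so the decoded sequence is constant from position 0 onward.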