This article briefly describes the principles of GMM-HMM speech recognition and its modeling and testing process.
1. What is Hidden Markov Model?
Three problems to be solved by HMM:
1) Likelihood
2) Decoding
3) Training
2. What is GMM? How to use GMM to find the probability of a phoneme?
3. Speech recognition with GMM + HMM
3.1 Recognition
3.2 Training
3.2.1 Training the parameters of the GMM
3.2.2 Training the parameters of the HMM
====================================================================
1. What is Hidden Markov Model?
ANS: a Markov process whose state nodes are hidden (unobservable) while its output nodes are visible (see detailed explanation).
The hidden nodes represent the states, and the visible nodes represent the speech we hear or the time-series signals we observe.
We first specify the structure of the HMM. To train the HMM model, given training time-series signals y1 ... yT (training samples), we use maximum likelihood estimation (MLE, typically implemented with the EM algorithm) to estimate the parameters:
1. The initial probabilities of the N states
2. The state transition probabilities a_ij
3. The output probabilities b_j
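As a concrete illustration, here is a minimal sketch of these three parameter sets for a hypothetical 3-state HMM with two discrete observation symbols (all values below are made up purely for illustration):

```python
import numpy as np

# Hypothetical 3-state, left-to-right HMM with 2 discrete observation symbols.
pi = np.array([1.0, 0.0, 0.0])           # 1. initial probabilities of the N states
A = np.array([[0.6, 0.4, 0.0],           # 2. state transition probabilities a_ij
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]])
B = np.array([[0.9, 0.1],                # 3. output probabilities b_j(x)
              [0.2, 0.8],
              [0.5, 0.5]])

# Every row of pi, A, and B must be a valid probability distribution.
assert np.allclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```

In speech systems, B is usually not a discrete table but a GMM per state; the discrete table is used here only to keep the parameter structure easy to see.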
--------------
In speech processing, a word is composed of several phonemes;
each HMM corresponds to a word or a phoneme:
a word is modeled as a sequence of states, and each state corresponds to a phoneme.
There are three problems to be solved with HMM:
1) Likelihood: the probability that an HMM generates a given observation sequence x <the Forward algorithm>
Here α_t(s_j) is the probability that the HMM is in state j at time t and has generated the observations {x1, ..., xt}. It satisfies the recursion

α_t(s_j) = [ Σ_i α_{t-1}(s_i) · a_ij ] · b_j(x_t)

where a_ij is the transition probability from state i to state j,
and b_j(x_t) is the probability of generating x_t in state j.
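The forward recursion can be sketched in a few lines. Below is a minimal NumPy implementation for discrete observations; the example parameter values at the bottom are hypothetical, chosen only to exercise the function:

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward algorithm: total likelihood P(obs | HMM) via alpha_t(s_j)."""
    N, T = len(pi), len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                 # initialization: alpha_1(s_j)
    for t in range(1, T):
        # alpha_t(s_j) = [ sum_i alpha_{t-1}(s_i) * a_ij ] * b_j(x_t)
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha[-1].sum()                       # sum over final states

# Hypothetical 2-state HMM with 2 discrete observation symbols.
pi = np.array([0.5, 0.5])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
likelihood = forward(pi, A, B, [0, 1, 0])        # P(x1=0, x2=1, x3=0)
```

In practice the probabilities shrink exponentially with T, so real implementations work with log probabilities or per-frame scaling factors.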
2) Decoding: given an observation sequence x, find the most likely underlying HMM state sequence <the Viterbi algorithm>
In practice the computation is pruned: instead of evaluating the probability of every possible state sequence, the Viterbi approximation is used, which at each time step from 1 to t records only the best-scoring predecessor state and its probability.
Let V_t(s_j) be the maximum probability over all state paths that end in state j at time t:

V_t(s_j) = max_i V_{t-1}(s_i) · a_ij · b_j(x_t)

and record the backpointer bt_t(s_j) = argmax_i V_{t-1}(s_i) · a_ij, i.e. which state at time t-1 most probably led to state j at time t.
The most likely state sequence is then recovered by backtracking along the recorded backpointers from the best final state.
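The Viterbi recursion with backtracking can be sketched as follows. This is again a minimal NumPy version for discrete observations, with hypothetical example parameters:

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Viterbi algorithm: most likely state sequence for the observations."""
    N, T = len(pi), len(obs)
    V = np.zeros((T, N))                   # V_t(s_j): best path prob. ending in j
    back = np.zeros((T, N), dtype=int)     # backpointers bt_t(s_j)
    V[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        # V_t(s_j) = max_i V_{t-1}(s_i) * a_ij * b_j(x_t)
        scores = V[t - 1][:, None] * A     # scores[i, j] = V_{t-1}(s_i) * a_ij
        back[t] = scores.argmax(axis=0)
        V[t] = scores.max(axis=0) * B[:, obs[t]]
    # Backtracking from the best final state along recorded predecessors:
    path = [int(V[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical 2-state HMM with 2 discrete observation symbols.
pi = np.array([0.5, 0.5])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
best_path = viterbi(pi, A, B, [0, 1, 0])
```

Note the structural similarity to the forward algorithm: the only change is replacing the sum over predecessor states with a max (plus remembering which predecessor achieved it).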
3) Training: given observation sequences x, estimate the HMM parameters λ = {a_ij, b_j} <the EM (forward-backward, i.e. Baum-Welch) algorithm>
This part is covered in section "3. Speech recognition with GMM + HMM" together with GMM training.