The position of a speech frame in the acoustic feature space can help the decoder screen candidate paths, yet conventional speech recognition systems make no use of this positional information. To address this shortcoming, this paper proposes a guidance probability model that describes the probability that a speech frame belongs to each local region of the acoustic feature space, and applies it to recognition. With guidance probabilities, the decoder concentrates its search on the most promising local region of the acoustic feature space: it retains and extends paths that pass through this region while de-emphasizing paths that do not. Experimental results show that the decoding algorithm incorporating guidance probabilities reduces the Chinese character error rate by 10.95% relative, without significantly increasing decoding complexity. Analysis of the results indicates that decoding with the acoustic position information of speech frames discriminates among candidate paths more effectively and thereby lowers the misrecognition rate.
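The idea of re-weighting decoding paths by a frame's position in the acoustic feature space can be illustrated with a small sketch. The abstract does not give the form of the guidance probability model or the fusion rule, so everything below is an assumption made for illustration: regions are represented by hypothetical centroids, the guidance probability is a softmax over negative squared distances, and it is fused log-linearly with the acoustic score before beam pruning. The names `guidance_probs`, `fused_score`, `prune`, `temperature`, `weight`, `floor`, and `beam` are all hypothetical and not taken from the paper.

```python
import math


def guidance_probs(frame, centroids, temperature=1.0):
    """Soft-assign a frame to local regions of the feature space.

    Assumption: each region is summarized by a centroid, and the guidance
    probability is a softmax over negative squared Euclidean distances.
    """
    scores = [
        -sum((x - c) ** 2 for x, c in zip(frame, centroid)) / temperature
        for centroid in centroids
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fused_score(acoustic_logp, region, probs, weight=0.5, floor=1e-10):
    """Log-linear fusion (assumed): acoustic log-prob + weight * log guidance prob."""
    return acoustic_logp + weight * math.log(max(probs[region], floor))


def prune(hyps, probs, beam=5.0, weight=0.5):
    """Keep hypotheses whose fused score is within `beam` of the best one.

    hyps: list of (name, acoustic_logp, region_index) tuples.
    """
    scored = [
        (name, fused_score(logp, region, probs, weight))
        for name, logp, region in hyps
    ]
    best = max(score for _, score in scored)
    return [(name, score) for name, score in scored if score >= best - beam]


# Toy example: two regions, and a frame that lies close to region 0.
centroids = [[0.0, 0.0], [10.0, 10.0]]
frame = [0.5, 0.5]
probs = guidance_probs(frame, centroids)

# Path "b" has the better acoustic score but passes through the unlikely region 1.
hyps = [("a", -10.0, 0), ("b", -9.0, 1)]
kept_with_guidance = prune(hyps, probs, beam=5.0, weight=0.5)
kept_without_guidance = prune(hyps, probs, beam=5.0, weight=0.0)
```

In this toy setting, setting `weight=0.0` disables the guidance term, so both paths survive pruning; with a positive weight, the path through the unlikely region is penalized enough to fall outside the beam. The extra cost per frame is one distance computation per region, which is consistent in spirit with the claim that decoding complexity does not grow significantly, though the actual cost in the paper's system is not stated here.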