To address the shortcomings of the traditional hidden Markov model (HMM), this paper proposes a method that applies a segment duration model in the post-processing stage of recognition, and uses it in an HMM-based Chinese recognition system. The method rescores the decoding results of the recognition system with a normalized segment duration model, then compares the scores obtained before and after rescoring to select the more reliable recognition result. Experiments show that applying the duration model in this way significantly improves recognition performance and greatly reduces the number of insertion errors in the output. The reported data show that the method lowers the syllable error rate of the recognition system by about 10%, and that the final insertion and deletion error rates are both below 1%.
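The rescoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the duration model, the normalization, or the rule for combining scores, so everything below is an assumption made for illustration: a per-syllable Gaussian over frame counts, normalization by the number of segments, and a weighted sum of the first-pass score and the duration score. The names `Hypothesis`, `duration_score`, `rescore`, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Hypothesis:
    syllables: List[str]   # decoded syllable labels
    durations: List[int]   # number of frames aligned to each syllable
    hmm_score: float       # log score from the first (HMM) decoding pass


def gaussian_log_pdf(x: float, mean: float, std: float) -> float:
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def duration_score(hyp: Hypothesis,
                   stats: Dict[str, Tuple[float, float]],
                   default: Tuple[float, float] = (20.0, 10.0)) -> float:
    """Average per-segment duration log-likelihood.

    Dividing by the segment count is the assumed normalization: it keeps
    hypotheses with different numbers of segments comparable.
    """
    total = 0.0
    for syl, dur in zip(hyp.syllables, hyp.durations):
        mean, std = stats.get(syl, default)  # assumed (mean, std) in frames
        total += gaussian_log_pdf(dur, mean, std)
    return total / max(len(hyp.syllables), 1)


def rescore(hyps: List[Hypothesis],
            stats: Dict[str, Tuple[float, float]],
            weight: float = 1.0) -> Hypothesis:
    """Return the hypothesis with the best combined score (assumed weighted sum)."""
    return max(hyps, key=lambda h: h.hmm_score + weight * duration_score(h, stats))


# Toy data: the first hypothesis contains an implausibly short inserted syllable.
stats = {"ba": (20.0, 5.0), "a": (18.0, 5.0)}
with_insertion = Hypothesis(["ba", "a", "ba"], [18, 2, 20], hmm_score=-100.0)
without_insertion = Hypothesis(["ba", "ba"], [20, 20], hmm_score=-101.0)

first_pass_best = max([with_insertion, without_insertion], key=lambda h: h.hmm_score)
rescored_best = rescore([with_insertion, without_insertion], stats)
```

In this toy case the first pass prefers the hypothesis containing the two-frame inserted syllable, while the rescored ranking prefers the one without it, which is the kind of insertion error the abstract reports reducing. The numbers are made up and say nothing about the paper's actual results.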