This paper proposes a method that exploits inter-frame correlation at the feature extraction stage. For each frame, the feature vectors of the n preceding frames, the n following frames, and the frame itself (2n+1 frames in total) are concatenated into one large feature vector. The Karhunen-Loeve transform is then applied to this concatenated vector to reduce its dimensionality, and the transformed data serve as the feature vector of the current frame in subsequent training and recognition. The method was used for syllable training and recognition in a CDCPM-based speech recognition system. Experimental results show that the Karhunen-Loeve transform performs well at a feature extraction stage that takes inter-frame correlation into account, and the authors consider it to have broad application prospects.
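
The frame-stacking and dimensionality-reduction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 12-dimensional input features, the choice n = 2, the output dimension, the random input data, and the edge handling (repeating the first and last frame) are all assumptions made for illustration. The Karhunen-Loeve transform is computed here as an eigen-decomposition of the sample covariance matrix of the stacked vectors.

```python
import numpy as np


def stack_frames(features, n):
    """Concatenate each frame with its n previous and n following frames.

    features: array of shape (T, d), one d-dimensional vector per frame.
    Returns an array of shape (T, (2n + 1) * d).
    Assumption: edge frames are padded by repeating the first/last frame.
    """
    num_frames = features.shape[0]
    padded = np.pad(features, ((n, n), (0, 0)), mode="edge")
    # Block k holds frame t + k - n for every t, so block n is the frame itself.
    return np.hstack([padded[k:k + num_frames] for k in range(2 * n + 1)])


def fit_klt(stacked, out_dim):
    """Estimate a KLT basis from stacked training vectors.

    Returns the mean vector and the out_dim eigenvectors of the sample
    covariance matrix that have the largest eigenvalues.
    """
    mean = stacked.mean(axis=0)
    cov = np.cov(stacked - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:out_dim]
    return mean, eigvecs[:, order]


def apply_klt(stacked, mean, basis):
    """Project stacked vectors onto the KLT basis."""
    return (stacked - mean) @ basis


# Hypothetical data: 200 frames of 12-dimensional features, n = 2.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 12))
stacked = stack_frames(frames, n=2)            # 5 frames * 12 dims = 60 dims
mean, basis = fit_klt(stacked, out_dim=12)
reduced = apply_klt(stacked, mean, basis)      # back to 12 dims per frame
```

In a recognizer, the mean and basis would presumably be estimated once on training data and then reused unchanged at recognition time, so that both stages see features in the same transformed space.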