To address the degradation of speaker recognition performance caused by emotional variation, a new emotional speaker recognition system is proposed. First, emotion recognition is introduced as a front-end processing module to classify neutral and emotional speech. Then, prosody modification is applied to the emotional speech: Gaussian normalization, Gaussian mixture models (GMM), and support vector regression (SVR) are used to build fundamental-frequency mapping rules from emotional speech to neutral speech, and duration is adjusted according to the average linear rate of change. Finally, the prosody-modified emotional speech is passed to the recognizer. Experimental results show that the proposed system effectively improves the performance of emotional speaker recognition, with a recognition rate significantly higher than that of the traditional method, and that emotional speech modified in fundamental frequency and duration is closer to neutral speech.
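To make the prosody-modification step more concrete, the sketch below illustrates how a Gaussian-normalization mapping could shift the fundamental-frequency (F0) statistics of emotional speech toward those of neutral speech, with duration rescaled by an average linear rate of change. This is only a minimal illustration under stated assumptions: the function names, the 0 Hz unvoiced-frame convention, and the example statistics (mu_emo, sigma_emo, mu_neu, sigma_neu) are not taken from the paper, and the GMM- and SVR-based mapping rules are not shown.

```python
import numpy as np

def gaussian_normalize_f0(f0_emotional, mu_emo, sigma_emo, mu_neu, sigma_neu):
    """Map an emotional-speech F0 contour toward the neutral-speech distribution
    by matching the speaker's F0 mean and standard deviation (Gaussian normalization)."""
    f0 = np.asarray(f0_emotional, dtype=float)
    mapped = f0.copy()
    voiced = f0 > 0  # assumption: 0 Hz marks unvoiced frames, which are left unchanged
    mapped[voiced] = mu_neu + (f0[voiced] - mu_emo) * (sigma_neu / sigma_emo)
    return mapped

def adjust_duration(emotional_duration, avg_linear_rate):
    """Rescale the emotional utterance's duration toward neutral speech using an
    average linear rate of change (assumed here to be a neutral/emotional ratio)."""
    return emotional_duration * avg_linear_rate

# Example: per-frame F0 contour (Hz) of an emotional utterance (values are illustrative)
f0_emo = np.array([0.0, 228.0, 241.0, 255.0, 0.0])
f0_mapped = gaussian_normalize_f0(f0_emo, mu_emo=240.0, sigma_emo=30.0,
                                  mu_neu=185.0, sigma_neu=18.0)
new_duration = adjust_duration(emotional_duration=2.4, avg_linear_rate=0.9)
```

In this simple rule, each voiced F0 value is z-scored against the emotional-speech statistics and re-projected onto the neutral-speech mean and variance; the GMM and SVR mappings described in the abstract would instead learn a nonlinear regression between the two distributions.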