To better model the correlation between auditory and visual emotional information, a triple-stream hybrid asynchronous dynamic Bayesian network emotion recognition model (T_AsyDBN) is proposed. MFCC features and local prosodic features based on the fundamental frequency and short-time energy serve as the auditory input streams, which are synchronized at the state level. Facial geometric features and facial animation parameter features form the visual input stream, which is allowed to be asynchronous with the auditory streams at the state level. Experimental results show that the proposed model outperforms the audio-visual two-stream DBN model with state asynchrony constraints, raising the average recognition rate over six emotions from 52.14% to 63.71%.
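The paper does not include code, but the two state-synchronous auditory streams can be illustrated with a minimal feature-extraction sketch. The snippet below is an assumption-laden illustration, not the authors' implementation: it assumes librosa for MFCC extraction, YIN for fundamental-frequency (F0) tracking, RMS as the short-time energy measure, and 25 ms frames with a 10 ms hop; the function name and parameters are hypothetical.

```python
# Sketch: extract the two auditory input streams described in the abstract.
# Assumptions (not from the paper): librosa, 13 MFCCs, YIN-based F0,
# RMS short-time energy, 25 ms frames with a 10 ms hop.
import numpy as np
import librosa

def extract_auditory_streams(wav_path, sr=16000, frame_len=0.025, hop=0.010):
    y, sr = librosa.load(wav_path, sr=sr)
    n_fft = int(frame_len * sr)
    hop_length = int(hop * sr)

    # Stream 1: 13-dimensional MFCC features per frame.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                                n_fft=n_fft, hop_length=hop_length)

    # Stream 2: local prosodic features, i.e. F0 and short-time energy.
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr, hop_length=hop_length)
    energy = librosa.feature.rms(y=y, frame_length=n_fft,
                                 hop_length=hop_length)[0]

    # Truncating to the common frame count keeps the two streams aligned
    # frame by frame, mirroring the state-level synchrony constraint
    # imposed on the auditory streams in the model.
    T = min(mfcc.shape[1], f0.shape[0], energy.shape[0])
    mfcc_stream = mfcc[:, :T].T                               # shape (T, 13)
    prosody_stream = np.stack([f0[:T], energy[:T]], axis=1)   # shape (T, 2)
    return mfcc_stream, prosody_stream
```

In this sketch the shared hop length is what enforces synchrony between the two auditory streams; the visual stream, extracted at the video frame rate, would arrive on a different time base, which is why the model couples it to the audio only loosely, via the state-level asynchrony constraint.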