For speech recognition on large-scale data, training a support vector machine (SVM) becomes difficult. Pre-selecting support vectors is one way to address this problem. This paper proposes a new support vector pre-selection algorithm. First, kernel fuzzy C-means clustering is applied separately to the data of each class in the original dataset, and all the resulting cluster centers are taken as the representative set of that class. Second, drawing on the geometric distribution of support vectors and on the one-versus-one approach to multi-class SVM classification, boundary samples of the original dataset are extracted as pre-selected support vectors for training and prediction. The algorithm is applied to an embedded speech recognition system. Experimental results show that the method improves the training efficiency of the speech recognition system and reduces computational cost while maintaining a high recognition rate.
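The abstract does not spell out the pre-selection procedure, so the following is only a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes a Gaussian kernel, uses one common formulation of kernel fuzzy C-means in which cluster centers are kept in the input space, and assumes a simple boundary rule: for every pair of classes (in the one-versus-one spirit), keep the samples of one class that lie nearest to the cluster centers of the other. The names `kernel_fcm`, `preselect_boundary`, `n_clusters`, `n_neighbors`, and `sigma` are hypothetical.

```python
import math
import random


def sq_dist(x, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (assumed kernel)."""
    return math.exp(-sq_dist(x, v) / sigma ** 2)


def kernel_fcm(points, n_clusters, sigma=1.0, m=2.0, n_iter=50, seed=0):
    """Kernel fuzzy C-means with centers kept in input space (one common variant)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    dim = len(points[0])
    eps = 1e-12  # avoids division by zero when a point coincides with a center
    for _ in range(n_iter):
        weights = []  # weights[k][i] = u_ik^m * K(x_k, v_i)
        for x in points:
            ks = [gaussian_kernel(x, v, sigma) for v in centers]
            inv = [(1.0 - k + eps) ** (-1.0 / (m - 1.0)) for k in ks]
            total = sum(inv)
            weights.append([(w / total) ** m * k for w, k in zip(inv, ks)])
        for i in range(n_clusters):
            denom = sum(w[i] for w in weights)
            if denom > 0:  # keep the old center if every weight underflowed
                centers[i] = [
                    sum(w[i] * x[d] for w, x in zip(weights, points)) / denom
                    for d in range(dim)
                ]
    return centers


def preselect_boundary(data_by_class, n_clusters=2, n_neighbors=3, sigma=1.0):
    """Pick candidate support vectors per class (assumed boundary rule).

    For each ordered pair of classes (a, b), the n_neighbors samples of
    class a closest to each cluster center of class b are kept.
    """
    centers = {c: kernel_fcm(pts, n_clusters, sigma)
               for c, pts in data_by_class.items()}
    selected = {c: set() for c in data_by_class}
    for a, pts_a in data_by_class.items():
        for b in data_by_class:
            if a == b:
                continue
            for v in centers[b]:
                order = sorted(range(len(pts_a)),
                               key=lambda i: sq_dist(pts_a[i], v))
                selected[a].update(order[:n_neighbors])
    return {c: [data_by_class[c][i] for i in sorted(idx)]
            for c, idx in selected.items()}


# Toy usage: two well-separated classes on a line.
toy = {
    "a": [(float(x), 0.0) for x in range(10)],
    "b": [(float(x), 0.0) for x in range(12, 22)],
}
print(preselect_boundary(toy, n_clusters=2, n_neighbors=2, sigma=5.0))
```

On this toy data, the two samples of each class that lie nearest to the other class are kept. The reduced set could then be passed to any SVM trainer in place of the full dataset; how closely this matches the selection rule used in the paper is not stated in the abstract.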