Existing algorithms for data with multi-valued attributes and multiple class labels include the multi-valued and multi-labeled classifier (MMC) and the multi-valued and multi-labeled decision tree (MMDT). Building on a study of these two algorithms, this paper proposes a new similarity formula, sim_3, and, by improving MMDT's consistency-based evaluation method, proposes SCC_SP, an algorithm for multi-valued, multi-labeled data. SCC_SP jointly considers the similarity and the consistency of two multi-label sets, which makes it easier to select the best splitting attribute. Comparative experiments show that, under the same prediction mechanism, SCC_SP achieves higher prediction accuracy than MMDT and handles multi-valued, multi-labeled data better.
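The abstract does not state sim_3 itself, so the formula cannot be reproduced here. To illustrate what "similarity of two multi-label sets" can mean, the following minimal sketch uses the Jaccard index. This is a generic stand-in chosen for illustration, not the paper's sim_3 and not the measure used by MMDT. The function name and the label names are hypothetical.

```python
def jaccard_similarity(labels_a, labels_b):
    """Jaccard index of two label sets: |A & B| / |A | B|.

    Assumption: a generic set-similarity measure used only for
    illustration; it is NOT the sim_3 formula proposed in the paper.
    """
    a, b = set(labels_a), set(labels_b)
    if not a and not b:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical label sets of two child nodes produced by a candidate split.
left_labels = {"c1", "c2", "c3"}
right_labels = {"c2", "c3", "c4"}

# Two shared labels out of four distinct labels.
score = jaccard_similarity(left_labels, right_labels)
print(score)
```

A decision-tree learner for multi-labeled data could compare such scores across candidate attributes. How SCC_SP actually combines similarity with consistency is defined in the body of the paper, not in this abstract.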