Rough set theory has attracted wide attention and study in fields related to uncertain information processing because of its distinctive capacity for data reduction, and the discretization of continuous attributes is an important step in rough set methods and other inductive learning systems. Treating discretization as a form of information summarization, abstraction, and reduction, this paper uses rough set theory to propose a global discretization algorithm. By defining a consistency measure, the algorithm discretizes globally and thereby remedies the inconsistency introduced by the local discretization method MDLP. Then, while preserving consistency, it further reduces the redundant cut points in the discretization. In the experiments, ID3 and the rough set classification tool ROSETTA are used to validate the proposed discretization method by classification on several large data sets, and the results demonstrate the effectiveness and superiority of the algorithm.
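The abstract does not give the algorithm's definitions, so the following is only a minimal sketch of the two ideas it names: a consistency measure over a discretized decision table, and the removal of redundant cut points while consistency is preserved. It assumes that consistency is the fraction of objects whose discretized attribute vector determines a single decision class (the usual rough set dependency degree) and that redundancy is removed greedily. The function names `discretize`, `consistency`, and `prune_cuts` are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from collections import defaultdict


def discretize(rows, cuts):
    """Map every continuous value to the index of the interval it falls in.

    cuts[j] is the sorted list of cut points for attribute j (an assumed layout).
    """
    return [
        tuple(bisect_right(cuts[j], value) for j, value in enumerate(row))
        for row in rows
    ]


def consistency(rows, labels, cuts):
    """Fraction of objects whose discretized description has a unique class.

    This is an assumed measure, not necessarily the one defined in the paper.
    """
    classes = defaultdict(set)
    counts = defaultdict(int)
    for key, label in zip(discretize(rows, cuts), labels):
        classes[key].add(label)
        counts[key] += 1
    consistent = sum(n for key, n in counts.items() if len(classes[key]) == 1)
    return consistent / len(rows)


def prune_cuts(rows, labels, cuts):
    """Greedily drop cut points whose removal does not lower consistency."""
    cuts = [sorted(c) for c in cuts]
    target = consistency(rows, labels, cuts)
    for j in range(len(cuts)):
        for cut in list(cuts[j]):  # iterate over a snapshot while mutating
            cuts[j].remove(cut)
            if consistency(rows, labels, cuts) < target:
                cuts[j].append(cut)  # the cut was needed, so restore it
                cuts[j].sort()
    return cuts


# Toy decision table: two continuous attributes, two decision classes.
rows = [(1.0, 5.0), (2.0, 5.5), (3.0, 1.0), (4.0, 1.5)]
labels = ["a", "a", "b", "b"]
initial_cuts = [[1.5, 2.5, 3.5], [3.0]]
reduced_cuts = prune_cuts(rows, labels, initial_cuts)
```

In this toy table the second attribute alone separates the classes, so the greedy pass keeps only its single cut. The result of a greedy pass depends on the order in which cuts are tried, and the paper may use a different reduction strategy.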