Mixed-attribute datasets, which combine numeric and categorical attributes, are the most common type of dataset in the real world, particularly in commercial and financial databases, yet very few clustering algorithms are suited to them. Building on the cluster ensemble methodology and the characteristics of mixed-attribute datasets, this paper proposes a cluster-ensemble-based mixed-attribute clustering algorithm (CEMC). It establishes the algorithm framework, states the objective function and the main steps of the algorithm, and analyzes its complexity. The algorithm can effectively handle massive mixed-attribute datasets. It is validated on real datasets and applied to data analysis in a real customer relationship management setting, where it achieved good results.