To address the complexity of telecom customer churn prediction, this paper combines the strengths of three techniques: the ability of the self-organizing neural network to discretize continuous attribute values, the attribute reduction capability of rough set theory, and the global stochastic search of the ant colony optimization algorithm. Building on model ensemble techniques and cost-sensitive learning theory, it proposes a new telecom customer churn prediction model: a cost-sensitive linear ensemble of multiple classifiers based on the ant colony algorithm. Construction of the ensemble model proceeds in four stages: (1) Discretization of continuous attribute values: a self-organizing neural network discretizes the continuous attribute values in an unsupervised manner. (2) Reduction of the original attribute set: rough set theory reduces the discretized attributes according to the principle of attribute importance. (3) Construction of sub-classifiers: four markedly different classification techniques, namely NaiveBayes, Logistic regression, multilayer perceptron, and decision tree, are each used to build a customer churn prediction sub-classifier on the reduced attribute set. (4) Ensemble of sub-classifiers: based on cost-sensitive learning theory, four different linear ensemble models are constructed, and the ant colony optimization algorithm is used to solve for the optimal linear combination weights of the ensemble model. The model is applied to customer churn prediction for a telecom operator, and the experimental results show that the ensemble method is feasible and effective.
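Stage (4) can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not give the cost matrix, the form of the four linear ensemble models, or the ant colony settings, so everything below is an assumption made for illustration: the cost values (`cost_fn`, `cost_fp`), the 0.5 decision threshold, the discretized weight grid, the pheromone update rule, and the names `ensemble_cost` and `aco_weights`. The sketch takes the churn probabilities produced by the four sub-classifiers and uses a simplified ant-colony-style search to find non-negative combination weights, summing to 1, that lower the total misclassification cost.

```python
import random


def ensemble_cost(weights, probs, labels, cost_fn=5.0, cost_fp=1.0, threshold=0.5):
    """Total misclassification cost of a weighted linear ensemble.

    probs[i][k] is the churn probability from sub-classifier k for customer i.
    cost_fn and cost_fp are assumed costs for a missed churner and a false alarm.
    """
    total = 0.0
    for row, y in zip(probs, labels):
        score = sum(w * p for w, p in zip(weights, row))
        pred = 1 if score >= threshold else 0
        if y == 1 and pred == 0:
            total += cost_fn  # churner predicted as loyal
        elif y == 0 and pred == 1:
            total += cost_fp  # loyal customer predicted as churner
    return total


def aco_weights(probs, labels, n_ants=20, n_iter=50, levels=11, rho=0.1, seed=0):
    """Simplified ant-colony-style search over a discretized weight grid."""
    rng = random.Random(seed)
    n_clf = len(probs[0])
    grid = [i / (levels - 1) for i in range(levels)]
    # One pheromone value per (sub-classifier, weight level) pair.
    tau = [[1.0] * levels for _ in range(n_clf)]
    # Start from equal weights so the result is never worse than that baseline.
    best_choice = [levels - 1] * n_clf
    best_w = [1.0 / n_clf] * n_clf
    best_cost = ensemble_cost(best_w, probs, labels)
    for _ in range(n_iter):
        for _ in range(n_ants):
            # Each ant picks one weight level per sub-classifier,
            # with probability proportional to the pheromone on that level.
            choice = [rng.choices(range(levels), weights=tau[k])[0]
                      for k in range(n_clf)]
            raw = [grid[c] for c in choice]
            total = sum(raw)
            if total == 0:
                continue  # all-zero weights cannot be normalized
            w = [r / total for r in raw]
            cost = ensemble_cost(w, probs, labels)
            if cost < best_cost:
                best_choice, best_w, best_cost = choice, w, cost
        # Evaporate pheromone, then reinforce the best solution found so far.
        for k in range(n_clf):
            tau[k] = [(1 - rho) * t for t in tau[k]]
            tau[k][best_choice[k]] += 1.0 / (1.0 + best_cost)
    return best_w, best_cost


# Toy data (made up): 6 customers, 4 sub-classifier churn probabilities each.
toy_probs = [
    [0.9, 0.4, 0.6, 0.2],
    [0.8, 0.3, 0.1, 0.4],
    [0.2, 0.7, 0.6, 0.5],
    [0.1, 0.6, 0.9, 0.3],
    [0.7, 0.2, 0.3, 0.1],
    [0.3, 0.8, 0.4, 0.6],
]
toy_labels = [1, 1, 0, 0, 1, 0]
weights, cost = aco_weights(toy_probs, toy_labels)
print(weights, cost)
```

In practice the probabilities would come from the NaiveBayes, Logistic regression, multilayer perceptron, and decision tree sub-classifiers trained on the reduced attribute set, and the weights would be selected on held-out data rather than on the data used for evaluation.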