论文部分内容阅读
多类分类问题是数据挖掘和机器学习领域中一个重要且正在进行研究的课题。最近对该问题提出了一种具有新型结构的K-SVCR方法。与其他方法相比较,此方法最大的优点在于在训练的过程中,能够利用训练数据的所有信息。然而,它又和“一对一”方法一样,对某一个K类分类问题,需要求解K(K-1)/2个二次规划问题,才能把一个模式指派到一个适当的类别中。因此建立一个快速有效的训练算法是非常重要的。在本文中,我们首先在K-SVCR方法的基础上提出了新的模型,然后把新模型转化成一个互补问题,并利用Lagrangian隐函数进一步转化成一个强凸的无约束优化问题。并且为它建立了一个快速地Newton算法。该算法具有全局收敛和有限步终止的性质。同时通过Sherman-Morriaon-Woodbury等式,将算法中需要处理的$ltimesl$矩阵(其中l是模式的总量)转变成$(n+1)times(n+1)$的矩阵(其中n是模式的维数)。对于很多多类分类问题,n远远小于l,这也说明可以有效地实现该算法。初步的实验结果表明该算法在分类的准确度和训练速度方面都有很好的表现。
The multi-class classification problem is an important and ongoing research topic in the field of data mining and machine learning. Recently, a K-SVCR method with a new structure has been proposed for this problem. Compared with other methods, the biggest advantage of this method is that all information of the training data can be utilized during the training. However, like the “one-to-one” approach, it is necessary to solve K (K-1) / 2 quadratic programming problems for a K-type classification problem in order to assign a pattern to a suitable category . Therefore, it is very important to establish a fast and effective training algorithm. In this paper, we first propose a new model based on the K-SVCR method, then transform the new model into a complementary problem, and use the Lagrangian implicit function to further transform into a strongly convex unconstrained optimization problem. And it built a fast Newton algorithm for it. The algorithm has the properties of global convergence and finite-step termination. Meanwhile, the $ l timesl $ matrix (where l is the total number of modes) to be processed in the algorithm is converted into $ (n + 1) times (n + 1) $ by the Sherman- Morriaon-Woodbury equation Matrix (where n is the number of dimensions of the pattern). For many multi-class classification problems, n is far less than l, which also shows that the algorithm can be effectively implemented. The preliminary experimental results show that the algorithm has a good performance in terms of classification accuracy and training speed.