Among the many dimensionality reduction methods, partial least squares (PLS) dimensionality reduction is a highly effective one and is widely used in areas such as the analysis of biological gene data. However, the question of which classification model to pair with PLS dimensionality reduction has often been overlooked in previous work, and researchers have largely chosen classifiers according to personal preference. To address this problem, this paper compares and analyzes the performance of a range of classification models on gene microarray datasets through extensive experiments. Using t-tests, we find that artificial neural networks, logistic discrimination, and linear support vector machines are three classification models that perform well on PLS-reduced data.
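The two ingredients named in the abstract, PLS dimensionality reduction and a t-test comparison of classifiers, can be illustrated with a minimal sketch. The abstract does not say which PLS variant or which form of t-test was used, so the code below assumes a single-response PLS (PLS1, NIPALS-style deflation) with class labels coded 0/1, and a paired t statistic over per-dataset accuracies. The function names `pls1_scores` and `paired_t` and the toy data are hypothetical, chosen only for illustration.

```python
import math
import statistics


def pls1_scores(X, y, n_components):
    """Project the rows of X onto n_components PLS1 score directions (NIPALS)."""
    n, p = len(X), len(X[0])
    # Center the features and the response.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    scores = []
    for _ in range(n_components):
        # Weight vector: the feature-space direction with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in X that covaries with y
        w = [v / norm for v in w]
        # Score vector: one new low-dimensional feature value per sample.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings, then deflate X and y so the next score vector is orthogonal.
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        scores.append(t)
    # Transpose so that each row is one sample in the reduced space.
    return [list(row) for row in zip(*scores)]


def paired_t(acc_a, acc_b):
    """Paired t statistic for two classifiers' accuracies on the same datasets."""
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


# Toy data, made up for illustration: 6 samples, 4 "gene" features, binary labels.
X_toy = [
    [2.0, 1.0, 0.5, 3.0],
    [1.8, 1.2, 0.4, 2.9],
    [2.2, 0.9, 0.7, 3.1],
    [0.5, 2.0, 1.5, 1.0],
    [0.4, 2.1, 1.4, 1.2],
    [0.6, 1.9, 1.6, 0.9],
]
y_toy = [1, 1, 1, 0, 0, 0]
Z_toy = pls1_scores(X_toy, y_toy, n_components=2)  # 6 x 2 reduced representation
```

In a pipeline of the kind the abstract describes, the rows of `Z_toy` would be the inputs to whichever classifier is being evaluated, and `paired_t` would be applied to the accuracy lists of two classifiers measured on the same collection of datasets. Real microarray data has thousands of features, so an array library would be the practical choice there; plain lists are used here only to keep the sketch dependency-free.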