With DNA microarray data classification as the application background, a GS+LQ sampling-based training method is proposed. Each sampling round follows the principle of "GS first, LQ as backup": the sample expected to yield the best classification performance after training is selected first for labeling and learning, and in rounds where GS fails, LQ serves as the backup method for choosing the sample to learn from. The method combines speed with stability. For the same target accuracy, it reduces training cost further than LQ sampling alone, and this advantage is most pronounced when the classifier starts with little classification experience. Experiments comparing GS+LQ with commonly used sampling methods such as ND and MRE show that the three methods differ little when pushing recognition ability "from high to higher," but GS+LQ is considerably more efficient when raising classifier accuracy "from low to high."
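The "GS first, LQ as backup" rule for a single sampling round can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the abstract does not define how GS or LQ score samples, nor what counts as GS failing, so both strategies are passed in as hypothetical callables, and a return value of `None` from `gs_select` is an assumed way of signalling that GS failed in that round. The names `gs_select`, `lq_select`, `label_and_train`, and `run_sampling` are all invented for this sketch.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical strategy signature: given the indices of unlabeled samples,
# return the index to label next, or None if the strategy cannot decide.
Selector = Callable[[Sequence[int]], Optional[int]]


def pick_sample(
    pool: Sequence[int], gs_select: Selector, lq_select: Selector
) -> Tuple[int, str]:
    """Choose one sample for this round: try GS first, fall back to LQ."""
    choice = gs_select(pool)
    if choice is not None:
        return choice, "GS"
    # Assumption: GS signals failure with None, and LQ always returns a sample.
    fallback = lq_select(pool)
    if fallback is None:
        raise ValueError("LQ fallback did not return a sample")
    return fallback, "LQ"


def run_sampling(
    pool: Sequence[int],
    gs_select: Selector,
    lq_select: Selector,
    label_and_train: Callable[[int], None],
    rounds: int,
) -> List[Tuple[int, str]]:
    """Run up to `rounds` sampling rounds and record which strategy picked each sample."""
    remaining = list(pool)
    history: List[Tuple[int, str]] = []
    for _ in range(rounds):
        if not remaining:
            break
        idx, source = pick_sample(remaining, gs_select, lq_select)
        remaining.remove(idx)   # the sample leaves the unlabeled pool
        label_and_train(idx)    # label it and update the classifier
        history.append((idx, source))
    return history
```

Keeping the two strategies as interchangeable callables means the fallback logic stays the same whatever scoring rules GS and LQ actually use. The recorded history also shows how often the backup was needed.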