MicroRNAs (miRNAs) are a class of non-coding, single-stranded small RNAs about 22 nucleotides (nt) long. By pairing completely or incompletely with the 3′-UTR of a target mRNA, they play an important role in post-transcriptional gene regulation. Correctly identifying true (positive) miRNA targets has become a bottleneck in the study of miRNA function. miRNA target prediction data sets are high-dimensional, nonlinear, and small. To handle such data, this paper proposes a redundant feature elimination algorithm based on υ-SVM that integrates pattern recognition with feature selection. The algorithm optimizes the feature set to construct the best combination of features for characterizing the miRNA-target interaction model. The prior parameter υ (0 < υ ≤ 1) controls the compression ratio of the data set and selects a discriminative set of support vectors, from which the miRNA target classifier with the best predictive performance is built. The prediction model is evaluated without bias on an independent test set. Experiments show that the proposed method achieves better classification and generalization performance than algorithms such as miTarget, NBmiRTar, and TargetMiner.
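
For reference, the role of the parameter υ with 0 < υ ≤ 1 can be read off the textbook form of the υ-SVM training problem (more commonly written ν-SVM). The abstract does not state the exact formulation used in the paper, so the following is only a sketch under that assumption; the symbols w, b, ξ_i, ρ, the feature map φ, and the number of training samples n are not in the original and are taken from the standard formulation.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\rho}\quad
  & \tfrac{1}{2}\lVert w\rVert^{2} \;-\; \upsilon\rho \;+\; \frac{1}{n}\sum_{i=1}^{n}\xi_i \\
\text{subject to}\quad
  & y_i\bigl(w^{\top}\varphi(x_i) + b\bigr) \;\ge\; \rho - \xi_i, \\
  & \xi_i \ge 0 \quad (i = 1,\dots,n), \qquad \rho \ge 0 .
\end{aligned}
```

In this formulation υ is an upper bound on the fraction of margin errors and a lower bound on the fraction of training samples that become support vectors. That is one plausible reading of the statement that υ controls the compression ratio of the data set: a smaller υ allows a smaller set of support vectors to be retained.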