This article studies information retrieval for Mongolian, taking the distinctive characteristics of the language into account. It first builds a Mongolian document test collection for evaluating retrieval performance and implements a Mongolian information retrieval system. Experiments then compare how several key techniques affect retrieval performance: the retrieval model, the smoothing algorithm, a Mongolian stop-word list, stemming, and pseudo-relevance feedback. The results show that a Mongolian information retrieval system achieves better retrieval performance with a structured language model, Dirichlet smoothing, a stop-word list, word stems as the retrieval unit, and pseudo-relevance feedback.
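To make the Dirichlet smoothing method named in the abstract concrete, the following is a minimal sketch of query-likelihood scoring with Dirichlet smoothing, where the smoothed term probability is (c(w,d) + mu * p(w|C)) / (|d| + mu). It is not the system described in the article: the function name `dirichlet_score`, the toy documents, the romanized tokens, and the value of `mu` are all assumptions chosen for illustration, and the documents are assumed to be already tokenized and stemmed.

```python
import math
from collections import Counter


def dirichlet_score(query_terms, doc_terms, collection_counts, collection_length, mu=2000.0):
    """Log query likelihood of a document under Dirichlet-smoothed unigram estimates.

    Hypothetical helper for illustration; mu is the smoothing parameter.
    """
    doc_counts = Counter(doc_terms)
    doc_length = len(doc_terms)
    score = 0.0
    for term in query_terms:
        # Background (collection) probability of the term.
        p_collection = collection_counts.get(term, 0) / collection_length
        if p_collection == 0.0:
            # Term never occurs in the collection; skip it to avoid log(0).
            continue
        # Dirichlet smoothing: mix document counts with the collection model.
        p_term = (doc_counts[term] + mu * p_collection) / (doc_length + mu)
        score += math.log(p_term)
    return score


# Toy collection (assumed data, already tokenized and stemmed).
docs = {
    "d1": ["mongol", "retrieval", "system", "retrieval"],
    "d2": ["mongol", "grammar", "suffix", "stem"],
}
collection_counts = Counter(t for terms in docs.values() for t in terms)
collection_length = sum(collection_counts.values())

query = ["retrieval", "system"]
ranking = sorted(
    docs,
    key=lambda d: dirichlet_score(query, docs[d], collection_counts, collection_length, mu=10.0),
    reverse=True,
)
print(ranking)
```

A small `mu` is used here only because the toy documents are a few tokens long; on a real collection the parameter would normally be tuned on the test set.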