The Harbin Institute of Technology Library developed the "Library Chinese Novelty Search Intelligent Deduplication System" (《图书馆中文查新智能去重系统》) in Java. The system uses a field-based exact topic matching method. Because records collected from different data sources may differ slightly, it measures those differences with a sentence similarity method and subjects high-similarity results to a second round of checking and consolidation to produce the final deduplicated result. Novelty searchers only need to import the results exported from each database into the system to deduplicate identical documents, and they can then export the documents in the format required by the report templates of different novelty search stations. This saves as much of the searchers' document-processing time as possible, so that their limited search time can go into comparative analysis of the literature, which in turn improves the quality of the novelty search.
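The two-stage idea described in the abstract (exact matching on a field, then a similarity measure for records that differ slightly) can be illustrated with a minimal Java sketch. It is not the system's actual code: the class name `DedupSketch`, the choice of the title as the matched field, the normalization rule, the use of normalized Levenshtein distance as a stand-in for the unspecified sentence similarity method, and the 0.9 threshold are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; names and parameters are illustrative assumptions.
final class DedupSketch {

    /** Lowercases a title and keeps only letters and digits (assumed normalization). */
    static String normalize(String title) {
        return title.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{N}]", "");
    }

    /** Classic Levenshtein edit distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]: 1 minus edit distance divided by the longer normalized length. */
    static double similarity(String a, String b) {
        String x = normalize(a), y = normalize(b);
        int max = Math.max(x.length(), y.length());
        return max == 0 ? 1.0 : 1.0 - (double) editDistance(x, y) / max;
    }

    /**
     * Stage 1: drops records whose normalized title exactly matches one already kept.
     * Stage 2: keeps near-duplicates (similarity >= threshold) but records the pair
     * in needsReview so that it can be checked a second time.
     */
    static List<String> deduplicate(List<String> titles, double threshold,
                                    List<String[]> needsReview) {
        Map<String, String> unique = new LinkedHashMap<>();
        for (String title : titles) {
            String key = normalize(title);
            if (unique.containsKey(key)) continue; // exact field match: duplicate
            for (String kept : unique.values()) {
                if (similarity(kept, title) >= threshold) {
                    needsReview.add(new String[] {kept, title});
                }
            }
            unique.put(key, title);
        }
        return new ArrayList<>(unique.values());
    }

    public static void main(String[] args) {
        List<String[]> review = new ArrayList<>();
        List<String> unique = deduplicate(Arrays.asList(
                "Deep Learning for Image Recognition",
                "deep learning for image recognition.",
                "Deep Learning for Image Recognitions",
                "Library Novelty Search Quality"), 0.9, review);
        System.out.println("Kept: " + unique);
        for (String[] pair : review) {
            System.out.println("Review: " + pair[0] + " <-> " + pair[1]);
        }
    }
}
```

In this sketch the second title is dropped as an exact match after normalization, while the third is kept but flagged for review because it differs from the first by a single character. A real system would likely compare several fields (authors, year, source) and use a similarity measure suited to Chinese text.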