【Objective】 To provide a method and a reference for sci-tech periodicals seeking to automatically extract more comprehensive metadata. 【Method】 Taking Founder typesetting files as the object, a mathematical model of metadata extraction was established and a tail segmentation algorithm was proposed. An automatic metadata extraction program was then written in object-based VB. 【Result】 After the characteristics of the Founder typesetting language were analyzed, string replacement was applied to the Founder typesetting files and a file listing the segmentation keywords was created; finally, the extracted metadata were saved to an Excel file. 【Conclusion】 Practical application shows that the metadata for one issue can be extracted in only a few seconds, which greatly improves work efficiency.
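The pipeline named in the abstract (string replacement, a segmentation keyword list, segmentation from the tail, tabular output) can be illustrated with a minimal Python sketch. The abstract does not define the tail segmentation algorithm, so the version below is only one plausible reading: it searches for the last keyword first and cuts fields off the end of the text. The replacement table, the markers `〖HT〗` and `↙`, the keyword list, and the function names `normalize`, `split_from_tail`, and `to_csv` are all hypothetical placeholders rather than the authors' VB program or actual Founder commands, and CSV stands in for the Excel output to keep the example self-contained.

```python
import csv
import io

# Hypothetical replacement table: markup fragments mapped to plain text.
# These are illustrative placeholders, not a list of real Founder commands.
REPLACEMENTS = [("〖HT〗", ""), ("↙", "\n")]

# Hypothetical segmentation keyword list: (marker in the text, field name),
# given in the order the markers are expected to appear in an article.
SPLIT_KEYWORDS = [("Abstract:", "abstract"), ("Keywords:", "keywords")]


def normalize(text, replacements=REPLACEMENTS):
    """Apply each string replacement in turn to strip or convert markup."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text


def split_from_tail(text, keywords=SPLIT_KEYWORDS):
    """Cut fields off the end of the text, last keyword first.

    Assumed reading of "tail segmentation": find the last occurrence of
    the final marker, take everything after it as that field, then repeat
    on the shortened remainder with the previous marker.
    """
    fields = {}
    remaining = text
    for marker, name in reversed(keywords):
        pos = remaining.rfind(marker)
        if pos == -1:
            continue  # marker absent: leave this field out
        fields[name] = remaining[pos + len(marker):].strip()
        remaining = remaining[:pos]
    fields["head"] = remaining.strip()  # whatever precedes the first marker
    return fields


def to_csv(records, fieldnames):
    """Serialize extracted records as CSV text (stand-in for Excel output)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


sample = "〖HT〗Title of paper↙Abstract: Some text.↙Keywords: a; b"
record = split_from_tail(normalize(sample))
table = to_csv([record], ["head", "abstract", "keywords"])
```

Working from the tail means that a marker word appearing earlier in the body text does not prematurely end a field, which may be why a tail-first order is attractive for trailing metadata blocks; this rationale is an inference, not a claim made in the abstract.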