This paper describes ASPEC (Asian Scientific Paper Excerpt Corpus) in detail. ASPEC is the first large-scale parallel corpus in the scientific paper domain. It was constructed by the Japanese-Chinese machine translation project, which ran from 2006 to 2010 and was financed by the Special Coordination Funds for Promoting Science and Technology. The corpus has two parts: a Japanese-English scientific paper abstract corpus of approximately 3 million parallel sentences (ASPEC-JE) and a Chinese-Japanese scientific paper excerpt corpus of approximately 680,000 parallel sentences (ASPEC-JC). ASPEC is used as the official dataset of WAT (Workshop on Asian Translation), a machine translation evaluation workshop.