Extracting term translation pairs greatly helps the translation of Chinese historical classics, since translating terms is the most time-consuming and challenging part of that work. However, terms are hard to recognize directly in ancient Chinese because its syntax is flexible, and word segmentation errors in ancient Chinese introduce further errors into term translation extraction. Considering that most ancient Chinese terms are preserved in modern Chinese, where they are easier to identify, we propose a multi-feature term translation extraction method built on a character-based model, which extracts historical term translation pairs from modern Chinese-English corpora instead of ancient Chinese-English corpora. Specifically, we first employ a character-based BiLSTM-CRF model to identify historical terms in modern Chinese without word segmentation, which prevents segmentation errors from propagating to term alignment. We then extract English terms according to initial-capitalization rules. Finally, we align the English and Chinese terms based on co-occurrence frequency and a transliteration feature. Experiments on Shiji demonstrate that the proposed method far outperforms the traditional method, which confirms the effectiveness of using modern Chinese as a substitute.
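The English-side extraction and the co-occurrence part of the alignment can be illustrated with a minimal Python sketch. It is not the paper's implementation: the regular expression, the stop list, the function names `extract_english_terms` and `align_terms`, and the use of the Dice coefficient as the co-occurrence score are all assumptions made for illustration. The Chinese terms are assumed to have been identified already (for example by the BiLSTM-CRF tagger), and the transliteration feature is left out.

```python
import re
from collections import Counter

# Hypothetical stop list: capitalized function words that often open a
# sentence but are not terms. The abstract does not give the actual rules.
STOPWORDS = {"The", "A", "An", "In", "He", "She", "They", "When", "After"}

# A run of one or more capitalized words, e.g. "Xiang Yu".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def extract_english_terms(sentence):
    """Return capitalized word runs, with leading stop-listed words removed."""
    terms = []
    for run in CAPITALIZED_RUN.findall(sentence):
        words = run.split()
        while words and words[0] in STOPWORDS:
            words.pop(0)
        if words:
            terms.append(" ".join(words))
    return terms


def align_terms(sentence_pairs):
    """Pick, for each Chinese term, the English term with the highest Dice score.

    sentence_pairs: iterable of (chinese_terms, english_terms), one entry per
    aligned sentence pair. Dice is an assumed stand-in for the paper's
    co-occurrence feature.
    """
    zh_count, en_count, pair_count = Counter(), Counter(), Counter()
    for zh_terms, en_terms in sentence_pairs:
        zh_set, en_set = set(zh_terms), set(en_terms)
        zh_count.update(zh_set)
        en_count.update(en_set)
        pair_count.update((zh, en) for zh in zh_set for en in en_set)

    best = {}
    for (zh, en), together in pair_count.items():
        dice = 2 * together / (zh_count[zh] + en_count[en])
        if zh not in best or dice > best[zh][1]:
            best[zh] = (en, dice)
    return {zh: en for zh, (en, _) in best.items()}


# Toy input for illustration only; not data from the paper.
toy_pairs = [
    (["项羽", "秦"], extract_english_terms("Xiang Yu attacked the army of Qin.")),
    (["项羽"], ["Xiang Yu"]),
    (["秦"], ["Qin"]),
]
alignment = align_terms(toy_pairs)
```

On the toy input, each Chinese term co-occurs most consistently with its own translation, so the Dice score separates the correct pair from the incidental one. A fuller version would combine this score with a transliteration similarity, as the abstract describes.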