Term Translation Extraction from Historical Classics Using Modern Chinese Explanation

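Source: The 17th China National Conference on Computational Linguistics and the 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018)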
  Extracting term translation pairs is of great help for translating Chinese historical classics, since term translation is the most time-consuming and challenging part of that work. However, it is difficult to recognize terms directly in ancient Chinese because of its flexible syntax, and word segmentation errors in ancient Chinese lead to further errors in term translation extraction. Considering that most terms in ancient Chinese are preserved in modern Chinese, and that terms in modern Chinese are easier to identify, we propose a term translation extraction method that uses multiple features on top of a character-based model to extract historical term translation pairs from modern Chinese-English corpora instead of ancient Chinese-English corpora. Specifically, we first employ a character-based BiLSTM-CRF model to identify historical terms in modern Chinese without word segmentation, which prevents word segmentation errors from propagating to term alignment. Then we extract English terms according to initial capitalization rules. Finally, we align the English and Chinese terms based on co-occurrence frequency and a transliteration feature. Experiments on Shiji demonstrate that the proposed method far outperforms the traditional method, which confirms the effectiveness of using modern Chinese as a substitute.
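The last two steps of the pipeline described above (capitalization-based English term extraction and co-occurrence-based alignment) can be illustrated with a minimal sketch. This is not the authors' implementation: the regular expression, the `STOP` list, the Dice coefficient as the co-occurrence score, the function names, and the toy sentence pairs are all assumptions made for illustration. The Chinese terms are assumed to have already been identified by a tagger such as the BiLSTM-CRF model, and the transliteration feature is omitted.

```python
import re
from collections import Counter

# Assumed rule: a candidate English term is a run of capitalized words.
CAP_RUN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")
# Hypothetical stoplist of capitalized words that only open a sentence.
STOP = {"The", "Then", "Later", "He", "She", "It", "In", "At"}


def extract_english_terms(sentence):
    """Return the set of capitalized word runs, minus leading stop words."""
    terms = set()
    for run in CAP_RUN.findall(sentence):
        words = run.split()
        while words and words[0] in STOP:
            words.pop(0)
        if words:
            terms.add(" ".join(words))
    return terms


def align_terms(pairs):
    """Map each Chinese term to its best English term by Dice score.

    `pairs` holds (chinese_terms, english_sentence) tuples, where the
    Chinese terms are assumed to come from an upstream term tagger.
    """
    zh_count, en_count, cooc = Counter(), Counter(), Counter()
    for zh_terms, en_sentence in pairs:
        zh_set = set(zh_terms)
        en_set = extract_english_terms(en_sentence)
        zh_count.update(zh_set)
        en_count.update(en_set)
        cooc.update((z, e) for z in zh_set for e in en_set)

    best = {}
    for (z, e), n in cooc.items():
        # Dice coefficient: shared sentences relative to total occurrences.
        score = 2 * n / (zh_count[z] + en_count[e])
        if z not in best or score > best[z][1]:
            best[z] = (e, score)
    return best


# Toy aligned data, invented for illustration only.
pairs = [
    (["项羽", "刘邦"], "Xiang Yu attacked Liu Bang at dawn."),
    (["项羽"], "Then Xiang Yu retreated to the east."),
    (["刘邦"], "Later Liu Bang entered the pass."),
]
alignment = align_terms(pairs)
```

On this toy data, each Chinese name co-occurs with its own translation in two sentences and with the other name in only one, so the Dice score separates the correct pair from the incorrect one. A real system would need a more careful capitalization rule and would combine this score with the transliteration feature mentioned in the abstract.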