Term Translation Extraction from Historical Classics Using Modern Chinese Explanation

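Source: The 17th China National Conference on Computational Linguistics and the 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018)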

【Abstract】: Extracting term translation pairs is of great help for Chinese historical classics translation since term translation is the most time-consuming and challenging part in the translation of historical

【Authors】: Xiaoting Wu, Hanyu Zhao, Chao Che

【Affiliation】: Dalian University, Key Laboratory of Advanced Design and Intelligent Computing, Ministry of Education

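【Venue】: The 17th China National Conference on Computational Linguistics and the 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018)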

【Publication date】: 2018, Issue 9

【Keywords】: BiLSTM-CRF; Co-occurrence frequency; Transliteration features; Term translation ext


Excerpt from the paper

Extracting term translation pairs is of great help for translating Chinese historical classics, since term translation is the most time-consuming and challenging part of that work. However, it is difficult to recognize terms directly in ancient Chinese because of its flexible syntax, and word segmentation errors in ancient Chinese lead to further errors in term translation extraction. Considering that most terms in ancient Chinese are preserved in modern Chinese, and that terms in modern Chinese are easier to identify, we propose a multi-feature term translation extraction method built on a character-based model, which extracts historical term translation pairs from modern Chinese-English corpora instead of ancient Chinese-English corpora. Specifically, we first employ a character-based BiLSTM-CRF model to identify historical terms in modern Chinese without word segmentation, which prevents word segmentation errors from propagating to term alignment. Then we extract English terms according to initial-capitalization rules. Finally, we align the English and Chinese terms based on co-occurrence frequency and transliteration features. Experiments on Shiji demonstrate that the proposed method performs far better than the traditional method, which confirms the effectiveness of using modern Chinese as a substitute.
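The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the BiLSTM-CRF tagger has already produced per-character B/I/O tags (the tagging scheme is an assumption), uses the Dice coefficient as one possible co-occurrence measure, and approximates the transliteration feature with string similarity against a tiny hand-made pinyin table. All function names, the stop list, the equal weighting, and the toy sentences are hypothetical.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

# Toy character-to-pinyin table (hypothetical; a real system needs a full lexicon).
TOY_PINYIN = {"项": "xiang", "羽": "yu", "刘": "liu", "邦": "bang"}
# Capitalized words that are not terms (illustrative stop list, an assumption).
STOP = {"The", "He", "She", "It", "They", "Then", "When"}
# A run of one or more capitalized words separated by single spaces.
CAP_RUN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def terms_from_bio(chars, tags):
    """Turn per-character B/I/O tags (assumed tagger output) into term strings."""
    terms, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "B":
            if current:
                terms.append(current)
            current = ch
        elif tag == "I" and current:
            current += ch
        else:
            if current:
                terms.append(current)
            current = ""
    if current:
        terms.append(current)
    return terms


def english_candidates(sentence):
    """Extract English term candidates by an initial-capitalization rule."""
    return [m for m in CAP_RUN.findall(sentence) if m not in STOP]


def cooccurrence_scores(pairs):
    """Dice coefficient for every (Chinese term, English term) seen in one sentence pair."""
    zh_count, en_count, joint = Counter(), Counter(), Counter()
    for zh_terms, en_sentence in pairs:
        zh_set, en_set = set(zh_terms), set(english_candidates(en_sentence))
        zh_count.update(zh_set)
        en_count.update(en_set)
        joint.update((z, e) for z in zh_set for e in en_set)
    return {(z, e): 2 * c / (zh_count[z] + en_count[e]) for (z, e), c in joint.items()}


def transliteration_score(zh_term, en_term):
    """Similarity between the toy pinyin of a Chinese term and the English spelling."""
    pinyin = "".join(TOY_PINYIN.get(ch, "") for ch in zh_term)
    if not pinyin:
        return 0.0
    return SequenceMatcher(None, pinyin, en_term.replace(" ", "").lower()).ratio()


def align(pairs, weight=0.5):
    """Pick, for each Chinese term, the English candidate with the best combined score."""
    best = {}
    for (zh, en), dice in cooccurrence_scores(pairs).items():
        score = weight * dice + (1 - weight) * transliteration_score(zh, en)
        if zh not in best or score > best[zh][1]:
            best[zh] = (en, score)
    return {zh: en for zh, (en, _) in best.items()}


# Toy sentence pairs (invented for illustration, not taken from the paper's corpus).
pairs = [
    (terms_from_bio("项羽攻刘邦", ["B", "I", "O", "B", "I"]),
     "Xiang Yu attacked the army of Liu Bang."),
    (["刘邦"], "Liu Bang entered the pass."),
]
print(align(pairs))
```

Combining the two scores with a fixed weight is only one plausible design. The abstract says both features are used but does not state how they are combined.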

Other papers

Identifying Word Translations in Scientific Literature based on Labeled Bilingual Topic Model and Co

Aiming at the increasingly rich multilingual information resources and multi-label data in scientific literature, and in order to mine the relevance and correlation across languages, this paper proposed the

Conference paper

Topic Model; Label; Co-occurrence Features; Word Translations

An End-to-End Entity and Relation Extraction Network with Multi-head Attention

Relation extraction is an important semantic processing task in natural language processing. The state-of-the-art systems usually rely on elaborately designed features, which are usually time-consuming

Conference paper

Relation Extraction; End-to-End Joint Extraction; Named Entity Recognition

Learning to Detect Verbose Expressions in Spoken Texts

The analysis and understanding of spoken texts is an important task in artificial intelligence and natural language processing. However, there are many verbose expressions (such as mantras, nonsense, mod

Conference paper

Trigger Words Detection by Integrating Attention Mechanism into Bi-LSTM Neural Network | A Case stud

A Bi-LSTM based encode/decode mechanism for named entity recognition was studied in this research. In the proposed mechanism, Bi-LSTM was used for encoding, an Attention method was used in the intermedia

Conference paper

natural language processing; LSTM; encoder/decoder model; trigger words

Attention-Based Convolutional Neural Networks for Chinese Relation Extraction

Relation extraction is an important part of many information extraction systems that mine structured facts from texts. Recently, deep learning has achieved good results in relation extraction. Attentio

Conference paper

Relation Extraction; Convolutional Neural Networks; Attention Mechanism

A Hierarchical Hybrid Neural Network Architecture for Chinese Text Summarization

Using sequence-to-sequence models for abstractive text summarization is generally plagued by three problems: inability to deal with out-of-vocabulary words, repetition in summaries and time-consuming

Conference paper

Abstractive Text Summarization; Hierarchical Attention Mechanism; Pointer Mechanis

TSABCNN: Two-Stage Attention-Based Convolutional Neural Network for Frame Identification

As an essential sub-task of frame-semantic parsing, Frame Identification (FI) is a fundamentally important research topic in shallow semantic parsing. However, most existing work is based on sophisticate

Conference paper

Frame Identification; FrameNet; Convolutional Neural Network

Revisiting Correlations between Intrinsic and Extrinsic Evaluations of Word Embeddings

The evaluation of word embeddings has received a considerable amount of attention in recent years, but there have been some debates about whether intrinsic measures can predict the performance of downs

Conference paper

word embedding; intrinsic evaluation; extrinsic evaluation

End-to-end Task-Oriented Dialogue System with Distantly Supervised Knowledge Base Retriever

Task-oriented dialog systems usually face the challenge of querying a knowledge base. However, it usually cannot be explicitly modeled due to the lack of annotation. In this paper, we introduce an explicit

Conference paper

task-oriented dialog systems; sequence-to-sequence; Knowledge Base

Research on Chinese-Tibetan Neural Machine Translation

At present, research on Tibetan machine translation is mainly focused on Tibetan-Chinese machine translation, and research on Chinese-Tibetan machine translation is almost blank. In this paper, t

Conference paper

Neural Machine Translation; Tibetan; Syntactic tree; Attention
