Collaborative Recognition and Recovery of the Chinese Intercept Abbreviation

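Source: The 16th China National Conference on Computational Linguistics and the 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data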
Excerpt from the paper
  One important task in Information Content Security is evaluating the theme words of a text. The variety of Chinese expression, especially of abbreviations, makes the supervision of theme words harder. The goal of this paper is to quickly and accurately discover intercept abbreviations in text crawled over a short time period. The paper first segments the target texts and then uses a Support Vector Machine (SVM) to recognize abbreviations in the wrongly segmented texts as candidates. Second, it presents collaborative methods: an improved Conditional Random Field (CRF) model predicts the word corresponding to each character of the abbreviation; and, to solve the problem of the 1:n relationship, the ranking list from the prediction step is merged with the matched results from a thesaurus of abbreviations. The experiments show that the method achieves 76.5% accuracy and 77.8% recall at the recognition stage. At the recovery stage, the accuracy is 62.1%, which is 20.8% higher than that of the method based on the Hidden Markov Model (HMM).
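The merging step described above can be illustrated with a minimal sketch. The abstract does not specify how the two sources are combined, so everything here is an assumption: the function name `merge_candidates`, the additive `boost` for thesaurus hits, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def merge_candidates(
    predicted: List[Tuple[str, float]],
    thesaurus_matches: List[str],
    boost: float = 0.5,
) -> List[Tuple[str, float]]:
    """Merge a model-ranked list of expansions with thesaurus matches.

    Hypothetical scheme (not the paper's): every candidate keeps its best
    model score, and each expansion that also appears in the thesaurus
    receives a fixed additive boost before re-ranking.
    """
    scores: Dict[str, float] = {}
    # Keep the highest model score seen for each candidate expansion.
    for expansion, score in predicted:
        scores[expansion] = max(score, scores.get(expansion, 0.0))
    # Thesaurus hits are boosted; hits the model missed start from 0.0.
    for expansion in thesaurus_matches:
        scores[expansion] = scores.get(expansion, 0.0) + boost
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative, made-up data: two candidate expansions for one abbreviation.
ranked = merge_candidates(
    predicted=[("北方大学", 0.5), ("北京大学", 0.25)],
    thesaurus_matches=["北京大学"],
)
print(ranked[0][0])
```

In this toy input the thesaurus match moves the lower-scored prediction to the top of the list, which is one plausible way a 1:n ambiguity could be resolved; a weighted or rank-based combination would be an equally reasonable reading of the abstract.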
Other papers
This paper investigates relations between word semantic density and word frequency. A distributed-representation-based word average similarity is defined as the measure of word semantic density. We f
This paper presents a character-level encoder-decoder modeling method for question answering (QA) from large-scale knowledge bases (KB). This method improves the existing approach [9] from three aspects.
Deep networks have been widely used in many domains in recent years. However, the pre-training of deep networks is time consuming with the greedy layer-wise algorithm, and the scalability of this algorithm is
Current research on emotion detection focuses on recognizing explicit emotion expressions in text. In this paper, we propose an approach based on textual inference to detect implicit emotion expressi
We have presented a simple algorithm for noun phrase interpretation based on a hand-crafted knowledge base containing detailed semantic information. The main idea is to define a set of relations that can
Neural attention-based models have been widely used recently in headline generation by mapping a source document to a target headline. However, the traditional neural headline generation models utilize the
As a minority language, Tibetan has received relatively little attention in the field of natural language processing (NLP), especially in current various neural network models. In this paper, we investig
Collocation extraction plays an important role in machine translation, information retrieval, second language learning, etc., and has obtained significant achievements in other languages, e.g. English a
Recently, image captioning, which aims to generate a textual description for an image automatically, has attracted researchers from various fields. Encouraging performance has been achieved by applying deep
This paper puts forward the theme analysis problem in order to automatically solve composition writing questions in the Chinese college entrance examination. Theme analysis is to distill the embedded seman