Obtaining bilingual parallel data from multilingual websites is a long-standing research problem, and such data is highly beneficial for resource-scarce languages. In this paper, we present an approach for obtaining parallel data based on word embeddings; our model relies only on a small-scale bilingual lexicon. Our approach benefits from recent advances in continuous word representations, which can reveal more context information than traditional methods. Our experiments show that high-precision and sizable parallel Uyghur-Chinese data can be obtained even when bilingual lexicon resources are limited.
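As an illustration of how word embeddings and a small seed lexicon might be combined to score candidate sentence pairs, the following minimal sketch maps source tokens to target words through the lexicon, averages the target-side word vectors, and compares sentences by cosine similarity. This is an assumption for illustration only, not the model described in the paper: the function names, the averaging scheme, the toy vectors, and the placeholder source tokens (`s_cat`, `s_eats`, `s_fish`) are all hypothetical.

```python
import math


def average(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_vector(tokens, embeddings, lexicon=None):
    """Average the embeddings of the tokens that can be looked up.

    If a lexicon is given, each token is first translated through it
    (hypothetical seed lexicon: source word -> target word); tokens
    missing from the lexicon or the embedding table are skipped.
    """
    vectors = []
    for token in tokens:
        word = lexicon.get(token) if lexicon is not None else token
        if word in embeddings:
            vectors.append(embeddings[word])
    return average(vectors) if vectors else None


def score_pair(src_tokens, tgt_tokens, lexicon, tgt_embeddings):
    """Similarity of a candidate sentence pair in the target embedding space."""
    src_vec = sentence_vector(src_tokens, tgt_embeddings, lexicon)
    tgt_vec = sentence_vector(tgt_tokens, tgt_embeddings)
    if src_vec is None or tgt_vec is None:
        return 0.0
    return cosine(src_vec, tgt_vec)


# Toy data (made up for illustration; real embeddings would be trained
# on monolingual text and have hundreds of dimensions).
tgt_embeddings = {
    "cat": [1.0, 0.0],
    "eats": [0.0, 1.0],
    "fish": [0.5, 0.5],
    "car": [-1.0, 0.0],
}
seed_lexicon = {"s_cat": "cat", "s_eats": "eats", "s_fish": "fish"}

source_sentence = ["s_cat", "s_eats", "s_fish"]
good_score = score_pair(source_sentence, ["cat", "eats", "fish"],
                        seed_lexicon, tgt_embeddings)
bad_score = score_pair(source_sentence, ["car"],
                       seed_lexicon, tgt_embeddings)
```

In such a setup, candidate pairs whose score exceeds a chosen threshold would be kept as parallel data; the threshold itself would need to be tuned on held-out pairs.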