We introduce a monolingual query method that uses additional webpage data to improve translation quality, in response to the growing demand for statistical machine translation output that is fit for official use. The motivation is that replacing a translated sentence with the most closely related human-written sentence improves its readability once and for all. Based on vector space representations of the translated sentences, we query a search engine for additional reference text. We then rank all translated sentences and make the necessary replacements from the query results. We evaluate several vector representations of sentences: TF-IDF, latent semantic indexing, and neural network word embeddings. The experimental results show that the method is an alternative way to enhance current machine translation, with an improvement of about 0.5 BLEU on the French-to-English task and 0.7 BLEU on the English-to-Chinese task.
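The rank-and-replace step can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only TF-IDF (one of the representations named above), it substitutes an in-memory list `retrieved` for real search engine results, and the function names, the smoothed IDF formula, and the `threshold` value are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Naive whitespace tokenizer (assumption); a real system would use a proper one.
    return sentence.lower().split()


def tfidf_vectors(sentences):
    """Build sparse TF-IDF vectors (dicts mapping term -> weight)."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (assumption) so that very common terms keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def replace_with_retrieved(translations, retrieved, threshold=0.6):
    """Swap each translation for its most similar retrieved sentence
    when the similarity reaches `threshold` (hypothetical cutoff)."""
    vecs = tfidf_vectors(translations + retrieved)
    t_vecs = vecs[:len(translations)]
    r_vecs = vecs[len(translations):]
    output = []
    for sent, tv in zip(translations, t_vecs):
        scores = [cosine(tv, rv) for rv in r_vecs]
        best = max(range(len(retrieved)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            output.append(retrieved[best])  # human-written replacement
        else:
            output.append(sent)  # no close match: keep the MT output
    return output


mt_output = ["the cat sit on the mat", "stock prices fell sharply today"]
web_text = ["the cat sat on the mat", "the weather is nice"]
print(replace_with_retrieved(mt_output, web_text))
```

In this toy run the first sentence has a close human-written match and is replaced, while the second has no lexical overlap with the retrieved text and is kept. The threshold controls the trade-off: a lower value replaces more sentences at the risk of changing their meaning.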