Word translation disambiguation using bilingual bootstrapping

Cited by: 24
Authors
Li, H [1]
Li, C [1]
Affiliation
[1] Microsoft Res Asia, Sigma Ctr 5F, Beijing, Peoples R China
DOI
10.1162/089120104773633367
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article proposes a new method for word translation disambiguation, one that uses a machine-learning technique called bilingual bootstrapping. In learning to disambiguate words to be translated, bilingual bootstrapping makes use of a small amount of classified data and a large amount of unclassified data in both the source and the target languages. It repeatedly constructs classifiers in the two languages in parallel and boosts the performance of the classifiers by classifying unclassified data in the two languages and by exchanging information regarding classified data between the two languages. Experimental results indicate that word translation disambiguation based on bilingual bootstrapping consistently and significantly outperforms existing methods that are based on monolingual bootstrapping.
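The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-count classifier, the `margin > 0` confidence rule, the shared sense labels, the toy dictionaries, and all function names below are assumptions made purely for illustration. The sketch only shows the general shape of the idea, that each language's classifier is trained on its own classified data plus classified data from the other language mapped through a translation dictionary, and that newly classified examples feed the next round.

```python
from collections import Counter

def train(labeled, other_labeled, translate):
    """Count context words per label from this language's classified data,
    plus the other language's classified data mapped through `translate`."""
    counts = {}
    for words, label in labeled:
        counts.setdefault(label, Counter()).update(words)
    for words, label in other_labeled:
        counts.setdefault(label, Counter()).update(
            translate[w] for w in words if w in translate)
    return counts

def classify(counts, words):
    """Return (best_label, margin over the runner-up) using summed word counts."""
    if not counts:
        return None, 0
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    best_label, best_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return best_label, best_score - runner_up

def bilingual_bootstrap(lab_a, unl_a, lab_b, unl_b, a_to_b, b_to_a, rounds=5):
    """Toy bootstrapping loop over two languages, A and B."""
    lab_a, unl_a = list(lab_a), list(unl_a)
    lab_b, unl_b = list(lab_b), list(unl_b)
    for _ in range(rounds):
        # Build both classifiers in parallel, each borrowing the other side's data.
        clf_a = train(lab_a, lab_b, b_to_a)
        clf_b = train(lab_b, lab_a, a_to_b)
        moved = False
        for clf, lab, unl in ((clf_a, lab_a, unl_a), (clf_b, lab_b, unl_b)):
            for words in list(unl):
                label, margin = classify(clf, words)
                if margin > 0:  # assumed confidence rule: a strict winner
                    lab.append((words, label))
                    unl.remove(words)
                    moved = True
        if not moved:
            break
    return lab_a, lab_b

# Hypothetical toy data: English "plant" with two candidate translations.
lab_a = [(("workers", "machine"), "gongchang"), (("leaf", "water"), "zhiwu")]
unl_a = [("machine", "steel"), ("soil", "leaf"), ("output", "export")]
lab_b = [(("chanliang", "chukou"), "gongchang"), (("turang", "yangguang"), "zhiwu")]
unl_b = [("yezi", "yangguang")]
a_to_b = {"output": "chanliang", "export": "chukou", "soil": "turang",
          "sunlight": "yangguang", "leaf": "yezi"}
b_to_a = {b: a for a, b in a_to_b.items()}

bi_a, bi_b = bilingual_bootstrap(lab_a, unl_a, lab_b, unl_b, a_to_b, b_to_a)
mono_a, _ = bilingual_bootstrap(lab_a, unl_a, [], [], {}, {})
print(dict(bi_a).get(("output", "export")))
print(dict(mono_a).get(("output", "export")))
```

In this toy data, the context `("output", "export")` shares no words with the classified data of language A, so the monolingual run (empty second language and empty dictionaries) has nothing to classify it with, while the bilingual run can borrow evidence from language B through the dictionary.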
Pages: 1-22
Number of pages: 22