Using metrics from complex networks to evaluate machine translation

Cited by: 37
Authors
Amancio, D. R. [1 ]
Nunes, M. G. V. [2 ]
Oliveira, O. N., Jr. [1 ]
Pardo, T. A. S. [2 ]
Antiqueira, L. [1 ]
Costa, L. da F. [1 ]
Affiliations
[1] Univ Sao Paulo, Inst Phys Sao Carlos, BR-13560970 Sao Paulo, Brazil
[2] Univ Sao Paulo, Inst Math & Comp Sci, BR-13560970 Sao Paulo, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
Machine translation; Evaluation; Complex networks; Machine learning;
DOI
10.1016/j.physa.2010.08.052
Chinese Library Classification number
O4 [Physics];
Discipline classification code
0702;
Abstract
Establishing metrics to assess machine translation (MT) systems automatically is now crucial owing to the widespread use of MT over the web. In this study we show that such evaluation can be done by modeling text as complex networks. Specifically, we extend our previous work by employing additional metrics of complex networks, whose results were used as input for machine learning methods and allowed MT texts of distinct qualities to be distinguished. Also shown is that the node-to-node mapping between source and target texts (English-Portuguese and Spanish-Portuguese pairs) can be improved by adding further hierarchical levels for the metrics out-degree, in-degree, hierarchical common degree, cluster coefficient, inter-ring degree, intra-ring degree and convergence ratio. The results presented here amount to a proof-of-principle that the possible capturing of a wider context with the hierarchical levels may be combined with machine learning methods to yield an approach for assessing the quality of MT systems. (C) 2010 Elsevier B.V. All rights reserved.
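As an illustration of the network modeling the abstract describes, the following is a minimal sketch of how a text could be turned into a directed word-adjacency network and how three of the listed metrics (out-degree, in-degree and cluster coefficient) might be computed per node. The abstract does not specify tokenization, preprocessing or the exact metric definitions, so the whitespace tokenizer, the function names `build_adjacency_network`, `clustering_coefficient` and `node_metrics`, and the directed clustering formula used here are all assumptions. Only the first hierarchical level is covered; the further hierarchical levels and the machine learning step are not reproduced.

```python
from collections import defaultdict


def build_adjacency_network(text):
    """Build a directed word-adjacency network (assumed model).

    An edge a -> b is created for every pair of consecutive words.
    Tokenization is plain lowercase whitespace splitting (an assumption).
    """
    words = text.lower().split()
    succ = defaultdict(set)  # node -> set of distinct successors
    pred = defaultdict(set)  # node -> set of distinct predecessors
    for a, b in zip(words, words[1:]):
        if a != b:  # ignore self-loops
            succ[a].add(b)
            pred[b].add(a)
    return set(words), succ, pred


def clustering_coefficient(node, succ, pred):
    """Fraction of possible directed links among the node's neighbours
    that actually exist (one common definition, assumed here)."""
    neighbours = (succ[node] | pred[node]) - {node}
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u in neighbours for v in succ[u] if v in neighbours)
    return links / (k * (k - 1))


def node_metrics(text):
    """Return {word: (out_degree, in_degree, clustering_coefficient)}."""
    nodes, succ, pred = build_adjacency_network(text)
    return {
        n: (len(succ[n]), len(pred[n]), clustering_coefficient(n, succ, pred))
        for n in nodes
    }


if __name__ == "__main__":
    metrics = node_metrics("the cat saw the dog the cat ran")
    for word, values in sorted(metrics.items()):
        print(word, values)
```

In a pipeline of the kind the abstract outlines, per-node values like these would be computed for a source text and its translation and then passed as features to a classifier; how the features are aggregated or compared is not stated in this record.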
Pages: 131-142
Number of pages: 12