Why table ground-truthing is hard

Cited by: 17

Authors:

Hu, JY ^{[1]}

Kashi, R ^{[1]}

Lopresti, D ^{[1]}

Nagy, G ^{[1]}

Wilfong, G ^{[1]}

Affiliation:

[1] Avaya Inc, Avaya Labs, Murray Hill, NJ 07974 USA

Source:

SIXTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, PROCEEDINGS | 2001

Keywords:

DOI:

10.1109/ICDAR.2001.953768

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

The principle that for every document analysis task there exists a mechanism for creating well-defined ground-truth is a widely held tenet. Past experience with standard datasets providing ground-truth for character recognition and page segmentation tasks supports this belief. In the process of attempting to evaluate several table recognition algorithms we have been developing, however, we have uncovered a number of serious hurdles connected with the ground-truthing of tables. This problem may, in fact, be much more difficult than it appears. We present a detailed analysis of why table ground-truthing is so hard, including the notions that there may exist more than one acceptable "truth" and/or incomplete or partial "truths."

Citation

Pages: 129-133

Page count: 3