Computer recognition of printed Bangla script

Cited by: 9
Authors
Pal, U. [1]
Chaudhuri, B.B. [1]
Affiliations
[1] Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B. T. Road, Calcutta 700 035, India
Keywords
Template matching; Image segmentation
DOI
10.1080/00207729508929157
Abstract
This paper considers optical character recognition (OCR) of Bangla, the second most popular script in the Indian subcontinent. A complete OCR system is described for documents in a single Bangla font, where more than three hundred character shapes are recognized by a combination of template-matching and feature-matching approaches. The document image, captured by a flatbed scanner, is subjected to tilt correction; line, word and character segmentation; simple and compound character separation; feature extraction; and finally character recognition. Some character occurrence statistics have been computed to aid the recognition process. Simple character recognition is done by a feature-based tree classifier, while compound character recognition uses a template-matching approach preceded by feature-based grouping. At present, the system achieves a recognition accuracy of about 96%. © 1995, Copyright Taylor & Francis Group, LLC.
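The abstract names line segmentation as one pipeline stage but does not say how it is performed. As an illustration only, the sketch below uses a horizontal projection profile, a common technique that is assumed here and not confirmed by this record. The function name `segment_lines` and the binary list-of-rows image format (1 = ink, 0 = background) are hypothetical.

```python
from typing import List, Tuple


def segment_lines(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Split a binary page image into text lines.

    Assumption: rows containing no ink pixels separate consecutive lines.
    Returns (start_row, end_row) pairs, with end_row exclusive.
    """
    lines: List[Tuple[int, int]] = []
    start = None  # first row of the line currently being scanned, if any
    for r, row in enumerate(image):
        has_ink = any(row)  # projection profile reduced to "any ink in row"
        if has_ink and start is None:
            start = r  # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, r))  # a blank row closes the line
            start = None
    if start is not None:
        lines.append((start, len(image)))  # line runs to the bottom edge
    return lines


# Toy page: two "lines" of ink separated by one blank row.
page = [
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
]
print(segment_lines(page))
```

The same idea applied to column sums within a line band would give a rough word split, though real scans would likely need a noise threshold instead of a strict zero test.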
Pages: 2107-2123