Computer recognition of printed Bangla script

Cited by: 9

Authors:

Pal, U. [1]

Chaudhuri, B.B. [1]

Affiliation:

[1] Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B. T. Road, Calcutta 700 035, India

Source:

International Journal of Systems Science | 1995 / Vol. 26 / No. 11

Keywords:

Template matching; Image segmentation

DOI:

10.1080/00207729508929157

Abstract:

This paper considers optical character recognition (OCR) of Bangla, the second most popular script in the Indian subcontinent. A complete OCR system is described for documents in a single Bangla font, where more than three hundred character shapes are recognized by a combination of template-matching and feature-matching approaches. Here the document image captured by a flatbed scanner is subjected to tilt correction, line, word and character segmentation, simple and compound character separation, feature extraction and finally character recognition. Some character occurrence statistics have been computed to aid the recognition process. Simple character recognition is done by a feature-based tree classifier, and compound character recognition involves a template-matching approach preceded by a feature-based grouping. At present, the system achieves a recognition accuracy of about 96%. © 1995 Taylor & Francis Group, LLC.

Pages: 2107-2123