Gradient-based learning applied to document recognition

Cited by: 35695

Authors:

LeCun, Y ^{[1]}

Bottou, L

Bengio, Y

Haffner, P

Affiliations:

[1] AT&T Bell Labs, Res, Speech & Image Proc Serv Res Lab, Red Bank, NJ 07701 USA

[2] Univ Montreal, Dept Informat & Rech Operat, Montreal, PQ H3C 3J7, Canada

Source:

PROCEEDINGS OF THE IEEE | 1998 / Vol. 86 / No. 11

Keywords:

convolutional neural networks; document recognition; finite state transducers; gradient-based learning; graph transformer networks; machine learning; neural networks; optical character recognition (OCR)

DOI:

10.1109/5.726791

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of two-dimensional (2-D) shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTNs), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training and the flexibility of graph transformer networks. A graph transformer network for reading a bank check is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.


Pages: 2278-2324

Page count: 47

