Synthetic Data for Text Localisation in Natural Images

Cited by: 947
Authors
Gupta, Ankush [1]
Vedaldi, Andrea [1]
Zisserman, Andrew [1]
Affiliations
[1] Univ Oxford, Dept Engn Sci, Oxford, England
Source
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
DOI
10.1109/CVPR.2016.254
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In this paper we introduce a new method for text detection in natural images. The method comprises two contributions: First, a fast and scalable engine to generate synthetic images of text in clutter. This engine overlays synthetic text onto existing background images in a natural way, accounting for the local 3D scene geometry. Second, we use the synthetic images to train a Fully-Convolutional Regression Network (FCRN) which efficiently performs text detection and bounding-box regression at all locations and multiple scales in an image. We discuss the relation of FCRN to the recently-introduced YOLO detector, as well as other end-to-end object detection systems based on deep learning. The resulting detection network significantly outperforms current methods for text detection in natural images, achieving an F-measure of 84.2% on the standard ICDAR 2013 benchmark. Furthermore, it can process 15 images per second on a GPU.
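The dense regression described in the abstract can be illustrated by its decoding step: the network emits a box prediction and a text-confidence score at every grid cell, and confident cells are converted to image-space boxes. The sketch below is a minimal illustration of that idea, not the paper's exact parameterisation — the grid-cell size of 16, the 5-channel `(x, y, w, h, confidence)` layout, and the threshold are all assumptions for the example.

```python
import numpy as np

def decode_dense_predictions(pred, cell=16, conf_thresh=0.5):
    """Decode a dense prediction map into bounding boxes.

    pred: array of shape (H, W, 5); per grid cell it holds
    (x, y) centre offsets in cell units, (w, h) box size in
    pixels, and a text-confidence score. All of these layout
    choices are illustrative assumptions.
    """
    H, W, _ = pred.shape
    boxes = []
    for i in range(H):
        for j in range(W):
            x, y, w, h, c = pred[i, j]
            if c < conf_thresh:
                continue  # suppress low-confidence cells
            # Box centre: cell origin plus the predicted offset,
            # scaled to pixels by the grid-cell size.
            cx = (j + x) * cell
            cy = (i + y) * cell
            boxes.append((cx - w / 2, cy - h / 2,
                          cx + w / 2, cy + h / 2, c))
    return boxes
```

Because every cell regresses a box in a single forward pass, detection cost is independent of the number of text instances, which is what enables the 15 images-per-second throughput quoted above.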
Pages: 2315-2324
Page count: 10