Distributed deep learning platform for pedestrian detection on IT convergence environment

Cited by: 5
Authors
Han, Seong-Soo [1]
Kim, Yoon-Ki [2]
Jeon, You-Boo [3]
Park, JinSoo [3]
Park, Doo-Soon [3]
Hwang, DuHyun [2]
Jeong, Chang-Sung [2]
Affiliations
[1] Korea Univ, Visual Informat Proc, Seoul, South Korea
[2] Korea Univ, Dept Elect Engn, Seoul, South Korea
[3] Soonchunhyang Univ, Dept Comp Software Engn, Asan 31538, Chungcheongnam, South Korea
Funding
National Research Foundation, Singapore;
Keywords
IT convergence; Pedestrian detection; Faster R-CNN; Deep learning; Distribution processing; Parallel processing; Virtual machine;
DOI
10.1007/s11227-020-03195-0
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Information technology (IT) has recently been combined with traditional industries, giving rise to IT convergence technology in various fields. Through convergence with the automobile, pedestrian detection in particular is used in the autonomous navigation control of autonomous vehicles, and it is also applied in fields such as intelligent CCTV and robot recognition. Early pedestrian detection relied on hierarchical classification and feature vectors, and deep learning approaches are now under active development. However, because deep learning for pedestrian detection must process a large volume of image data, it is time-consuming and requires substantial computing resources, which makes building such a system very expensive. In this paper, we therefore present a distributed deep learning platform that makes it easy to build a cluster and run the deep learning process in a distributed cloud environment, while improving performance in several ways. Through a multilayered system architecture, the platform provides a convenient interface for executing the deep learning process easily and efficiently in a distributed environment. The system builds and uses computing power easily and efficiently by leveraging container technology, that is, OS-level virtualization, rather than traditional hypervisor-based virtualization. We improve overall performance by exploiting data parallelism and parameter parallelism at the same time, and we reduce synchronization overhead by using asynchronous communication for parameter updates. We also propose an efficient resource allocation scheme for parameter servers and slaves which, according to our experiments, can improve performance.
Pages: 5460-5485
Number of pages: 26