Face detection in the operating room: comparison of state-of-the-art methods and a self-supervised approach

Cited by: 8
Authors
Issenhuth, Thibaut [1]
Srivastav, Vinkle [1]
Gangi, Afshin [2]
Padoy, Nicolas [1]
Affiliations
[1] Univ Strasbourg, CNRS, ICube, IHU Strasbourg, Strasbourg, France
[2] Univ Hosp Strasbourg, Dept Radiol, Strasbourg, France
Keywords
Face detection; Semi-supervised learning; MVOR-Faces dataset; Visual domain adaptation; Operating room
DOI
10.1007/s11548-019-01944-y
Chinese Library Classification number
R318 [Biomedical engineering]
Discipline classification code
100103 [Pathogen biology]
Abstract
Purpose: Face detection is a necessary component for the automatic analysis and assistance of human activities during surgical procedures. Efficient face detection algorithms can help to detect and identify the persons present in the room and can also be used to automatically anonymize the data. However, current algorithms trained on natural images do not generalize well to operating room (OR) images. In this work, we provide a comparison of state-of-the-art face detectors on OR data and also present an approach to train a face detector for the OR by exploiting non-annotated OR images.

Methods: We compare six state-of-the-art face detectors on clinical data using Multi-View OR Faces (MVOR-Faces), a dataset of OR images capturing real surgical activities. We then propose to use self-supervision, a domain adaptation method, for the task of face detection in the OR. The approach uses non-annotated images to fine-tune a state-of-the-art detector for the OR without any human supervision.

Results: The best model, namely the tiny face detector, yields an average precision of 0.556 at an intersection over union of 0.5. Our self-supervised model using non-annotated clinical data outperforms this result by 9.2%.

Conclusion: We present the first comparison of state-of-the-art face detectors on OR images and show that results can be significantly improved by using self-supervision on non-annotated data.
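
The results above are reported as average precision at an intersection over union (IoU) of 0.5. As a reading aid, the following is a minimal sketch of how such an IoU threshold is commonly applied to a single predicted box and a single ground-truth box. It is not the authors' evaluation code: the function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed (for this sketch) to be (x1, y1, x2, y2) tuples
    with x1 <= x2 and y1 <= y2.
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so that disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A detection counts as correct when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

Average precision then summarizes the precision-recall curve obtained by ranking all detections by confidence and applying this matching rule; that aggregation step is omitted here.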
Pages: 1049-1058
Page count: 10