Efficient attention-based deep encoder and decoder for automatic crack segmentation

Citations: 194

Authors:

Kang, Dong H. [1]

Cha, Young-Jin [1]

Affiliations:

[1] Univ Manitoba, Winnipeg, MB, Canada

Source:

STRUCTURAL HEALTH MONITORING-AN INTERNATIONAL JOURNAL | 2022 / Vol. 21 / Issue 05

Funding:

Natural Sciences and Engineering Research Council of Canada

Keywords:

Image segmentation; image analysis; concrete crack segmentation; image synthesis; pixel-level classification; real-time processing; computer vision; damage detection; deep learning; semantic segmentation; 3D ASPHALT SURFACES; DAMAGE DETECTION; NEURAL-NETWORKS;

DOI:

10.1177/14759217211053776

Chinese Library Classification:

T [Industrial Technology]

Discipline classification code:

08

Abstract:

Recently, crack segmentation has been studied using deep convolutional neural networks. However, significant deficiencies remain in the preparation of ground truth data, consideration of complex scenes, development of an object-specific network for crack segmentation, and use of an evaluation method, among other issues. In this paper, a novel semantic transformer representation network (STRNet) is developed for real-time, pixel-level crack segmentation in complex scenes. STRNet is composed of a squeeze-and-excitation attention-based encoder, a multi-head attention-based decoder, coarse upsampling, a focal-Tversky loss function, and a learnable swish activation function, which together keep the network concise while preserving its fast processing speed. A method for evaluating the level of complexity of image scenes is also proposed. The proposed network is trained with 1203 images with further extensive synthesis-based augmentation, and it is evaluated with 545 testing images (1280 x 720, 1024 x 512); it achieves 91.7%, 92.7%, 92.2%, and 92.6% in terms of precision, recall, F1 score, and mIoU (mean intersection over union), respectively. Its performance is compared with those of recently developed advanced networks (Attention U-net, CrackSegNet, Deeplab V3+, FPHBN, and Unet++), with STRNet showing the best performance in the evaluation metrics; it also achieves the fastest processing at 49.2 frames per second.
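
The focal-Tversky loss and the evaluation metrics named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the values alpha = 0.7, beta = 0.3, and gamma = 4/3 are assumed for illustration (the abstract gives no hyperparameters), the exponent 1/gamma follows one common formulation of the focal Tversky loss, and mIoU is assumed here to be the mean of the crack-class and background-class IoU.

```python
import numpy as np


def tversky_index(prob, target, alpha=0.7, beta=0.3, eps=1e-7):
    """Soft Tversky index; alpha weights false negatives, beta false positives.

    alpha and beta are illustrative assumptions, not values from the record.
    """
    prob = np.asarray(prob, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    tp = np.sum(prob * target)           # soft true positives
    fn = np.sum((1.0 - prob) * target)   # soft false negatives
    fp = np.sum(prob * (1.0 - target))   # soft false positives
    return (tp + eps) / (tp + alpha * fn + beta * fp + eps)


def focal_tversky_loss(prob, target, alpha=0.7, beta=0.3, gamma=4.0 / 3.0):
    """Focal Tversky loss in the form (1 - TI) ** (1 / gamma) (assumed form)."""
    ti = tversky_index(prob, target, alpha, beta)
    return max(0.0, 1.0 - ti) ** (1.0 / gamma)


def _safe_div(num, den):
    # Return 0.0 instead of dividing by zero when a class is absent.
    return float(num) / float(den) if den else 0.0


def segmentation_metrics(pred_mask, true_mask):
    """Precision, recall, F1, and mIoU for binary (crack / background) masks."""
    pred = np.asarray(pred_mask, dtype=bool).ravel()
    true = np.asarray(true_mask, dtype=bool).ravel()
    tp = np.sum(pred & true)
    fp = np.sum(pred & ~true)
    fn = np.sum(~pred & true)
    tn = np.sum(~pred & ~true)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2.0 * precision * recall, precision + recall)
    iou_crack = _safe_div(tp, tp + fp + fn)
    iou_background = _safe_div(tn, tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "miou": (iou_crack + iou_background) / 2.0,  # assumed two-class mean
    }


if __name__ == "__main__":
    true = np.array([[1, 1, 0, 0]])
    pred = np.array([[1, 0, 1, 0]])
    print(segmentation_metrics(pred, true))
    print(focal_tversky_loss(pred, true))
```

With alpha larger than beta, missed crack pixels are penalized more heavily than false alarms, which is one reason a Tversky-style loss is often chosen when the foreground class is thin and rare.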


Pages: 2190-2205

Page count: 16

