Fire detection based on lightweight convolutional neural network

Journal of Shandong University (Engineering Science), 2020, Vol. 50, Issue (2): 100-107. doi: 10.6040/j.issn.1672-3961.0.2019.424

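Yunyang YAN¹,²,³, Chenxi DU¹,², Yian LIU², Shangbing GAO¹

1. Faculty of Computer & Software Engineering, Huaiyin Institute of Technology, Huaian 223003, Jiangsu, China
2. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
3. School of Computer Engineering, Jiangsu Ocean University, Lianyungang 222005, Jiangsu, China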

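Received: 2019-07-25 Online: 2020-04-20 Published: 2020-04-16
About the author: Yunyang YAN (1967-), male, native of Huaian, Jiangsu; professor, Ph.D., CCF member. His main research interests are digital image processing and pattern recognition. E-mail: yunyang@hyit.edu.cn
Supported by:
National Natural Science Foundation of China (61402192); Jiangsu Province "Six Talent Peaks" Project (2013DZXX-023); Jiangsu Province "Qing Lan Project"; Huaian "533 Talents Project"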

Abstract:

A lightweight flame detection method based on MobileNet is proposed. A dilated convolution block (DCB), built from depthwise separable convolution and dilated convolution, enlarges the receptive field of the features and strengthens their semantic information, which improves the detection rate for flames in video. The SSD (Single Shot MultiBox Detector) detection framework is also optimized, yielding a lightweight detection model, DMSSD (Dilated MobileNet-SSD). In experiments on the PASCAL VOC dataset and the Bilkent University VisiFire dataset, the mean average precision increased by 1.7 and 3.8 percentage points, respectively, and the detection speed reached 80 frames per second, indicating strong robustness and real-time performance.

Key words: fire detection, MobileNet, dilated convolution, channel shuffle, DCB
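
The two building blocks named in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' DCB; it only shows the standard formulas for the effective size of a dilated kernel and for the multiply-accumulate (MAC) cost of a depthwise separable convolution compared with a standard one. The feature-map size and channel counts are hypothetical values chosen for illustration.

```python
def effective_kernel(k, d):
    """Effective side length of a k x k kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)


def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs of a standard k x k convolution producing an h x w x c_out map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs of a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


# ASSUMPTION: a 19 x 19 feature map with 512 input and output channels.
H, W, C_IN, C_OUT, K = 19, 19, 512, 512, 3

std = standard_conv_macs(H, W, C_IN, C_OUT, K)
sep = depthwise_separable_macs(H, W, C_IN, C_OUT, K)
ratio = sep / std  # equals 1 / C_OUT + 1 / K**2

for d in (1, 2, 3):
    print(f"dilation {d}: 3x3 kernel covers {effective_kernel(3, d)} pixels per side")
print(f"depthwise separable / standard MACs = {ratio:.3f}")
```

Dilation widens the area a kernel covers without adding weights or MACs, which is why it pairs naturally with a depthwise separable backbone when the goal is a larger receptive field at low cost.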

CLC number: TP391

Yunyang YAN, Chenxi DU, Yian LIU, Shangbing GAO. Fire detection based on lightweight convolutional neural network[J]. Journal of Shandong University (Engineering Science), 2020, 50(2): 100-107.

References

1	HOWARD A G, ZHU Menglong, CHEN Bo, et al. Mobilenets: efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv: 1704.04861, 2017. https://arxiv.org/abs/1704.04861.
2	CHENG Yu, WANG Duo, ZHOU Pan, et al. A survey of model compression and acceleration for deep neural networks[J]. arXiv preprint arXiv: 1710.09282, 2017. https://arxiv.org/abs/1710.09282.
3	HAN Song, MAO Huizi, DALLY W J. Deep compression: compressing deep neural networks with pruning, trained quantization and huffman coding[J]. arXiv preprint arXiv: 1510.00149, 2015. https://arxiv.org/abs/1510.00149.
4	LUO Ping, ZHU Zhenyao, LIU Ziwei, et al. Face model compression by distilling knowledge from neurons[C]//Proceedings of 30th AAAI Conference on Artificial Intelligence. Menlo Park, USA: Association for the Advancement of Artificial Intelligence, 2016: 3560-3566.
5	JADERBERG M, VEDALDI A, ZISSERMAN A. Speeding up convolutional neural networks with low rank expansions[J]. arXiv preprint arXiv: 1405.3866, 2014. https://arxiv.org/abs/1405.3866.
6	SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE Computer Society, 2016: 2818-2826.
7	CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE Computer Society, 2017: 1800-1807.
8	ZHANG Xiangyu, ZHOU Xinyu, LIN Mengxiao, et al. Shufflenet: an extremely efficient convolutional neural network for mobile devices[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE Computer Society, 2018: 6848-6856.
9	CHEN Liangchieh, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[J]. arXiv preprint arXiv: 1706.05587, 2017. https://arxiv.org/abs/1706.05587.
10	DUMOULIN V, VISIN F. A guide to convolution arithmetic for deep learning[J]. arXiv preprint arXiv: 1603.07285, 2016. https://arxiv.org/abs/1603.07285.
11	WEI S E, RAMAKRISHNA V, KANADE T, et al. Convolutional pose machines[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE Computer Society, 2016: 4724-4732.
12	CAO Xudong. A practical theory for designing very deep convolutional neural network[R/OL].[2018-08-01]. https://www.kaggle.com/c/datasciencebowl/discussion/13166.
13	YU Fisher, KOLTUN V. Multi-scale context aggregation by dilated convolutions[J]. arXiv preprint arXiv: 1511.07122, 2015. https://arxiv.org/abs/1511.07122.
14	WANG Panqu, CHEN Pengfei, YUAN Ye, et al. Understanding convolution for semantic segmentation[C]//2018 IEEE Winter Conference on Applications of Computer Vision (WACV). Nevada, USA: IEEE Computer Society, 2018: 1451-1460.
15	CHEN Liangchieh, ZHU Yukun, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[J]. arXiv preprint arXiv: 1802.02611, 2018. https://arxiv.org/abs/1802.02611.
16	REDMON J, FARHADI A. Yolo9000: better, faster, stronger[C]//Computer Vision and Pattern Recognition. Washington, USA: IEEE Computer Society, 2017: 6517-6525.
17	SHEN Zhiqiang, LIU Zhuang, LI Jianguo, et al. Dsod: learning deeply supervised object detectors from scratch[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE Computer Society, 2017: 1919-1927.
18	WANG R J, LI Xiang, LING C X. Pelee: a real-time object detection system on mobile devices[C]//Advances in Neural Information Processing Systems. Montreal, Canada: Neural Information Processing Systems Foundation, Inc., 2018: 1963-1972.
19	LI Yuxi, LI Jiuwei, LIN Weiyao, et al. Tiny-dsod: lightweight object detection for resource-restricted usages[J]. arXiv preprint arXiv: 1807.11013, 2018. https://arxiv.org/abs/1807.11013.
20	ZOPH B, VASUDEVAN V, SHLENS J, et al. Learning transferable architectures for scalable image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE Computer Society, 2018: 8697-8710.

Layer	Input size	Kernel size	Stride	Output size	Receptive field
Conv5	38×38	1	1	38×38	43
Conv11	19×19	1	1	19×19	219
Conv13	10×10	1	1	10×10	315
Conv14-2	10×10	3	2	5×5	379
Conv15-2	5×5	3	2	3×3	507
Conv16-2	3×3	3	2	2×2	763
Conv17-2	2×2	3	2	1×1	1 275
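
The receptive-field column above is consistent with the usual recurrence r_out = r_in + (k - 1) * j, where j is the cumulative stride of the layer's input. The sketch below applies that recurrence to an assumed layer chain: the 3×3 layers of a standard MobileNet-v1 backbone (Conv0 to Conv13) followed by four stride-2 3×3 SSD extra layers. The stride list is an assumption based on the common MobileNet-SSD configuration, not something stated in this table.

```python
def receptive_fields(layers):
    """Return {name: receptive field} for a chain of (name, kernel, stride) layers."""
    r, j = 1, 1  # receptive field and cumulative stride at the input image
    out = {}
    for name, k, s in layers:
        r += (k - 1) * j  # a k x k kernel spans (k - 1) extra steps of size j
        j *= s            # the stride widens the step seen by later layers
        out[name] = r
    return out


# ASSUMPTION: strides of the 3x3 layers Conv0..Conv13 in a standard MobileNet-v1.
# 1x1 (pointwise) convolutions are omitted: they change neither r nor j.
STRIDES = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]
LAYERS = [(f"Conv{i}", 3, s) for i, s in enumerate(STRIDES)]
# ASSUMPTION: SSD extra layers Conv14-2..Conv17-2 are 3x3 with stride 2.
LAYERS += [(f"Conv{i}-2", 3, 2) for i in range(14, 18)]

rf = receptive_fields(LAYERS)
for name in ("Conv5", "Conv11", "Conv13", "Conv14-2", "Conv15-2", "Conv16-2", "Conv17-2"):
    print(name, rf[name])
```

Under these assumptions the computed values match every row of the table, which suggests the listed receptive fields were derived the same way.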

Method	Remove Conv16	Compound dilated convolution	Channel shuffle	Residual connection	mAP/%
MobileNet					72.7
Control group 1	√				73.0
Control group 2	√	√			74.0
Control group 3	√	√	√		74.2
DMSSD	√	√	√	√	74.4
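
Channel shuffle, one of the components in the ablation above, is the operation introduced by ShuffleNet [8]: channels are viewed as a (groups, n) grid, the grid is transposed, and the result is flattened so that each group's output mixes into the next layer's groups. The minimal sketch below works on a plain list of channel indices; how DMSSD wires the operation into its block is not specified here, so this shows only the generic reordering.

```python
def channel_shuffle(channels, groups):
    """Reorder channels: view as (groups, n), transpose to (n, groups), flatten."""
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    n = len(channels) // groups
    return [channels[g * n + i] for i in range(n) for g in range(groups)]


# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
shuffled = channel_shuffle(list(range(6)), groups=2)
print(shuffled)  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, every consecutive pair of channels contains one channel from each original group, so a following grouped convolution sees information from both.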

Method	MACs/10⁹	Parameters/10⁶	GPU inference/ms	mAP/%
Tiny-YOLO^[16]	3.49	15.86	6.85	57.1
DSOD_smallest^[17]	5.29	5.90	18.74	73.6
Pelee^[18]	1.21	5.43	14.17	76.4
MobileNet-SSD^[1]	1.16	5.77	5.92	72.7
DMSSD	1.25	5.76	6.50	74.4

Method	Aeroplane	Bicycle	Bird	Boat	Bottle	Bus	Car	Cat	Chair	Cow	Table	Dog	Horse	Motorbike	Person	Plant	Sheep	Sofa	Train	TV
MobileNet-SSD^[1]	73.9	82.4	71.1	61.2	39.1	82.6	80.2	88.2	53.8	67.8	78.4	80.8	87.9	85.6	76.5	43.4	65.0	79.4	86.7	69.6
DMSSD	74.7	83.8	71.3	59.7	40.8	84.6	81.0	88.6	56.8	72.6	77.9	83.2	89.2	86.5	77.3	45.5	71.8	79.7	87.8	75.6
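
As a quick consistency check, the mAP values reported for the two models on PASCAL VOC (72.7 and 74.4) can be recomputed as the unweighted mean of the 20 per-class AP values above. The sketch below does that and also ranks the per-class changes; the English class names are a translation of the column headers in their original order.

```python
from statistics import mean

CLASSES = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
           "chair", "cow", "table", "dog", "horse", "motorbike", "person",
           "plant", "sheep", "sofa", "train", "tv"]

# Per-class AP (%) copied from the table above.
AP = {
    "MobileNet-SSD": [73.9, 82.4, 71.1, 61.2, 39.1, 82.6, 80.2, 88.2, 53.8, 67.8,
                      78.4, 80.8, 87.9, 85.6, 76.5, 43.4, 65.0, 79.4, 86.7, 69.6],
    "DMSSD": [74.7, 83.8, 71.3, 59.7, 40.8, 84.6, 81.0, 88.6, 56.8, 72.6,
              77.9, 83.2, 89.2, 86.5, 77.3, 45.5, 71.8, 79.7, 87.8, 75.6],
}

# mAP is the unweighted mean of the per-class APs.
map_scores = {model: round(mean(values), 1) for model, values in AP.items()}

# Per-class change of DMSSD relative to MobileNet-SSD, largest gain first.
gains = sorted(
    ((round(d - b, 1), c) for c, b, d in zip(CLASSES, AP["MobileNet-SSD"], AP["DMSSD"])),
    reverse=True,
)

print(map_scores)
print("largest gains:", gains[:3])
print("largest drops:", gains[-2:])
```

The largest improvements fall on sheep, TV and cow, while boat and table are the only classes that get slightly worse.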

Category	Training set	Test set	Total
Images	5 224	2 582	7 806
Objects	10 920	4 130	15 050

Model	MACs/10⁹	Parameters/10⁶	FPS/(frames·s^-1)	mAP/%
MobileNet-SSD^[1]	1.13	5.52	84	74.3
DMSSD	1.22	5.49	80	78.1

Video	Total frames	MobileNet-SSD^[1]		Tiny-YOLO^[16]		Pelee^[18]		Tiny-DSOD^[19]		DMSSD
Video	Total frames	TP/%	FP/%	TP/%	FP/%	TP/%	FP/%	TP/%	FP/%	TP/%	FP/%
Video1	200	96.0	4.0	94.5	5.5	98.5	1.5	98.0	2.0	97.5	2.5
Video2	216	95.8	4.2	96.8	3.2	98.1	1.9	97.7	2.3	97.2	2.8
Video3	439	90.7	9.3	84.1	15.9	93.2	6.8	91.8	8.2	92.5	7.5
Video4	170	94.7	5.3	97.1	2.9	97.6	2.4	95.3	4.7	96.5	3.5
Video5	595	92.9	7.1	84.5	15.5	96.1	3.9	92.4	7.6	95.0	5.0
Video6	470	92.1	7.9	72.3	27.7	94.7	5.3	91.5	8.8	91.9	8.1
Average	348.3	93.7	6.3	88.2	11.8	96.4	3.6	94.5	5.5	95.1	4.9
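
The last row of the table appears to be an unweighted mean over the six videos. The sketch below reproduces it for the DMSSD TP column and, for comparison, computes a frame-weighted mean, which gives longer videos more influence. The frame-weighted figure is an illustrative alternative, not a value reported in the table.

```python
from statistics import mean

# Total frames and DMSSD TP rate (%) per video, copied from the table above.
FRAMES = [200, 216, 439, 170, 595, 470]
DMSSD_TP = [97.5, 97.2, 92.5, 96.5, 95.0, 91.9]

# Unweighted mean: every video counts equally, regardless of length.
unweighted_tp = mean(DMSSD_TP)

# Frame-weighted mean: every frame counts equally.
weighted_tp = sum(f * tp for f, tp in zip(FRAMES, DMSSD_TP)) / sum(FRAMES)

print(f"mean frames per video: {mean(FRAMES):.1f}")
print(f"unweighted TP: {unweighted_tp:.1f}%")
print(f"frame-weighted TP: {weighted_tp:.1f}%")
```

The two averages differ because the longest videos (Video3, Video5, Video6) are also among the harder ones, so weighting by frames pulls the rate down slightly.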