
Journal of Shandong University (Engineering Science) [山东大学学报 (工学版)], 2019, Vol. 49 (5): 98-104. DOI: 10.6040/j.issn.1672-3961.0.2018.348

• Machine Learning and Data Mining •

Object detection of 3D point clouds based on F-PointNet

Peng WAN

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China
  • Received: 2018-08-14    Online: 2019-10-20    Published: 2019-10-18
  • About the author: WAN Peng (1995—), male, from Fuzhou, Jiangxi, China; master's student; research interests: intelligent computing and systems. E-mail: 18205151102@163.com

Abstract:

To address the limited detection accuracy of current 3D point cloud object detection models, the F-PointNet model, which processes point cloud data directly, was used to detect cars, pedestrians and cyclists, and the model was fine-tuned to further improve its detection accuracy. The model was tested with different parameter initialization methods, $\ell_2$ regularization, and modified numbers of convolution kernels. The experimental results showed that Xavier initialization converged 0.09 s faster than truncated normal initialization, while car and cyclist detection accuracy were about 3% and 2% higher, respectively. Adding $\ell_2$ regularization improved pedestrian and cyclist detection accuracy by about 2% and 1%, respectively. Reducing the number of convolution kernels in the first convolutional layer of T-Net (Transformer Network) to 128 improved car and cyclist detection accuracy by about 1% and 2%, respectively, confirming that the fine-tuned model can effectively improve object detection accuracy.

Key words: deep learning, 3D point cloud, object detection, detection accuracy, F-PointNet model
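The abstract compares Xavier initialization against truncated-normal initialization and adds an $\ell_2$ (weight decay) penalty, but no code accompanies this page. The following PyTorch sketch is only a minimal illustration of how such settings could be wired up: TinyPointMLP, init_xavier and init_truncated_normal are hypothetical names, the layer widths are not taken from the paper, and the weight-decay value 5e-4 simply mirrors the smallest coefficient tried in Table 2.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in network (NOT the real F-PointNet), just enough
# structure to show where initialization and regularization apply.
class TinyPointMLP(nn.Module):
    def __init__(self, in_channels=3, num_classes=3):
        super().__init__()
        # PointNet-style per-point MLPs implemented as 1x1 convolutions
        self.conv1 = nn.Conv1d(in_channels, 64, kernel_size=1)
        self.conv2 = nn.Conv1d(64, 128, kernel_size=1)
        self.fc = nn.Linear(128, num_classes)

    def forward(self, x):                       # x: (batch, 3, num_points)
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = torch.max(x, dim=2).values          # symmetric max pooling over points
        return self.fc(x)

def init_xavier(m):
    # Xavier (Glorot) initialization, the faster-converging variant in Table 1
    if isinstance(m, (nn.Conv1d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

def init_truncated_normal(m, std=0.01):
    # Truncated-normal initialization, the baseline it is compared against
    if isinstance(m, (nn.Conv1d, nn.Linear)):
        nn.init.trunc_normal_(m.weight, std=std, a=-2 * std, b=2 * std)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

model = TinyPointMLP()
model.apply(init_xavier)                        # or: model.apply(init_truncated_normal)

# L2 regularization enters as weight decay on the optimizer; 5e-4 mirrors the
# smallest coefficient tried in Table 2 (0 would disable it).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-4)
```

Switching to model.apply(init_truncated_normal) or changing weight_decay reproduces, in spirit, the two ablations reported in Tables 1 and 2.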

CLC number: TP249

Figure 1  Architecture of the F-PointNet model

Figure 2  Point cloud coordinate diagram

Figure 3  Architecture of the 3D instance segmentation PointNet

Figure 4  Architecture of the T-Net

Figure 5  Architecture of the amodal bounding box estimation PointNet

Figure 6  Architecture of the modified T-Net
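Figures 4 and 6 survive here only as captions, so the original and modified T-Net architectures are not shown. As a hedged PyTorch sketch under assumed layer widths (only the value 128 for the first per-point convolution comes from the abstract; everything else is illustrative), a T-Net-style center-regression module might be organized as follows:

```python
import torch
import torch.nn as nn

class TNetSketch(nn.Module):
    """Illustrative T-Net-style center regression module (hypothetical widths)."""

    def __init__(self, first_channels=128):
        super().__init__()
        # Per-point 1x1 convolutions; first_channels=128 reflects the
        # modification described in the abstract (other widths are guesses).
        self.mlp = nn.Sequential(
            nn.Conv1d(3, first_channels, 1), nn.BatchNorm1d(first_channels), nn.ReLU(),
            nn.Conv1d(first_channels, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 256, 1), nn.BatchNorm1d(256), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 3),                # regressed (x, y, z) center offset
        )

    def forward(self, points):                # points: (batch, 3, num_points)
        feat = self.mlp(points)
        feat = torch.max(feat, dim=2).values  # global max pooling over points
        return self.head(feat)

# Example: center offsets for a batch of 32 frustum point clouds of 1024 points
offsets = TNetSketch(first_channels=128)(torch.randn(32, 3, 1024))
print(offsets.shape)                          # torch.Size([32, 3])
```

In F-PointNet a module of this kind predicts a residual center before amodal box estimation (Figure 5); the sketch only mirrors that role, not the exact design.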

Table 1  Object detection accuracy with different initialization methods (unit: %)

Initialization      Runtime/s   Car (easy/moderate/hard)   Pedestrian (easy/moderate/hard)   Cyclist (easy/moderate/hard)
Xavier              0.13        85.18 / 71.73 / 63.85      65.77 / 55.49 / 49.66             70.62 / 52.97 / 49.97
Truncated normal    0.22        82.24 / 68.84 / 60.86      67.29 / 56.54 / 50.00             68.03 / 50.24 / 46.91

Table 2  Object detection accuracy under different weight decay rates (unit: %)

Regularization coefficient   Car (easy/moderate/hard)   Pedestrian (easy/moderate/hard)   Cyclist (easy/moderate/hard)
0.0005                       84.66 / 70.69 / 63.37      69.08 / 59.09 / 51.68             67.94 / 51.24 / 47.33
0.0010                       83.47 / 69.39 / 62.81      66.34 / 55.85 / 49.18             68.52 / 51.47 / 47.96
0.0100                       85.53 / 70.89 / 62.81      67.38 / 58.05 / 50.61             71.18 / 53.62 / 50.12
0 (none)                     85.71 / 71.73 / 63.85      65.77 / 55.49 / 49.66             70.62 / 52.97 / 49.97

Figure 7  Learning rate as a function of the number of training batches

Figure 8  Total loss as a function of the number of training batches
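Figures 7 and 8 are referenced only by their captions; the learning-rate and loss curves are not reproduced, and the paper's actual schedule is not stated on this page. Purely as an illustration of driving the learning rate from the batch counter, a step-decay loop in PyTorch could look like this (all numbers are placeholders, not the paper's settings):

```python
import torch

# Dummy parameter and loss standing in for the detection network and its loss.
param = torch.nn.Parameter(torch.zeros(8))
optimizer = torch.optim.Adam([param], lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20000, gamma=0.7)

lr_history = []
for batch_idx in range(60000):
    optimizer.zero_grad()
    loss = (param ** 2).sum()          # placeholder loss
    loss.backward()
    optimizer.step()
    scheduler.step()                   # decay the learning rate every step_size batches
    lr_history.append(scheduler.get_last_lr()[0])
```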

Table 3  Detection accuracy of the proposed model compared with other models (unit: %; — : not reported)

Model        Car (easy/moderate/hard)   Pedestrian (easy/moderate/hard)   Cyclist (easy/moderate/hard)
MV3D         71.29 / 62.68 / 56.56      —                                 —
AVOD         84.41 / 74.44 / 68.65      —                                 —
VoxelNet     81.97 / 65.46 / 62.85      57.86 / 53.42 / 48.87             67.17 / 47.65 / 45.11
F-PointNet   84.73 / 70.56 / 62.37      67.26 / 57.37 / 50.28             68.61 / 50.97 / 47.40
Ours         85.71 / 71.73 / 63.85      66.77 / 56.61 / 49.66             70.62 / 52.97 / 49.97

Figure 9  Visual comparison of the image, the raw point cloud, and the point cloud with detected bounding boxes
