
Journal of Shandong University (Engineering Science) ›› 2019, Vol. 49 ›› Issue (5): 98-104. doi: 10.6040/j.issn.1672-3961.0.2018.348

• Machine Learning and Data Mining •

Object detection of 3D point clouds based on F-PointNet

Peng WAN

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China
  • Received:2018-08-14 Online:2019-10-20 Published:2019-10-18
  • About the author: WAN Peng (born 1995), male, from Fuzhou, Jiangxi; master's student; main research interests: intelligent computing and systems. E-mail: 18205151102@163.com

Abstract:

To address the limited detection accuracy of current 3D point cloud object detection models, the F-PointNet model, which processes point cloud data directly, was used to detect cars, pedestrians and cyclists, and the model was fine-tuned to further improve its detection accuracy. The model was tested with different parameter initialization methods, with $\ell_2$ regularization, and with a modified number of convolution kernels. The experimental results showed that the Xavier parameter initialization method converged 0.09 s faster than the truncated normal distribution method, while car and cyclist detection accuracy were about 3% and 2% higher, respectively. Adding $\ell_2$ regularization increased pedestrian and cyclist detection accuracy by about 2% and 1%, respectively. Reducing the number of convolution kernels in the first convolutional layer of T-Net (Transformer Networks) to 128 increased car and cyclist detection accuracy by about 1% and 2%, respectively. These results indicated that the fine-tuned model could effectively improve object detection accuracy.

Key words: deep learning, 3D point cloud, object detection, detection accuracy, F-PointNet model
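The two initialization schemes compared in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's code: the layer shape (64 inputs, 128 outputs), the truncated normal standard deviation of 0.1, and the function names are all assumptions chosen for illustration. Xavier (Glorot) uniform initialization scales the sampling range by the layer's fan-in and fan-out, whereas a truncated normal uses a fixed standard deviation and redraws samples that fall more than two standard deviations from zero.

```python
import numpy as np


def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def truncated_normal(fan_in, fan_out, rng, stddev=0.1):
    """Normal(0, stddev); samples beyond two standard deviations are redrawn."""
    w = rng.normal(0.0, stddev, size=(fan_in, fan_out))
    bad = np.abs(w) > 2.0 * stddev
    while bad.any():
        w[bad] = rng.normal(0.0, stddev, size=int(bad.sum()))
        bad = np.abs(w) > 2.0 * stddev
    return w


# Hypothetical layer size, chosen only for illustration.
rng = np.random.default_rng(0)
w_xavier = xavier_uniform(64, 128, rng)
w_trunc = truncated_normal(64, 128, rng)
print("Xavier std:", w_xavier.std())
print("Truncated normal std:", w_trunc.std())
```

The Xavier weights adapt their spread to the layer width, while the truncated normal spread stays fixed unless `stddev` is tuned by hand.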

CLC number: TP249

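Fig. 1 F-PointNet model architecture

Fig. 2 Point cloud coordinate diagram

Fig. 3 3D instance segmentation PointNet model architecture

Fig. 4 T-Net model architecture

Fig. 5 Amodal bounding box estimation PointNet model architecture

Fig. 6 Modified T-Net model architecture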

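Table 1 Detection accuracy for different initialization methods (accuracy in %)

| Initialization method | Run time/s | Car (easy) | Car (moderate) | Car (hard) | Pedestrian (easy) | Pedestrian (moderate) | Pedestrian (hard) | Cyclist (easy) | Cyclist (moderate) | Cyclist (hard) |
|---|---|---|---|---|---|---|---|---|---|---|
| Xavier | 0.13 | 85.18 | 71.73 | 63.85 | 65.77 | 55.49 | 49.66 | 70.62 | 52.97 | 49.97 |
| Truncated normal | 0.22 | 82.24 | 68.84 | 60.86 | 67.29 | 56.54 | 50.00 | 68.03 | 50.24 | 46.91 |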
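The gaps between the two rows of Table 1 can be checked with a few lines of arithmetic. The sketch below copies the table values verbatim; the dictionary layout and the helper name `gaps` are illustrative choices, not part of the paper.

```python
# Values copied from Table 1 (accuracy in %, run time in s).
# Each list holds the easy, moderate and hard results, in that order.
xavier = {
    "time": 0.13,
    "car": [85.18, 71.73, 63.85],
    "pedestrian": [65.77, 55.49, 49.66],
    "cyclist": [70.62, 52.97, 49.97],
}
truncated = {
    "time": 0.22,
    "car": [82.24, 68.84, 60.86],
    "pedestrian": [67.29, 56.54, 50.00],
    "cyclist": [68.03, 50.24, 46.91],
}


def gaps(a, b):
    """Per-difficulty difference a - b, rounded to two decimals."""
    return [round(x - y, 2) for x, y in zip(a, b)]


time_saved = round(truncated["time"] - xavier["time"], 2)
car_gap = gaps(xavier["car"], truncated["car"])
pedestrian_gap = gaps(xavier["pedestrian"], truncated["pedestrian"])
cyclist_gap = gaps(xavier["cyclist"], truncated["cyclist"])

print("time saved (s):", time_saved)
print("car:", car_gap)
print("pedestrian:", pedestrian_gap)
print("cyclist:", cyclist_gap)
```

A positive gap means Xavier initialization scored higher. The car and cyclist gaps are positive at every difficulty level, while the pedestrian gaps are negative, so the gain from Xavier initialization in this table is specific to cars and cyclists.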

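Table 2 Detection accuracy under different decay rates (%)

| Regularization coefficient | Car (easy) | Car (moderate) | Car (hard) | Pedestrian (easy) | Pedestrian (moderate) | Pedestrian (hard) | Cyclist (easy) | Cyclist (moderate) | Cyclist (hard) |
|---|---|---|---|---|---|---|---|---|---|
| 0.0005 | 84.66 | 70.69 | 63.37 | 69.08 | 59.09 | 51.68 | 67.94 | 51.24 | 47.33 |
| 0.0010 | 83.47 | 69.39 | 62.81 | 66.34 | 55.85 | 49.18 | 68.52 | 51.47 | 47.96 |
| 0.0100 | 85.53 | 70.89 | 62.81 | 67.38 | 58.05 | 50.61 | 71.18 | 53.62 | 50.12 |
| 0 | 85.71 | 71.73 | 63.85 | 65.77 | 55.49 | 49.66 | 70.62 | 52.97 | 49.97 |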
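How a regularization coefficient such as those in Table 2 enters the training objective can be sketched as follows. This is a generic illustration rather than the paper's implementation: the halved sum of squares is an assumed convention, and the weight values, the task loss of 1.0, and the function names are hypothetical.

```python
import numpy as np


def l2_penalty(weights, coeff):
    """coeff * sum over tensors of 0.5 * ||w||^2 (the halved sum is an assumed convention)."""
    return coeff * sum(0.5 * float(np.sum(w ** 2)) for w in weights)


def total_loss(task_loss, weights, coeff):
    """Task loss plus the weighted L2 penalty on all weight tensors."""
    return task_loss + l2_penalty(weights, coeff)


# Hypothetical weights and task loss, used only to show the effect of the coefficient.
weights = [np.array([1.0, 2.0]), np.array([[3.0]])]
for coeff in (0.0, 0.0005, 0.001, 0.01):
    print(coeff, total_loss(1.0, weights, coeff))
```

With a coefficient of 0 the penalty vanishes and the objective reduces to the task loss, which corresponds to the last row of Table 2.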

Fig. 7 Learning rate versus number of batches

Fig. 8 Total loss versus number of batches

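Table 3 Comparison of detection accuracy between the proposed model and other models (%)

| Model | Car (easy) | Car (moderate) | Car (hard) | Pedestrian (easy) | Pedestrian (moderate) | Pedestrian (hard) | Cyclist (easy) | Cyclist (moderate) | Cyclist (hard) |
|---|---|---|---|---|---|---|---|---|---|
| MV3D | 71.29 | 62.68 | 56.56 | - | - | - | - | - | - |
| AVOD | 84.41 | 74.44 | 68.65 | - | - | - | - | - | - |
| VoxelNet | 81.97 | 65.46 | 62.85 | 57.86 | 53.42 | 48.87 | 67.17 | 47.65 | 45.11 |
| F-PointNet | 84.73 | 70.56 | 62.37 | 67.26 | 57.37 | 50.28 | 68.61 | 50.97 | 47.40 |
| Ours | 85.71 | 71.73 | 63.85 | 66.77 | 56.61 | 49.66 | 70.62 | 52.97 | 49.97 |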
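To read Table 3 column by column, the short sketch below finds the best-scoring model for each class and difficulty level. The values are copied from the table; treating the three numbers reported for MV3D and AVOD as car results (with the remaining entries missing) is an assumption about the table layout, and the names `table3` and `best_per_column` are illustrative.

```python
# Accuracy in % copied from Table 3; None marks entries the table leaves blank.
# Assumption: the three values given for MV3D and AVOD are the car columns.
CLASSES = ("car", "pedestrian", "cyclist")
LEVELS = ("easy", "moderate", "hard")

table3 = {
    "MV3D": [71.29, 62.68, 56.56] + [None] * 6,
    "AVOD": [84.41, 74.44, 68.65] + [None] * 6,
    "VoxelNet": [81.97, 65.46, 62.85, 57.86, 53.42, 48.87, 67.17, 47.65, 45.11],
    "F-PointNet": [84.73, 70.56, 62.37, 67.26, 57.37, 50.28, 68.61, 50.97, 47.40],
    "Ours": [85.71, 71.73, 63.85, 66.77, 56.61, 49.66, 70.62, 52.97, 49.97],
}


def best_per_column(table):
    """Map each (class, level) pair to the model with the highest reported accuracy."""
    best = {}
    for i, cls in enumerate(CLASSES):
        for j, level in enumerate(LEVELS):
            col = 3 * i + j
            scored = {m: row[col] for m, row in table.items() if row[col] is not None}
            best[(cls, level)] = max(scored, key=scored.get)
    return best


best = best_per_column(table3)
for key, model in best.items():
    print(key, "->", model)
```

Missing entries are skipped rather than treated as zero, so a model is only compared on the columns for which the table reports a value.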

Fig. 9 Visual comparison of the image, the raw point cloud, and the point cloud with bounding boxes
