Depth perceptual model based on the single image

doi:10.6040/j.issn.1672-3961.2.2015.007

Journal of Shandong University (Engineering Science), 2016, Vol. 46, Issue 3: 37-43.

MENG Lingheng^1,2, DING Shifei^1,2*

1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China;
2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

Received: 2015-06-23 Online: 2016-06-30 Published: 2015-06-23
Corresponding author: DING Shifei (born 1963), male, from Qingdao, Shandong; Ph.D., professor. Research interests: pattern recognition and artificial intelligence, machine learning and data mining, rough sets and soft computing, granular computing, and perceptual and cognitive computing. E-mail: dingsf@cumt.edu.cn
About the author: MENG Lingheng (born 1991), male, from Tengzhou, Shandong; master's student. Research interests: artificial intelligence, machine learning, and computer vision. E-mail: LinghengMeng@yahoo.com
Funding: National Natural Science Foundation of China (61379101); National Key Basic Research Development Program of China (2013CB329502)

Abstract

To avoid the high computational cost of stereo-vision depth perception models, a depth perception model based on a single still image, relying mainly on machine learning algorithms, is proposed. The formal representation of the model and the selection of multi-scale spatial image features are studied. The model is then applied to depth map prediction, and the predicted depth maps are used for 3D scene reconstruction. Experimental results show that the single-still-image depth perception model achieves good depth prediction accuracy, fast prediction speed, and satisfactory reconstructed models.

Key words: depth perception, image processing, 3D reconstruction, machine learning, Markov random field
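The abstract refers to multi-scale spatial image features but does not list them. As a purely illustrative sketch of the general idea, the Python below describes one image patch by a simple statistic computed over progressively larger windows centred on it. The window-mean statistic, the `patch` size, the `scales` tuple, and the function names are all assumptions made for this example; they are not taken from the paper.

```python
def window_mean(img, r0, c0, r1, c1):
    """Mean intensity of rows r0..r1-1 and columns c0..c1-1, clipped to the image."""
    h, w = len(img), len(img[0])
    r0, c0 = max(r0, 0), max(c0, 0)
    r1, c1 = min(r1, h), min(c1, w)
    vals = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(vals) / len(vals)


def multiscale_features(img, row, col, patch=2, scales=(1, 3, 9)):
    """Return one feature per scale for the patch whose top-left pixel is (row, col).

    At scale s the window is patch * s pixels per side and is centred on the
    patch, so larger scales summarise more of the surrounding context.
    (Hypothetical feature choice, for illustration only.)
    """
    feats = []
    for s in scales:
        side = patch * s
        # Shift the top-left corner so the larger window stays centred on the patch.
        offset = (side - patch) // 2
        r0, c0 = row - offset, col - offset
        feats.append(window_mean(img, r0, c0, r0 + side, c0 + side))
    return feats


# Usage: a 6x6 ramp image, patch in the top-left corner, two scales.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
features = multiscale_features(image, 0, 0, patch=2, scales=(1, 3))
```

Concatenating such per-scale values gives a feature vector that carries both local detail and wider context for each patch, which is the usual motivation for multi-scale features in single-image depth estimation.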

CLC number: TP391

MENG Lingheng, DING Shifei. Depth perceptual model based on the single image[J]. Journal of Shandong University (Engineering Science), 2016, 46(3): 37-43.
