Journal of Shandong University (Engineering Science), 2016, Vol. 46, Issue (3): 37-43. doi: 10.6040/j.issn.1672-3961.2.2015.007

Depth perceptual model based on the single image

MENG Lingheng1,2, DING Shifei1,2*

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China;
  2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2015-06-23 Online: 2016-06-30 Published: 2015-06-23
  • Corresponding author: DING Shifei (b. 1963), male, from Qingdao, Shandong; Ph.D., professor. Main research interests: pattern recognition and artificial intelligence, machine learning and data mining, rough sets and soft computing, granular computing, perception and cognitive computing. E-mail: dingsf@cumt.edu.cn
  • About the author: MENG Lingheng (b. 1991), male, from Tengzhou, Shandong; master's student. Main research interests: artificial intelligence, machine learning, computer vision. E-mail: LinghengMeng@yahoo.com
  • Supported by:
    National Natural Science Foundation of China (61379101); National Key Basic Research Program of China (2013CB329502)

Abstract: To address the high computational cost of stereo-vision depth perception models, a depth perception model based on a single still image, relying mainly on machine learning algorithms, is proposed. The formal representation of the model and the selection of multi-scale spatial image features are studied. The model is applied to depth map prediction, and the predicted depth maps are then used for 3D scene reconstruction. Experimental results show that the single-still-image depth perception model achieves good depth prediction accuracy, fast prediction speed, and satisfactory reconstructed models.

Key words: depth perception, image processing, 3D reconstruction, machine learning, Markov random field

CLC number:

  • TP391