Journal of Shandong University (Engineering Science) ›› 2016, Vol. 46 ›› Issue (3): 37-43. doi: 10.6040/j.issn.1672-3961.2.2015.007

Depth perceptual model based on the single image

MENG Lingheng1,2, DING Shifei1,2*

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China;
  2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2015-06-23 Online: 2016-06-30 Published: 2015-06-23
  • Corresponding author: DING Shifei (born 1963), male, from Qingdao, Shandong; Ph.D., professor. Research interests: pattern recognition and artificial intelligence, machine learning and data mining, rough sets and soft computing, granular computing, and perception and cognitive computing. E-mail: dingsf@cumt.edu.cn
  • About the author: MENG Lingheng (born 1991), male, from Tengzhou, Shandong; master's student. Research interests: artificial intelligence, machine learning, and computer vision. E-mail: LinghengMeng@yahoo.com
  • Supported by:
    National Natural Science Foundation of China (61379101); National Basic Research Program of China (2013CB329502)

Abstract: To address the high computational cost of stereo-vision depth perception models, a depth perception model based on a single still image, built mainly on machine learning algorithms, is proposed. The formal representation of the model and the selection of multi-scale spatial image features are studied. The model is then applied to depth map prediction, and the predicted depth maps are used for 3D scene reconstruction. Experimental results show that the single-still-image depth perception model achieves good depth prediction accuracy, fast prediction speed, and satisfactory reconstructed models.

Key words: depth perception, image processing, 3D reconstruction, machine learning, Markov random field

CLC Number: TP391