Depth perceptual model based on the single image

doi:10.6040/j.issn.1672-3961.2.2015.007

Journal of Shandong University (Engineering Science), 2016, Vol. 46, Issue (3): 37-43. doi: 10.6040/j.issn.1672-3961.2.2015.007

MENG Lingheng^1,2, DING Shifei^1,2*

1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China;
2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

Received: 2015-06-23 Online: 2016-06-30 Published: 2015-06-23

Corresponding author: DING Shifei (born 1963), male, from Qingdao, Shandong; Ph.D., professor. His main research interests are pattern recognition and artificial intelligence, machine learning and data mining, rough sets and soft computing, granular computing, and perception and cognitive computing. E-mail: dingsf@cumt.edu.cn
About the author: MENG Lingheng (born 1991), male, from Tengzhou, Shandong; master's student. His main research interests are artificial intelligence, machine learning, and computer vision. E-mail: LinghengMeng@yahoo.com
Supported by: National Natural Science Foundation of China (61379101); National Key Basic Research Program of China (2013CB329502)

Abstract

Abstract: To address the high computational cost of depth perception models based on stereo vision, a depth perception model based on a single still image, relying mainly on machine learning algorithms, was proposed. The formal representation of the model and the selection of multi-scale spatial image features were studied. The model was then applied to depth map prediction, and the predicted depth maps were used to reconstruct 3D scenes. The experimental results showed that the single-still-image depth perception model achieved good depth prediction accuracy, fast prediction speed, and fairly satisfactory reconstructed models.

Key words: depth perception, image processing, 3D reconstruction, machine learning, Markov random field

CLC number: TP391

MENG Lingheng, DING Shifei. Depth perceptual model based on the single image[J]. Journal of Shandong University (Engineering Science), 2016, 46(3): 37-43.

