Journal of Shandong University (Engineering Science), 2016, Vol. 46, Issue 3: 37-43. doi: 10.6040/j.issn.1672-3961.2.2015.007
MENG Lingheng (孟令恒)1,2, DING Shifei (丁世飞)1,2*
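Abstract: To address the high computational cost of stereo-vision depth perception models, a depth perception model based on a single still image is proposed, relying mainly on machine learning algorithms. The study covers the formal representation of the single-still-image depth perception model and the selection of multi-scale spatial image features, applies the model to depth map prediction, and uses the predicted depth maps for 3D scene reconstruction. Experimental results show that the single-still-image depth perception model achieves good depth prediction accuracy, fast prediction speed, and reasonably satisfactory reconstructed models.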
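The abstract does not say how the predicted depth maps are turned into a 3D scene. As an illustration only, one common approach is to back-project each pixel through a pinhole camera model. The sketch below assumes that model; the function name `backproject_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project a depth map into 3D camera-frame points.

    Assumption (not from the paper): a pinhole camera with focal lengths
    fx, fy and principal point (cx, cy), all in pixels. depth[v][u] is the
    distance along the optical axis at pixel column u, row v.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Non-positive depth is treated as "no prediction" and skipped.
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny 2x2 depth map with one invalid pixel, purely for illustration.
    toy_depth = [[2.0, 2.0], [0.0, 4.0]]
    for point in backproject_depth(toy_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0):
        print(point)
```

The resulting point list could then be meshed or textured with the source image; which of those steps the authors use is not stated in the abstract.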