Journal of Shandong University (Engineering Science), 2018, Vol. 48, Issue (2): 14-21. doi: 10.6040/j.issn.1672-3961.0.2017.431
SHI Qingxuan (史青宣)1, WANG Qian (王谦)2, TIAN Xuedong (田学东)1
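Abstract: To address human pose estimation in monocular video, this work starts from a part-based model of the human body and builds a spatio-temporal probabilistic graphical model whose entities are tracklets of body parts. Progressively reducing the temporal coverage of the tracklets yields a multi-level cascaded model. An iterative strategy that alternates between temporal and spatial parsing begins with inference over complete trajectories and filters the state space level by level until the optimal state of each body part in every frame is obtained. To supply high-quality state candidates, global motion information is introduced: pose detections from single frames are propagated through the whole video to form trajectories, which constitute the initial state space. Comparative experiments on three datasets show that the method achieves higher estimation accuracy than the other video-based human pose estimation methods it was compared against.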
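The level-by-level filtering described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function name `cascade_filter`, the `scores` layout, the `keep` parameter, and the halving schedule for tracklet length are all assumptions made for illustration. The sketch scores each candidate by summing per-frame unary scores only, and it omits the spatial (between-part) inference step entirely.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: scores[c][t] is the score of candidate trajectory c
# for one body part at frame t. Real potentials would also include
# spatial and temporal pairwise terms, which this sketch leaves out.
Scores = List[List[float]]


def cascade_filter(scores: Scores, keep: int) -> List[Optional[int]]:
    """Pick one candidate index per frame by coarse-to-fine pruning.

    Each window is first pruned to its `keep` best candidates (scored over
    the whole window), then split in half, so that tracklets get shorter
    at every level until each window covers a single frame.
    """
    num_frames = len(scores[0])
    result: List[Optional[int]] = [None] * num_frames
    # Start from the full trajectory with every candidate still alive.
    pending: List[Tuple[int, int, List[int]]] = [
        (0, num_frames, list(range(len(scores))))
    ]

    while pending:
        start, end, alive = pending.pop()
        # Rank surviving candidates by their total score on this tracklet.
        ranked = sorted(alive, key=lambda c: sum(scores[c][start:end]), reverse=True)
        survivors = ranked[:keep]
        if end - start == 1:
            # Single frame left: the top survivor is the chosen state.
            result[start] = survivors[0]
        else:
            mid = (start + end) // 2
            pending.append((start, mid, survivors))
            pending.append((mid, end, survivors))
    return result


if __name__ == "__main__":
    toy_scores = [
        [0.9, 0.8, 0.1, 0.1],  # candidate 0: good early, poor late
        [0.2, 0.3, 0.9, 0.8],  # candidate 1: poor early, good late
        [0.1, 0.1, 0.2, 0.2],  # candidate 2: poor throughout
    ]
    print(cascade_filter(toy_scores, keep=2))
```

With `keep=2` the weak third candidate is discarded at the full-trajectory level, and the two survivors are then resolved per half. With `keep=1` only the best full trajectory survives, so every frame inherits it. This shows the trade-off that the pruning width controls.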