
Journal of Shandong University (Engineering Science), 2023, Vol. 53, Issue 2: 42-50. doi: 10.6040/j.issn.1672-3961.0.2022.131


Multimodal hierarchical keyframe extraction method for continuous combined motion

YU Yixuan, YANG Geng*, GENG Hua

  1. Department of Automation, Tsinghua University, Beijing 100084, China
  • Received: 2022-04-11 Online: 2023-04-22 Published: 2023-04-21
  • About the authors: YU Yixuan (born 1997), female, from Yingkou, Liaoning, master's student; main research interests: human motion analysis and computer vision. E-mail: yuyx19@mails.tsinghua.edu.cn. *Corresponding author: YANG Geng (born 1957), male, from Qingchuan, Sichuan, research fellow, Ph.D.; main research interests: motor control systems, control of power electronic systems, control and optimization of renewable energy systems, aging models of power batteries, and applications of AI in these fields. E-mail: yanggeng@tsinghua.edu.cn

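Abstract: The keyframes of continuous combined motion cover widely differing spatial ranges and may repeat, so they are difficult to extract with a fixed spatial-feature criterion. To address this problem, a hierarchical keyframe extraction method based on multimodal segmentation and clustering is proposed. At the level of the complete motion, the motion sequence is divided into segments according to multimodal information such as background-music beats and spatio-temporal information. Within each segment, the frames undergo spatial-feature clustering and temporal segmentation, which yields several representative candidate keyframes whose poses may repeat. Redundancy is then eliminated according to the spatio-temporal characteristics of the motion. Keyframes are extracted from broadcast gymnastics as an example and compared with existing methods through experiments and analysis; the proposed method extracts motion keyframes more accurately and more completely.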
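The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tiny k-means used as the clustering step, the `min_gap` and `min_dist` thresholds, and the assumption that beat timestamps are already available (for example from a beat tracker) are all hypothetical choices made for illustration.

```python
import numpy as np

def segment_by_beats(n_frames, beat_times, fps):
    """Stage 1: split frame indices [0, n_frames) at beat timestamps (seconds)."""
    cuts = sorted({int(round(t * fps)) for t in beat_times})
    cuts = [c for c in cuts if 0 < c < n_frames]
    bounds = [0] + cuts + [n_frames]
    return list(zip(bounds[:-1], bounds[1:]))

def kmeans(x, k, n_iter=20):
    """Tiny k-means (a stand-in clustering step) with evenly spaced initial centroids."""
    k = min(k, len(x))
    centers = x[np.linspace(0, len(x) - 1, k).astype(int)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

def candidate_keyframes(features, start, end, k=2):
    """Stage 2: cluster poses in one segment, cut it into runs of equal label,
    and keep the frame of each run closest to its cluster centroid."""
    x = features[start:end]
    labels, centers = kmeans(x, k)
    change = np.flatnonzero(np.diff(labels)) + 1
    run_bounds = np.concatenate(([0], change, [len(x)]))
    picks = []
    for a, b in zip(run_bounds[:-1], run_bounds[1:]):
        d = np.linalg.norm(x[a:b] - centers[labels[a]], axis=1)
        picks.append(int(start + a + d.argmin()))
    return picks

def remove_redundant(features, candidates, min_gap, min_dist):
    """Stage 3: drop a candidate that is close in both time and pose to the last kept one."""
    kept = []
    for c in candidates:
        if kept:
            near_in_time = c - kept[-1] < min_gap
            near_in_pose = np.linalg.norm(features[c] - features[kept[-1]]) < min_dist
            if near_in_time and near_in_pose:
                continue
        kept.append(c)
    return kept

def extract_keyframes(features, beat_times, fps, k=2, min_gap=5, min_dist=0.5):
    """Run the three stages on an (n_frames, n_dims) array of pose features."""
    features = np.asarray(features, dtype=float)
    candidates = []
    for start, end in segment_by_beats(len(features), beat_times, fps):
        candidates.extend(candidate_keyframes(features, start, end, k))
    return remove_redundant(features, candidates, min_gap, min_dist)

# Toy data: a 1-D "pose" that holds 0, then 1 across a beat at 1.0 s, then 0 again.
toy = np.array([0.0] * 5 + [1.0] * 10 + [0.0] * 5).reshape(-1, 1)
keyframes = extract_keyframes(toy, beat_times=[1.0], fps=10, min_gap=6)
```

In the toy run, the candidate just after the beat repeats the pose of the previous candidate and lies within `min_gap` frames of it, so the third stage discards it. Real skeleton features would have many dimensions per frame, and the thresholds would need tuning to the motion being analysed.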

Key words: keyframe extraction, continuous combined motion, hierarchical, multimodal, time series segmentation

CLC number: TP391