Journal of Shandong University (Engineering Science) ›› 2016, Vol. 46 ›› Issue (3): 51-57. doi: 10.6040/j.issn.1672-3961.2.2015.050

Time series classification using piecewise vector quantized approximation based on Mahalanobis distance

TAO Zhiwei1, ZHANG Li1,2*

  1. School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China;
  2. Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou 215006, Jiangsu, China
  • Received: 2015-05-14 Online: 2016-06-30 Published: 2015-05-14
  • Corresponding author: ZHANG Li (b. 1975), female, from Zhangjiagang, Jiangsu; professor, Ph.D., doctoral supervisor; main research interests: machine learning and system decision-making. E-mail: zhangliml@suda.edu.cn
  • About the author: TAO Zhiwei (b. 1992), male, from Ma'anshan, Anhui; master's student; main research interest: system decision-making. E-mail: 20144227051@stu.suda.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61373093); Natural Science Foundation of Jiangsu Province (BK20140008, BK2012624); Natural Science Research Project of Jiangsu Higher Education Institutions (13KJA520001); Soochow University Undergraduate Extracurricular Academic Research Fund (KY2015546B)

Abstract: A Mahalanobis distance-based time series classification algorithm using piecewise vector quantized approximation (PVQA), termed MPVQA, was proposed. While retaining the time complexity of the traditional algorithm, MPVQA introduced the Mahalanobis distance to overcome the drawback that the Euclidean distance is easily influenced by the scales (dimensions) of the pattern features, and thereby improved accuracy. In training, PVQA was first used to generate a codebook from the training samples, and the Mahalanobis distance was then taken as the similarity measure to reconstruct each time series segment by segment. For an unseen time series, the Mahalanobis distance was likewise applied to the reconstructed series to find the most similar training series. Experimental results on four time series datasets verified the advantage of the proposed method in time series representation and classification.
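
The pipeline outlined in the abstract (codebook training, segment-wise reconstruction under the Mahalanobis distance, nearest-neighbour decision) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several choices are assumptions that the abstract does not specify: the codebook is trained with plain Lloyd (k-means) iterations, the covariance matrix is estimated from all training segments, and the distance between two series is the sum of their segment-wise Mahalanobis distances. All function names, the segment length `w`, and the codebook size `k` are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def covariance(vectors, ridge=1e-6):
    """Sample covariance; the small ridge (an assumption) keeps it invertible."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (n - 1)
            for j in range(d)] for i in range(d)]
    for i in range(d):
        cov[i][i] += ridge
    return cov

def mahalanobis(x, y, cov):
    """sqrt((x - y)^T cov^{-1} (x - y)), computed without forming the inverse."""
    diff = [p - q for p, q in zip(x, y)]
    z = solve(cov, diff)
    return max(0.0, sum(p * q for p, q in zip(diff, z))) ** 0.5

def segments(series, w):
    """Consecutive, non-overlapping segments of length w."""
    return [series[i:i + w] for i in range(0, len(series) - w + 1, w)]

def train_codebook(vectors, k, iters=20):
    """Lloyd iterations with Euclidean assignment (assumed training scheme)."""
    codebook = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            j = min(range(k), key=lambda c: sum(
                (p - q) ** 2 for p, q in zip(v, codebook[c])))
            groups[j].append(v)
        for j, g in enumerate(groups):
            if g:  # keep the old codeword if no segment was assigned to it
                codebook[j] = [sum(col) / len(g) for col in zip(*g)]
    return codebook

def encode(series, codebook, cov, w):
    """Index of the Mahalanobis-nearest codeword for each segment."""
    return [min(range(len(codebook)),
                key=lambda c: mahalanobis(s, codebook[c], cov))
            for s in segments(series, w)]

def reconstruct(codes, codebook):
    """Concatenate the codewords selected by the code sequence."""
    return [x for c in codes for x in codebook[c]]

def classify(query, train, labels, codebook, cov, w):
    """1-NN over reconstructed series using summed segment-wise distances."""
    def approx(series):
        return segments(reconstruct(encode(series, codebook, cov, w), codebook), w)
    q = approx(query)
    dists = [sum(mahalanobis(a, b, cov) for a, b in zip(q, approx(t)))
             for t in train]
    return labels[dists.index(min(dists))]

# Toy data (made up for illustration): two well-separated classes.
w, k = 2, 2
train = [[0.0, 0.1, 0.0, 0.1], [0.1, 0.0, 0.1, 0.0],
         [5.0, 5.2, 5.0, 5.2], [5.2, 5.0, 5.2, 5.0]]
labels = ["low", "low", "high", "high"]
pieces = [s for series in train for s in segments(series, w)]
codebook = train_codebook(pieces, k)
cov = covariance(pieces)
print(classify([0.05, 0.05, 0.1, 0.0], train, labels, codebook, cov, w))
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean distance. With a diagonal covariance each feature is divided by its own standard deviation, which is the scale insensitivity the abstract refers to.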

Key words: time series, piecewise vector quantized approximation, reconstruction, Mahalanobis distance, codebook, characteristic dimension, Euclidean distance

CLC number: TP302.7