
Journal of Shandong University (Engineering Science) ›› 2016, Vol. 46 ›› Issue (3): 51-57. doi: 10.6040/j.issn.1672-3961.2.2015.050


Time series classification using piecewise vector quantized approximation based on Mahalanobis distance

TAO Zhiwei1, ZHANG Li1,2*

  1. School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China;
  2. Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou 215006, Jiangsu, China
  • Received: 2015-05-14 Online: 2016-06-30 Published: 2015-05-14
  • Corresponding author: ZHANG Li (1975- ), female, from Zhangjiagang, Jiangsu; professor, Ph.D., doctoral supervisor; main research interests: machine learning and system decision-making. E-mail: zhangliml@suda.edu.cn
  • About the author: TAO Zhiwei (1992- ), male, from Ma'anshan, Anhui; master's student; main research interest: system decision-making. E-mail: 20144227051@stu.suda.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61373093); Natural Science Foundation of Jiangsu Province (BK20140008, BK2012624); Natural Science Research Project of Jiangsu Higher Education Institutions (13KJA520001); Soochow University Undergraduate Extracurricular Academic Research Fund (KY2015546B)

Abstract: A Mahalanobis distance-based time series classification algorithm using piecewise vector quantized approximation (PVQA), called MPVQA, is proposed. The algorithm retains the time complexity of the traditional PVQA algorithm while replacing the Euclidean distance with the Mahalanobis distance, which overcomes the sensitivity of the Euclidean distance to the scales (units) of the pattern features and improves accuracy. In training, PVQA is first used to generate a codebook from the training samples, and the Mahalanobis distance is then used as the similarity measure to reconstruct each time series segment by segment. The reconstructed time series are likewise classified using the Mahalanobis distance as the similarity measure. Experimental results on four time series datasets confirm the advantage of the proposed method in both time series representation and classification.

Key words: time series, piecewise vector quantized approximation, reconstruct, Mahalanobis distance, codebook, characteristic dimension, Euclidean distance
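
The pipeline outlined in the abstract (codebook training, segment-wise reconstruction under the Mahalanobis distance, then nearest-match classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the codebook is trained or how the covariance matrix is estimated, so the sketch assumes plain k-means for the codebook, a covariance pooled over all training segments with a small ridge term `eps`, and 1-nearest-neighbor classification. All function names (`fit_inv_cov`, `split_segments`, `train_codebook`, `encode`, `classify`) are hypothetical.

```python
import numpy as np


def fit_inv_cov(vectors, eps=1e-6):
    """Inverse covariance of row vectors; eps is a small ridge (an assumption)."""
    cov = np.atleast_2d(np.cov(vectors, rowvar=False))
    return np.linalg.inv(cov + eps * np.eye(cov.shape[0]))


def mahalanobis(u, v, inv_cov):
    """Mahalanobis distance sqrt((u - v)^T S^-1 (u - v))."""
    d = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.sqrt(max(d @ inv_cov @ d, 0.0)))


def split_segments(series, seg_len):
    """Cut a 1-D series into consecutive, non-overlapping segments."""
    series = np.asarray(series, dtype=float)
    n_seg = len(series) // seg_len
    return series[: n_seg * seg_len].reshape(n_seg, seg_len)


def train_codebook(segments, k, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm) as a stand-in for codebook training."""
    rng = np.random.default_rng(seed)
    codebook = segments[rng.choice(len(segments), size=k, replace=False)].copy()
    for _ in range(n_iter):
        sq = ((segments[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = sq.argmin(axis=1)
        for j in range(k):
            members = segments[nearest == j]
            if len(members):  # leave an empty cluster's codeword unchanged
                codebook[j] = members.mean(axis=0)
    return codebook


def encode(series, codebook, inv_cov):
    """Index of the Mahalanobis-nearest codeword for each segment."""
    segs = split_segments(series, codebook.shape[1])
    diff = segs[:, None, :] - codebook[None, :, :]
    d2 = np.einsum("nki,ij,nkj->nk", diff, inv_cov, diff)
    return d2.argmin(axis=1)


def classify(query, train_series, train_labels, codebook, inv_cov):
    """1-NN over reconstructed series, compared segment by segment."""
    q = codebook[encode(query, codebook, inv_cov)]
    best_label, best_dist = None, np.inf
    for series, label in zip(train_series, train_labels):
        r = codebook[encode(series, codebook, inv_cov)]
        diff = q - r
        dist = np.sqrt(max(np.einsum("ni,ij,nj->", diff, inv_cov, diff), 0.0))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

Because the inverse covariance rescales each feature direction by its spread, re-expressing a feature in different units (and re-estimating the covariance) leaves the Mahalanobis distance unchanged, whereas the Euclidean distance would change. This is the sensitivity to feature scale that the abstract refers to.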

CLC Number: 

  • TP302.7