
Journal of Shandong University (Engineering Science) ›› 2017, Vol. 47 ›› Issue (5): 143-149. doi: 10.6040/j.issn.1672-3961.0.2017.172


Clustering of blast furnace historical data based on PCA similarity factor and spectral clustering

PANG Renming1, WANG Bo1, YE Hao1*, ZHANG Haifeng2, LI Mingliang2

  1. Department of Automation, Tsinghua University, Beijing 100084, China;
  2. Liuzhou Iron and Steel Co. Ltd., Liuzhou 545002, Guangxi, China
  • Received: 2017-02-10 Online: 2017-10-20 Published: 2017-02-10
  • Corresponding author: YE Hao (born 1969), male, from Tianjin, doctoral supervisor, Ph.D. in engineering; main research interest: fault diagnosis of dynamic systems. E-mail: haoye@tsinghua.edu.cn
  • About the author: PANG Renming (born 1992), male, from Changde, Hunan, master's student; main research interests: process monitoring and fault diagnosis. E-mail: prm14@mails.tsinghua.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61290324)

Abstract: The principal component analysis (PCA) model similarity factor (hereafter the PCA similarity factor) is combined with the spectral clustering algorithm and applied to analyze changes in the operating point of a blast furnace by mining its historical data. The similarity between data sets generated by a moving window is measured as a weighted combination of the PCA similarity factor and the distance similarity factor. The clustering of these data sets is then recast as an optimal graph partitioning problem, and the clusters are obtained by spectral clustering. The method reduces the influence of operating point drift and clusters the blast furnace operating points effectively and stably. Off-line tests on historical field data show that, compared with an existing algorithm that combines the PCA similarity factor with k-means clustering, the proposed method distinguishes jumps in the operating point more effectively.

Key words: PCA similarity factor, blast furnace, multimode, operating point drift, spectral clustering, data mining
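
For orientation, the two similarity measures named in the abstract are usually written as below in the pattern-matching literature on multivariate process data (the PCA similarity factor goes back to Krzanowski, and the distance similarity factor was used by Singhal and Seborg). The abstract does not state the paper's own formulas, so the notation (L, M, k, alpha, Phi) and the exact forms shown here are assumptions rather than the authors' definitions.

```latex
S_{\mathrm{PCA}} = \frac{1}{k}\,\operatorname{tr}\!\left(L^{\mathsf T} M M^{\mathsf T} L\right)
                 = \frac{1}{k}\sum_{i=1}^{k}\sum_{j=1}^{k}\cos^{2}\theta_{ij},
\qquad
S_{\mathrm{dist}} = \sqrt{\frac{2}{\pi}}\int_{\Phi}^{\infty} e^{-z^{2}/2}\,\mathrm{d}z,
\qquad
S = \alpha\,S_{\mathrm{PCA}} + (1-\alpha)\,S_{\mathrm{dist}}, \quad 0 \le \alpha \le 1 .
```

Here L and M are the p x k orthonormal loading matrices of the first k principal components of two windows, theta_ij is the angle between the i-th component of one window and the j-th component of the other, Phi is the Mahalanobis distance between the two window means, and alpha is the weight between the two terms. S_PCA compares only the directions of variation, so the distance term is what separates windows that share a correlation structure but sit at different mean levels.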

CLC number: TP277
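
The procedure summarized in the abstract (moving-window data sets, a weighted PCA/distance similarity, then a spectral partition of the resulting similarity graph) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the synthetic data, the choices k=1 and alpha=0.5, and the symmetrization of the distance term are all hypothetical. For brevity it performs only a two-way split using the sign of the Fiedler vector of the normalized Laplacian; a k-way version would typically run k-means on several eigenvectors instead.

```python
import math
import numpy as np


def loadings(window, k):
    """First k principal directions (columns) of one data window."""
    centered = window - window.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def pca_similarity(a, b, k):
    """PCA similarity factor: mean squared cosine between the two subspaces."""
    la, lb = loadings(a, k), loadings(b, k)
    return float(np.trace(la.T @ lb @ lb.T @ la) / k)


def distance_similarity(a, b):
    """Distance similarity from the Mahalanobis distance between window means."""
    diff = a.mean(axis=0) - b.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(a, rowvar=False))
    phi = math.sqrt(max(float(diff @ cov_inv @ diff), 0.0))
    return math.erfc(phi / math.sqrt(2.0))  # equals 2 * (1 - NormalCDF(phi))


def similarity_matrix(windows, k, alpha):
    """Weighted PCA/distance similarity between every pair of windows."""
    n = len(windows)
    w = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            s_pca = pca_similarity(windows[i], windows[j], k)
            # Assumption: average both directions so the matrix is symmetric.
            s_dist = 0.5 * (distance_similarity(windows[i], windows[j])
                            + distance_similarity(windows[j], windows[i]))
            w[i, j] = w[j, i] = alpha * s_pca + (1.0 - alpha) * s_dist
    return w


def spectral_bipartition(w):
    """Split the similarity graph in two using the sign of the Fiedler vector."""
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]
    return (fiedler > 0).astype(int)


def make_demo_windows(seed=0, size=200):
    """Synthetic series with one jump between two made-up operating points."""
    rng = np.random.default_rng(seed)
    t = rng.normal(size=(6 * size, 1))
    x = 0.1 * rng.normal(size=(6 * size, 3))
    half = 3 * size
    x[:half] += t[:half] * np.array([1.0, 1.0, 0.0])
    x[half:] += t[half:] * np.array([0.0, 1.0, 1.0]) + 5.0
    # Non-overlapping windows for simplicity; a sliding window would overlap.
    return [x[i:i + size] for i in range(0, len(x), size)]


windows = make_demo_windows()
w = similarity_matrix(windows, k=1, alpha=0.5)
labels = spectral_bipartition(w)
print(labels)  # the first three windows share one label, the last three the other
```

On the synthetic series the two operating points differ in both correlation structure and mean level, so both terms of the weighted similarity contribute to the split. Real blast furnace data would need variable scaling, a justified choice of k and alpha, and a way to pick the number of clusters, none of which the abstract specifies.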