Journal of Shandong University (Engineering Science), 2020, Vol. 50, Issue (4): 1-7. doi: 10.6040/j.issn.1672-3961.0.2019.423

• Machine Learning & Data Mining •

An integrated learning approach for O3 mass concentration prediction model

Yan PENG, Tingting FENG, Jie WANG*

  1. School of Management, Capital Normal University, Beijing 100048, China
  • Received: 2019-07-25 Online: 2020-08-20 Published: 2020-08-13
  • Contact: Jie WANG E-mail: pengyan@cnu.edu.cn; wangjie@cnu.edu.cn

Abstract:

To accurately predict O3 mass concentration and its development trend and to analyze the inducing factors, an O3 mass concentration prediction model based on integrated learning was proposed. A multilayer FS-IL model for O3 pollutant mass concentration was established from O3 pollutant mass concentration data and meteorological data collected in Beijing from 2015 to 2016. After missing value filling and outlier analysis, Pearson correlation analysis and Lasso regression analysis were used to select features from the cleaned meteorological data, which eliminated data redundancy and improved prediction accuracy. An integrated learning algorithm combining a self-organizing feature map (SOFM) with an Elman neural network (ENN) was then proposed: the sample data were first clustered with SOFM to obtain a reasonable distribution of samples, and ENN was then used for simulation training to predict O3 mass concentration. The experimental results showed that the accuracy of ENN-based O3 pollutant mass concentration prediction improved from 74.6% to 82.1% after the data were preprocessed with Pearson-Lasso feature selection and SOFM sample clustering.
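The two-stage idea described above, clustering samples with an SOFM and then passing them to an Elman network, can be sketched as follows. This is a minimal pure-Python illustration and not the authors' implementation: the function names (`winner`, `train_sofm`, `elman_forward`), the linear weight initialisation, the Gaussian neighbourhood, the decay schedules, the toy samples and the hand-picked network weights are all assumptions made for the example.

```python
import math


def winner(weights, x):
    """Index of the competition-layer unit whose weight vector is closest to x."""
    dists = [sum((w_d - x_d) ** 2 for w_d, x_d in zip(w, x)) for w in weights]
    return dists.index(min(dists))


def train_sofm(samples, n_units=2, epochs=40, lr0=0.3, sigma0=1.0):
    """Train a SOFM with a one-dimensional competition layer (needs n_units >= 2)."""
    dim = len(samples[0])
    lo = [min(s[d] for s in samples) for d in range(dim)]
    hi = [max(s[d] for s in samples) for d in range(dim)]
    # Assumed linear initialisation: spread units evenly between feature min and max.
    weights = [
        [lo[d] + (hi[d] - lo[d]) * j / (n_units - 1) for d in range(dim)]
        for j in range(n_units)
    ]
    for t in range(epochs):
        frac = 1.0 - t / epochs              # decays from 1 towards 0
        lr = lr0 * frac                      # shrinking learning rate
        sigma = max(sigma0 * frac, 0.05)     # shrinking neighbourhood width
        for x in samples:
            win = winner(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood on the 1-D grid of unit indices.
                h = math.exp(-((j - win) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


def elman_forward(sequence, w_in, w_ctx, w_out):
    """Forward pass of an Elman network with a single linear output unit."""
    n_hidden = len(w_ctx)
    context = [0.0] * n_hidden               # context layer starts at zero
    outputs = []
    for x in sequence:
        hidden = [
            math.tanh(
                sum(w_in[h][i] * x[i] for i in range(len(x)))
                + sum(w_ctx[h][c] * context[c] for c in range(n_hidden))
            )
            for h in range(n_hidden)
        ]
        outputs.append(sum(w_out[h] * hidden[h] for h in range(n_hidden)))
        context = hidden                     # feed hidden state back for the next step
    return outputs


# Toy, already-normalised samples forming two obvious groups (made-up values).
samples = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9], [0.15, 0.15], [0.85, 0.85]]
units = train_sofm(samples)
labels = [winner(units, s) for s in samples]

# Group samples by winning unit; each group could then train its own predictor.
clusters = {}
for s, k in zip(samples, labels):
    clusters.setdefault(k, []).append(s)

# One hidden unit with hand-picked weights: the same input yields different
# outputs at steps 1 and 2 because the context layer holds the previous state.
outputs = elman_forward([[1.0], [1.0]], w_in=[[0.5]], w_ctx=[[0.5]], w_out=[1.0])
```

In this sketch the SOFM only assigns cluster labels, and the Elman network is shown as an untrained forward pass. Training the recurrent weights (for example by backpropagation) is left out to keep the example short.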

Key words: Beijing, ozone, feature selection, SOFM, ENN

CLC Number: 

  • TP181

Fig.1

Structure of SOFM with one-dimensional competition layer

Fig.2

Topology of Elman neural network

Fig.3

Multilayer FS-IL model

Fig.4

Box plot of meteorological data

Table 1

Pearson correlation coefficients

Variable   Pearson correlation with yi   Variable   Pearson correlation with yi
xi1        -0.59                         xi6        0.69
xi2        -0.61                         xi7        -0.09
xi3        -0.59                         xi12       0.16
xi4        0.73                          xi14       0.10
xi5        0.74                          xi15       0.36
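As a hedged illustration of how coefficients like those in Table 1 are computed and compared, the sketch below implements the standard Pearson formula in plain Python and ranks the Table 1 variables by absolute correlation with yi. The `pearson` helper and the toy vectors are assumptions for illustration; only the coefficient values are taken from Table 1.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


# Toy vectors (made up) showing the two extremes of the coefficient.
r_pos = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])   # perfectly increasing
r_neg = pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])   # perfectly decreasing

# Coefficients copied from Table 1.
table1 = {
    "xi1": -0.59, "xi2": -0.61, "xi3": -0.59, "xi4": 0.73, "xi5": 0.74,
    "xi6": 0.69, "xi7": -0.09, "xi12": 0.16, "xi14": 0.10, "xi15": 0.36,
}

# Rank variables by the strength of their linear relationship with yi.
ranked = sorted(table1, key=lambda name: abs(table1[name]), reverse=True)
```

Ranking by absolute value treats strong negative correlations (such as xi2) as being as informative as strong positive ones (such as xi5). No cut-off is applied here, because the threshold used for screening is not stated in this section.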

Table 2

Lasso feature selection coefficients

Variable   Coefficient   Variable   Coefficient
xi1        0.0745        xi6        0.2682
xi2        -0.1147       xi7        -0.4166
xi3        0.0000        xi12       0.4466
xi4        -0.0823       xi14       0.1061
xi5        0.0415        xi15       0.1813
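Lasso selects features because its L1 penalty can shrink a coefficient to exactly zero, which is the pattern xi3 shows in Table 2. The sketch below is a minimal coordinate-descent Lasso in plain Python, written under assumptions: the objective scaling (1/(2n)), the absence of an intercept, the penalty value `lam=0.1` and the toy data are all hypothetical and are not taken from the paper.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1, no intercept.

    Assumes every feature column contains at least one non-zero value.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0    # correlation of feature j with the partial residual
            norm = 0.0   # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta


# Toy data (made up): y depends only on the first feature.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_cd(X, y, lam=0.1)   # lam is a hypothetical penalty strength

# Coefficients copied from Table 2; a zero coefficient marks a dropped variable.
table2 = {
    "xi1": 0.0745, "xi2": -0.1147, "xi3": 0.0000, "xi4": -0.0823, "xi5": 0.0415,
    "xi6": 0.2682, "xi7": -0.4166, "xi12": 0.4466, "xi14": 0.1061, "xi15": 0.1813,
}
selected = [name for name, coef in table2.items() if coef != 0.0]
```

On the toy data the irrelevant second feature receives a coefficient of exactly zero, while the relevant first feature is shrunk slightly below its true value of 2. Applied to Table 2 as printed, the same rule would keep every variable except xi3, assuming the reported 0.0000 is an exact zero and not a rounded small value.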

Fig.5

Clustering of SOFM neural network

Fig.6

Comparison of ENN prediction results on the test set

Table 3

Comparison of simulation training results

Before SOFM clustering
Metric   ENN      SVR      RBF neural network   BP neural network
MSE      0.5535   0.0649   0.5816               0.0630
R2       0.7462   0.7159   0.7318               0.7336

After SOFM clustering
Metric   ENN      SVR      RBF neural network   BP neural network
MSE      0.0436   0.0451   0.4750               0.0478
R2       0.8211   0.8055   0.7932               0.8020
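The two metrics in Table 3 can be computed as in the sketch below, which also works out the R2 gain each model shows after SOFM clustering. The helper names `mse` and `r2_score` and the toy vectors in the usage note are assumptions for illustration; the R2 values are copied from Table 3. The ENN row (0.7462 before, 0.8211 after) appears to correspond to the 74.6% and 82.1% figures quoted in the abstract.

```python
def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# R2 values copied from Table 3 (before and after SOFM clustering).
r2_before = {"ENN": 0.7462, "SVR": 0.7159, "RBF": 0.7318, "BP": 0.7336}
r2_after = {"ENN": 0.8211, "SVR": 0.8055, "RBF": 0.7932, "BP": 0.8020}

# Absolute R2 improvement per model, rounded to the table's precision.
gain = {model: round(r2_after[model] - r2_before[model], 4) for model in r2_before}

# Model with the highest R2 after clustering.
best_after = max(r2_after, key=r2_after.get)
```

For example, `mse([1, 2, 3], [1, 2, 4])` is one third and `r2_score([1, 2, 3], [1, 2, 3])` is 1.0. On the Table 3 values, every model gains in R2 after clustering, and ENN has the highest R2 afterwards.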