Journal of Shandong University (Engineering Science) ›› 2020, Vol. 50 ›› Issue (2): 91-99. doi: 10.6040/j.issn.1672-3961.0.2019.404
Minghe GAO1, Ying ZHANG1,*, Rongrong ZHANG1, Zihao HUANG1, Linyan HUANG1, Fanyu LI1, Xin ZHANG2, Yanhao WANG1
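Abstract:
This paper studies air quality prediction using the LightGBM model and proposes a prediction method based on predictive features that effectively forecasts the PM2.5 mass concentration, the core indicator of air quality, in urban Beijing over the next 24 h. In building the prediction scheme, the characteristics of the training data set are analyzed to guide data cleaning, and random forest is combined with linear interpolation to handle large amounts of missing data and noise interference. Predictive data features are introduced, together with related statistical features, to improve the accuracy of the predictions. A sliding-window mechanism is used to mine high-dimensional temporal features and raise the number of data features by orders of magnitude. The working performance and results of the prediction model are analyzed in detail and evaluated against baseline models. The experimental results show that the scheme combining predictive features with the LightGBM model achieves higher prediction accuracy.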
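Two of the preprocessing steps named in the abstract, linear interpolation of missing readings and sliding-window feature construction, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `linear_interpolate` and `sliding_window_features`, the window length of 3, the choice of mean and max as statistical features, and the sample readings are all assumptions made for illustration. The random forest imputation step and the LightGBM training step are omitted.

```python
from typing import List, Optional, Tuple


def linear_interpolate(series: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between known neighbours.

    Leading and trailing gaps are filled with the nearest known value
    (an assumption; the abstract does not say how edges are handled).
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    # Pad the edges with the nearest observation.
    for i in range(known[0]):
        filled[i] = series[known[0]]
    for i in range(known[-1] + 1, len(series)):
        filled[i] = series[known[-1]]
    # Interpolate each interior gap between consecutive known points.
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


def sliding_window_features(
    series: List[float], window: int
) -> Tuple[List[List[float]], List[float]]:
    """Build one feature row per time step from the previous `window` values.

    Each row holds the raw lagged values plus their mean and max
    (hypothetical statistical features); the target is the next value.
    """
    rows, targets = [], []
    for t in range(window, len(series)):
        past = series[t - window:t]
        rows.append(past + [sum(past) / window, max(past)])
        targets.append(series[t])
    return rows, targets


# Made-up hourly PM2.5 readings with gaps, for illustration only.
pm25 = [10.0, None, 30.0, None, None, 60.0]
clean = linear_interpolate(pm25)
X, y = sliding_window_features(clean, window=3)
```

The resulting `X` and `y` are plain lists of feature rows and targets, a shape that a gradient boosting regressor such as LightGBM could be trained on.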