A feature selection and multi-model fusion-based approach of predicting air quality.
ISA Trans. 2019

Abstract

With the rapid development of China's industrialization, air pollution is becoming increasingly serious. Predicting air quality is vital for deciding on further prevention measures and avoiding the resulting harm. In this paper, we propose an approach that predicts air quality from multiple data features by fusing multiple machine learning models. The approach takes the meteorological data and air quality data for the past six days as one batch of input (the whole data set covers 46 days) and employs multi-model fusion to provide an improved 24-hour prediction of PM2.5 concentration across Beijing. In this process, two focal feature groups are composed. The first group contains the historical meteorological data, while the second includes statistical information, date information, and polynomial variations. Besides the two groups, we add one million more data items by applying a time-sliding scheme. From the supplementary data, we select the 500 most critical features with the Light Gradient Boosting Machine (LightGBM) model and feed them as input to the Gradient Boosting Decision Tree (GBDT) and LightGBM models. Meanwhile, we screen the 300 most critical features with the eXtreme Gradient Boosting (XGBoost) model and feed them as input to the three prediction models. For each model, we obtain the optimal parameters through grid search and then fuse the models' contributions by linear weighting. The experiments indicate that the proposed weighted-fusion approach performs better than a single-model scheme, with a loss value of 0.4158 under the SMAPE index.
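The last two steps of the abstract, linear weighted fusion scored by SMAPE, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which SMAPE variant is used, so the 0-to-2 form mean(2|y - p| / (|y| + |p|)) is an assumption, as are the function names `smape`, `fuse`, and `best_weights`, the coarse weight scan, and the toy numbers.

```python
from itertools import product


def smape(actual, predicted):
    """Symmetric mean absolute percentage error on a 0..2 scale (assumed variant)."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        # Assumption: a term with both values equal to zero counts as a perfect hit.
        total += 0.0 if denom == 0 else 2.0 * abs(a - p) / denom
    return total / len(actual)


def fuse(predictions, weights):
    """Linear weighted fusion: per-sample weighted sum of the model outputs."""
    return [sum(w * p for w, p in zip(weights, column))
            for column in zip(*predictions)]


def best_weights(predictions, actual, step=0.1):
    """Scan non-negative weights that sum to 1 and keep the lowest SMAPE."""
    n = round(1 / step)
    best = None
    for combo in product(range(n + 1), repeat=len(predictions) - 1):
        if sum(combo) > n:
            continue  # the remaining weight would be negative
        weights = [c / n for c in combo] + [(n - sum(combo)) / n]
        score = smape(actual, fuse(predictions, weights))
        if best is None or score < best[0]:
            best = (score, weights)
    return best


# Hypothetical toy data: two models whose errors cancel when averaged.
actual = [10, 20, 30]
model_a = [12, 18, 33]
model_b = [8, 22, 27]
score, weights = best_weights([model_a, model_b], actual)
print(score, weights)
```

On this toy data each model alone has a nonzero SMAPE, while the equal-weight blend reproduces the targets exactly, which is the effect the fusion step aims for. In practice the weights would be chosen on held-out data rather than on the data used for scoring.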

Author Affiliations

North China Electric Power University, School of Control and Computer Engineering, Beijing, 102206, China. Electronic address: yingzhang@ncepu.edu.cn.
North China Electric Power University, School of Control and Computer Engineering, Beijing, 102206, China. Electronic address: 1160099123@qq.com.
North China Electric Power University, School of Control and Computer Engineering, Beijing, 102206, China. Electronic address: 1091714856@qq.com.
North China Electric Power University, School of Control and Computer Engineering, Beijing, 102206, China. Electronic address: somnusxiaojian@126.com.
North China Electric Power University, School of Control and Computer Engineering, Beijing, 102206, China. Electronic address: 478962646@qq.com.
North China Electric Power University, School of Control and Computer Engineering, Beijing, 102206, China. Electronic address: hack_haozi@163.com.
North China Electric Power University, School of Control and Computer Engineering, Beijing, 102206, China. Electronic address: 3382911357@qq.com.

Pub Type(s)

Journal Article

Language

eng

PubMed ID

31812248

Citation

Zhang, Ying, et al. "A Feature Selection and Multi-model Fusion-based Approach of Predicting Air Quality." ISA Transactions, 2019.
Zhang Y, Zhang R, Ma Q, et al. A feature selection and multi-model fusion-based approach of predicting air quality. ISA Trans. 2019.
Zhang, Y., Zhang, R., Ma, Q., Wang, Y., Wang, Q., Huang, Z., & Huang, L. (2019). A feature selection and multi-model fusion-based approach of predicting air quality. ISA Transactions, doi:10.1016/j.isatra.2019.11.023.
Zhang Y, et al. A Feature Selection and Multi-model Fusion-based Approach of Predicting Air Quality. ISA Trans. 2019 Dec 2; PubMed PMID: 31812248.
TY  - JOUR
T1  - A feature selection and multi-model fusion-based approach of predicting air quality.
AU  - Zhang, Ying
AU  - Zhang, Rongrong
AU  - Ma, Qunfei
AU  - Wang, Yanhao
AU  - Wang, Qingqing
AU  - Huang, Zihao
AU  - Huang, Linyan
Y1  - 2019/12/02/
PY  - 2019/05/05/received
PY  - 2019/11/15/revised
PY  - 2019/11/17/accepted
PY  - 2019/12/9/entrez
PY  - 2019/12/10/pubmed
PY  - 2019/12/10/medline
KW  - Air quality prediction
KW  - Feature selection
KW  - Machine learning
KW  - Model fusion
JF  - ISA transactions
JO  - ISA Trans
SN  - 1879-2022
UR  - https://www.unboundmedicine.com/medline/citation/31812248/A_feature_selection_and_multi-model_fusion-based_approach_of_predicting_air_quality
L2  - https://linkinghub.elsevier.com/retrieve/pii/S0019-0578(19)30503-8
DB  - PRIME
DP  - Unbound Medicine
ER  -