## Short-term air pollution forecasting based on environmental factors and deep learning models

### Mirche Arsov, Eftim Zdravevski, Petre Lameski, Roberto Corizzo, Nikola Koteli, Kosta Mitreski, Vladimir Trajkovik

DOI: http://dx.doi.org/10.15439/2020F211

Citation: Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 21, pages 15–22 (2020)

Abstract. The effects of air pollution on people, the environment, and the global economy are profound and often under-recognized. Air pollution is becoming a global problem. Urban areas have dense populations and a high concentration of emission sources: vehicles, buildings, industrial activity, waste, and wastewater. Tackling air pollution is an urgent problem in developing countries, such as North Macedonia, especially in larger urban areas. This paper uses Recurrent Neural Network (RNN) models with Long Short-Term Memory (LSTM) units to predict the level of PM10 particles in the near future (+3 hours), as measured by sensors deployed at different locations in the city of Skopje. Historical air quality measurements were used to train the models. To capture the relation between air pollution and seasonal changes in meteorological conditions, we added temperature and humidity data to improve performance. The accuracy of the models is compared to a PM10 concentration forecast produced by an Autoregressive Integrated Moving Average (ARIMA) model. The results show that specific deep learning models consistently outperform the ARIMA model, particularly when meteorological and air pollution historical data are combined. With a mean squared error (MSE) of only 0.01, the proposed models give predictions reliable enough that they could facilitate preemptive actions to reduce air pollution, such as temporarily shutting down main polluters, or issuing warnings so that citizens can move to a safer environment and minimize exposure.
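The forecasting setup described in the abstract (past PM10, temperature, and humidity readings used to predict PM10 three hours ahead, scored by MSE) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `make_windows` and `mse`, the 24-hour lookback, the hourly sampling, and the synthetic data are all assumptions made for illustration. The sketch only builds the supervised windows that an LSTM or other sequence model could consume and scores a naive persistence baseline; it does not train a model.

```python
import numpy as np


def make_windows(series, lookback, horizon, target_col=0):
    """Turn a (T, F) multivariate series into supervised samples.

    X[i] holds `lookback` consecutive rows; y[i] is the target column
    `horizon` steps after the last row of X[i].
    """
    series = np.asarray(series, dtype=float)
    n = series.shape[0] - lookback - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for this lookback and horizon")
    X = np.stack([series[i:i + lookback] for i in range(n)])
    first_target = lookback + horizon - 1
    y = series[first_target:first_target + n, target_col]
    return X, y


def mse(y_true, y_pred):
    """Mean squared error between two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic hourly data (assumption): columns are PM10, temperature, humidity.
rng = np.random.default_rng(0)
hours = np.arange(500)
pm10 = 50 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, hours.size)
temperature = 10 + 8 * np.sin(2 * np.pi * hours / 24 + 1.0)
humidity = 60 - 15 * np.sin(2 * np.pi * hours / 24 + 1.0)
data = np.column_stack([pm10, temperature, humidity])

# 24 past hours as input (assumed lookback), PM10 at +3 hours as target.
X, y = make_windows(data, lookback=24, horizon=3)

# Persistence baseline: predict that PM10 in 3 hours equals the latest reading.
baseline_pred = X[:, -1, 0]
print("samples:", X.shape, "baseline MSE:", mse(y, baseline_pred))
```

The resulting `X` has shape `(samples, lookback, features)`, which is the layout that recurrent layers in common deep learning libraries typically expect. Any learned model would need to beat the persistence baseline on held-out data to be useful.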
