Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 21

Proceedings of the 2020 Federated Conference on Computer Science and Information Systems

A Framework for Time Series Preprocessing and History-based Forecasting Method Recommendation


DOI: http://dx.doi.org/10.15439/2020F101

Citation: Proceedings of the 2020 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 21, pages 141-144


Abstract. The complexity of managing the capacities of large IT infrastructures is constantly increasing as more network devices are connected. This task can no longer be performed manually, so the system must be monitored at runtime and future conditions must be estimated automatically. However, since relying on a single forecasting method typically yields poor results, this paper presents a framework for forecasting univariate network device workload traces using multiple forecasting methods. First, the time series are preprocessed by imputing missing data and removing anomalies. Then, different features are derived from the univariate time series, depending on the type of forecasting method. In addition, a recommendation approach is proposed that selects, for each time series, the most suitable forecasting method from this set of algorithms based only on the historical values of that series. For this purpose, the performance of each forecasting method is approximated on the historical data of the time series under consideration. The framework was used in the FedCSIS 2020 Challenge, where it shows good forecasting quality with an average $R^2$ score of 0.2575 on the small test data set.
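The two ideas named in the abstract, imputing missing values and recommending a method from a series' own history, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the imputation scheme or the forecasting methods, so linear interpolation and the three naive forecasters (`last_value`, `mean_value`, `seasonal_naive`) are stand-ins chosen for illustration. All function names and the 24-step period are likewise assumptions. Each candidate's performance is approximated by forecasting a holdout window cut from the end of the history, and the candidate with the highest $R^2$ is recommended.

```python
from statistics import mean


def interpolate_missing(series):
    """Fill None gaps by linear interpolation (assumed scheme, not from the paper).
    Gaps at the edges copy the nearest observed value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mu = mean(actual)
    ss_tot = sum((a - mu) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    if ss_tot == 0:
        # Constant target: treat a perfect fit as 1.0, anything else as 0.0.
        return 1.0 if ss_res == 0 else 0.0
    return 1.0 - ss_res / ss_tot


# Hypothetical stand-in forecasters; each maps (train, horizon) -> forecast list.
def last_value(train, horizon):
    return [train[-1]] * horizon


def mean_value(train, horizon):
    return [mean(train)] * horizon


def seasonal_naive(train, horizon, period=24):
    # Repeat the most recent full period (the period length is an assumption).
    season = train[-period:]
    return [season[i % len(season)] for i in range(horizon)]


def recommend_method(history, horizon, candidates):
    """Score every candidate on a holdout cut from the end of the history
    and return the name of the best one together with all scores."""
    train, holdout = history[:-horizon], history[-horizon:]
    scores = {name: r2_score(holdout, forecast(train, horizon))
              for name, forecast in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Usage on a synthetic trace with a 24-step cycle and two missing readings.
raw = [i % 24 for i in range(96)]
raw[5] = None
raw[30] = None
history = interpolate_missing(raw)
candidates = {
    "last_value": last_value,
    "mean_value": mean_value,
    "seasonal_naive": seasonal_naive,
}
best, scores = recommend_method(history, horizon=24, candidates=candidates)
print(best, scores)
```

Because the holdout comes from the series itself, the recommendation needs no information beyond the historical values, which matches the constraint stated in the abstract. A real deployment would presumably replace the naive candidates with the framework's actual forecasting methods and add an anomaly removal step before scoring.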
