Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 18

Proceedings of the 2019 Federated Conference on Computer Science and Information Systems

Efficient Support Vector Regression with Reduced Training Data


DOI: http://dx.doi.org/10.15439/2019F362

Citation: Proceedings of the 2019 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 18, pages 1518.


Abstract. Support Vector Regression (SVR) is a supervised machine learning algorithm that has gained popularity in various fields. However, the quadratic complexity of SVR in the number of training examples prevents its use in many practical applications with large training datasets. This paper explores efficient ways to maximize the prediction accuracy of SVR with a minimal number of training examples. For this purpose, a clustered greedy strategy and a Genetic Algorithm (GA) based approach are proposed for optimal subset selection. The performance of the developed methods is illustrated in the context of the Clash Royale Challenge 2019, concerned with predicting the win rates of card decks. A training dataset of 100,000 examples was reduced to a few hundred, which were fed to SVR training to maximize model prediction performance as measured by the validation R2 score. Our approach achieved the second highest score among more than one hundred participating teams in this challenge.
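As a rough illustration of the clustered greedy strategy described in the abstract, the sketch below clusters the training data with k-means, seeds a small subset with one representative per cluster, and then greedily keeps only those additions that improve the validation R2 score. This is a minimal sketch assuming a scikit-learn-style SVR; the hyperparameters, cluster count, and example budget are illustrative placeholders, not the authors' exact settings. The GA-based variant would optimize the same fitness (validation R2) over bit-mask encodings of candidate subsets.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVR
    from sklearn.metrics import r2_score

    def fit_score(X, y, idx, X_val, y_val):
        """Train an SVR on the rows in `idx` and return its validation R2."""
        # Hyperparameters are placeholders, not the authors' tuned values.
        model = SVR(kernel="rbf", C=1.0, epsilon=0.01)
        model.fit(X[idx], y[idx])
        return r2_score(y_val, model.predict(X_val))

    def clustered_greedy_selection(X, y, X_val, y_val,
                                   n_clusters=50, budget=600):
        """Greedily grow a small training subset, drawing candidates
        cluster by cluster so diverse regions of the input space are
        sampled first; keep an addition only if it raises validation R2."""
        labels = KMeans(n_clusters=n_clusters, random_state=0).fit_predict(X)
        pools = [list(np.where(labels == c)[0]) for c in range(n_clusters)]

        # Seed with one representative per cluster so the first model
        # already covers the whole input space coarsely.
        selected = [pool.pop(0) for pool in pools if pool]
        best = fit_score(X, y, selected, X_val, y_val)

        while len(selected) < budget and any(pools):
            for pool in pools:
                if not pool or len(selected) >= budget:
                    continue
                cand = pool.pop(0)
                score = fit_score(X, y, selected + [cand], X_val, y_val)
                if score > best:  # keep only improving additions
                    best = score
                    selected.append(cand)
        return selected, best

Because each candidate evaluation retrains an SVR on a subset of at most a few hundred examples, the quadratic training cost applies only to the small subset rather than the full 100,000-example dataset, which is the point of the reduction.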
