Training Subset Selection for Support Vector Regression
Cenru Liu, Jiahao Cen
DOI: http://dx.doi.org/10.15439/2019F363
Citation: Proceedings of the 2019 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 18, pages 11–14 (2019)
Abstract. As more and more data become available, training a machine learning model can become computationally intractable, especially for complex models like Support Vector Regression (SVR), whose training requires solving a large quadratic programming optimization problem. Selecting a small data subset that effectively represents the characteristic features of the training data and preserves their distribution is an efficient way to address this problem. This paper proposes a systematic approach to selecting the most representative data for SVR training. The distributions of both predictor and response variables are preserved in the selected subset via a 2-layer data clustering strategy. A 2-layer step-wise greedy algorithm is introduced to select the best data points for constructing a reduced training set. The proposed method was applied to predicting decks' win rates in the Clash Royale Challenge, in which 10 subsets containing hundreds of data examples each were selected from a pool of 100k examples to train 10 SVR models, maximizing their prediction performance as evaluated by the R-squared metric. Our final submission, with an R² score of 0.225682, won 3rd place among over 1200 solutions submitted by 115 teams.
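The 2-layer strategy described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: cluster sizes, the use of k-means, and picking the points nearest each inner-cluster centroid (a simple stand-in for the paper's step-wise greedy selection) are all assumptions, and the data are synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVR
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the full training pool (the paper drew from
# ~100k Clash Royale deck examples; the sizes here are illustrative).
X = rng.normal(size=(5000, 4))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=5000)

n_outer, n_inner, per_cluster = 5, 4, 5  # hypothetical sizes

# Layer 1: cluster on the response variable, so that its distribution
# is represented in the reduced training set.
outer = KMeans(n_clusters=n_outer, n_init=10, random_state=0).fit(y.reshape(-1, 1))

subset_idx = []
for c in range(n_outer):
    members = np.flatnonzero(outer.labels_ == c)
    # Layer 2: cluster on the predictors within each response cluster.
    inner = KMeans(n_clusters=min(n_inner, len(members)), n_init=10,
                   random_state=0).fit(X[members])
    for k in range(inner.n_clusters):
        cluster = members[inner.labels_ == k]
        # Keep the points closest to the inner-cluster centroid as
        # representatives (a proxy for the paper's greedy selection).
        d = np.linalg.norm(X[cluster] - inner.cluster_centers_[k], axis=1)
        subset_idx.extend(cluster[np.argsort(d)[:per_cluster]])

subset_idx = np.array(subset_idx)

# Train SVR on the reduced subset instead of the full pool.
model = SVR().fit(X[subset_idx], y[subset_idx])
print(len(subset_idx), r2_score(y, model.predict(X)))
```

The payoff is in the quadratic-programming cost: SVR training scales poorly with the number of examples, so fitting on ~100 well-chosen points rather than 5000 is far cheaper, while the two clustering layers keep both the response and predictor distributions represented.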