Forming Classifier Ensembles with Deterministic Feature Subspaces
Michał Koziarski, Bartosz Krawczyk, Michał Woźniak
DOI: http://dx.doi.org/10.15439/2016F552
Citation: Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 8, pages 89–95 (2016)
Abstract. Ensemble learning is considered one of the most well-established and efficient techniques in contemporary machine learning. The key to the satisfactory performance of such combined models lies in the supplied base learners and the selected combination strategy. In this paper we focus on the former issue. Having classifiers that are of high individual quality and complementary to each other is a desirable property. Among the several ways to ensure diversity, feature space division deserves attention. The most popular method employed here is the Random Subspace approach. However, due to its random nature, this approach cannot be considered stable or suitable for real-life applications. Therefore, we propose a new approach called Deterministic Subspace that constructs feature subspaces in a guided, repeatable manner. We present a general framework and three dedicated measures that can be used for selecting diverse and uncorrelated features for each base learner. In this way we always obtain identical feature sets, leading to the creation of stable ensembles. An experimental study, backed by statistical analysis, proves the usefulness of our method in comparison to the popular randomized solution.
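To illustrate the general idea of building feature subspaces deterministically rather than by random sampling, the sketch below shows one plausible greedy construction: each subspace repeatedly adds the feature with the best trade-off between relevance (mutual information with the class label) and redundancy (correlation with features already selected), with a small penalty for features reused across subspaces. The scoring function, the penalty weights, and the helper name deterministic_subspaces are illustrative assumptions and do not reproduce the paper's three dedicated measures; they only demonstrate how a guided, repeatable subspace division differs from the Random Subspace method.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif


def deterministic_subspaces(X, y, n_subspaces, subspace_size, redundancy_weight=0.5):
    """Greedily build feature subspaces that are identical across runs.

    Each subspace picks, one feature at a time, the feature maximising
    relevance minus a redundancy penalty; a usage penalty spreads
    features across subspaces to keep base learners complementary.
    """
    # Per-feature relevance; fixed random_state keeps the estimate deterministic.
    relevance = mutual_info_classif(X, y, random_state=0)
    # Absolute feature-feature correlation used as a redundancy measure.
    corr = np.abs(np.corrcoef(X, rowvar=False))
    n_features = X.shape[1]
    usage = np.zeros(n_features)  # how many subspaces already contain each feature
    subspaces = []

    for _ in range(n_subspaces):
        chosen = []
        for _ in range(subspace_size):
            best_f, best_score = None, -np.inf
            for f in range(n_features):
                if f in chosen:
                    continue
                redundancy = corr[f, chosen].mean() if chosen else 0.0
                score = relevance[f] - redundancy_weight * redundancy - 0.1 * usage[f]
                if score > best_score:
                    best_f, best_score = f, score
            chosen.append(best_f)
            usage[best_f] += 1
        subspaces.append(sorted(chosen))
    return subspaces


if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    for i, subspace in enumerate(deterministic_subspaces(X, y, n_subspaces=5, subspace_size=8)):
        print(f"subspace {i}: {subspace}")
```

Because every step is a deterministic argmax over fixed scores, rerunning the procedure on the same data always yields the same subspaces, which is the stability property the randomized approach lacks.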