A new way for the exploration of a dataset based on a social choice inspired approach

Michel Herbin, Amine Aït Younes, Frédéric Blanchard

DOI: http://dx.doi.org/10.15439/2016F453

Citation: Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 8, pages 41–46 (2016)


Abstract. Exploring a data set consists in grouping similar data. Classical statistical methods often fail when not even a minimal assumption can be made about the clusters. Our approach is based on links between data, but the pairwise comparison of data and the importance of the links depend heavily on the context in which the data lie. We propose to analyze a dataset with methods from social choice theory, in which each datum plays both the role of a candidate and the role of a voter. The candidates are ranked by the voters, and each voter gives a score to each candidate according to its position in that voter's ranking. We propose one specific election for each voter, based on that voter's preferences. The voters in these elections have weights computed on the basis of the similarity of behavior between voters. In this approach, conventional similarity indices between data are used to define the electoral behavior of each datum.
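The voting scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: preferences come from a plain distance matrix (closer means preferred), scores follow a Borda-style count, similarity of behavior between two voters is measured by the Spearman rank correlation of their rankings, and negative correlations are clipped to a weight of zero. The function names (`rank_candidates`, `borda_scores`, `spearman`, `weighted_elections`) are hypothetical.

```python
def rank_candidates(dist_row):
    """Return ranks where ranks[j] is the position of candidate j
    in the voter's preference order (0 = closest = most preferred)."""
    order = sorted(range(len(dist_row)), key=lambda j: dist_row[j])
    ranks = [0] * len(dist_row)
    for position, j in enumerate(order):
        ranks[j] = position
    return ranks


def borda_scores(ranks):
    """Borda-style score (assumed here): n-1 points for the top rank, 0 for the last."""
    n = len(ranks)
    return [n - 1 - r for r in ranks]


def spearman(ranks_a, ranks_b):
    """Spearman rank correlation for two tie-free rankings of the same n >= 2 items."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def weighted_elections(dist):
    """Run one election per voter i. In election i, voter v is weighted by
    the similarity of its ranking to voter i's ranking (assumption: Spearman
    correlation, clipped at 0). Returns results[i][c], the weighted total
    score of candidate c in the election of voter i."""
    n = len(dist)
    ranks = [rank_candidates(row) for row in dist]
    scores = [borda_scores(r) for r in ranks]
    results = []
    for i in range(n):
        weights = [max(0.0, spearman(ranks[i], ranks[v])) for v in range(n)]
        totals = [sum(weights[v] * scores[v][c] for v in range(n))
                  for c in range(n)]
        results.append(totals)
    return results


# Toy data: four points on a line forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
results = weighted_elections(dist)

# The two best-scored candidates in each voter's own election.
top2 = [set(sorted(range(len(points)), key=lambda c: -totals[c])[:2])
        for totals in results]
```

On this toy data, the election of each point favors the points of its own group, which is the kind of context-dependent link the abstract describes. Any similarity index could replace the absolute difference used to build `dist`.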
