Stereotype-aware collaborative filtering

Gabriel Frisch; Jean-Benoist Leger; Yves Grandvalet

DOI: http://dx.doi.org/10.15439/2021F117

Citation: Proceedings of the 16th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 25, pages 69–79 (2021)

Abstract. In collaborative filtering, recommendations are made using user feedback on a few products. In this paper, we show that even if sensitive attributes are not used to fit the models, a disparate impact may nevertheless affect recommendations. We propose a definition of fairness for the recommender system that requires the ranking of items to be independent of the sensitive attribute. We design a co-clustering of users and items that processes exogenous sensitive attributes to remove their influence and return fair recommendations. We prove that our model ensures approximately fair recommendations provided that the classification of users approximately respects statistical parity.
