Mitigating the effects of non-IID data in federated learning with a self-adversarial balancing method
Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 925–930 (2023)
Abstract. Federated learning (FL) allows multiple devices to jointly train a global model without sharing local data. One of its key challenges is coping with unbalanced data. Hence, a novel technique, based on adversarial inputs and designed to handle label-skewed non-IID data, is proposed. Applying the proposed algorithm yields faster and more stable global model performance at the beginning of training. It also delivers better final accuracy and reduces the discrepancy between the performance of individual classes. Experimental results, obtained for the MNIST, EMNIST, and CIFAR-10 datasets, are reported and analyzed.
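The abstract mentions adversarial inputs as the balancing mechanism but does not spell out their construction. As background, a minimal NumPy sketch of the standard Fast Gradient Sign Method (FGSM, Goodfellow et al.) on a toy logistic-regression model illustrates the basic building block; the toy model, weights, and step size below are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

def fgsm_perturb(x, grad, epsilon=0.1):
    """FGSM: step of size epsilon in the direction of the sign of the
    loss gradient with respect to the input, to increase the loss."""
    return x + epsilon * np.sign(grad)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gradient(w, x, y):
    """Gradient of the logistic loss L(x) = -log(sigmoid(y * w.x))
    with respect to the input x (labels y in {-1, +1})."""
    return -y * sigmoid(-y * np.dot(w, x)) * w

# Toy linear classifier and a single input (illustrative values only).
w = np.array([1.0, -2.0])
x = np.array([0.5, 0.5])
y = 1.0

g = input_gradient(w, x, y)
x_adv = fgsm_perturb(x, g, epsilon=0.1)

loss_before = -np.log(sigmoid(y * np.dot(w, x)))
loss_after = -np.log(sigmoid(y * np.dot(w, x_adv)))
# The perturbed input incurs a strictly higher loss than the original.
```
Such perturbed inputs keep the original label while moving the sample toward the decision boundary, which is why they can serve as additional training material for under-represented classes.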