Evaluation of Neural Network Transformer Models for Named-Entity Recognition on Low-Resourced Languages
Ridewaan Hanslo
DOI: http://dx.doi.org/10.15439/2021F7
Citation: Proceedings of the 16th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 25, pages 115–119 (2021)
Abstract. In this paper, transformer models are used to evaluate ten low-resourced South African languages for named-entity recognition (NER). These transformer models are further compared against bi-LSTM-aux and CRF models. The transformer models achieved the highest F-score, at 84%. This result is significant within the context of the study, as previous research was unable to achieve F-scores of 80%. However, the CRF and bi-LSTM-aux models remain top performers at sequence tagging. The findings indicate that transformer models are a viable choice for low-resourced languages. Future research could improve upon these results by implementing a linear-complexity recurrent transformer variant.
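The abstract reports entity-level F-scores but does not spell out how they are computed. Below is a minimal, illustrative sketch of micro-averaged entity-level precision, recall, and F-score over BIO-tagged sequences; the strict-BIO span matching, the micro-averaging, and the function names are assumptions for illustration, not details taken from the paper.

```python
# Hedged sketch: entity-level precision/recall/F-score for BIO-tagged NER
# output. Strict-BIO span extraction and micro-averaging are assumptions;
# the paper's exact evaluation settings are not stated in the abstract.

def extract_entities(tags):
    """Return (start, end, type) spans from a strict BIO tag sequence."""
    entities, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):   # trailing "O" flushes the last span
        inside_same = tag.startswith("I-") and etype == tag[2:]
        if not inside_same:                  # current span (if any) ends here
            if start is not None:
                entities.add((start, i, etype))
            # only B- opens a new span; stray I- tags are ignored (strict BIO)
            start, etype = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return entities

def micro_f_score(gold_seqs, pred_seqs):
    """Micro-averaged entity-level precision, recall, and F-score."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)                     # exact span-and-type matches
        fp += len(p - g)
        fn += len(g - p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy example: one PER entity found, one LOC entity missed.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
print(micro_f_score(gold, pred))  # (1.0, 0.5, 0.666...)
```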
References
- M. Loubser and M. J. Puttkammer, “Viability of neural networks for core technologies for resource-scarce languages”. Information (Switzerland), 11(1), 2020. https://doi.org/10.3390/info11010041
- A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, and V. Stoyanov, “Unsupervised Cross-lingual Representation Learning at Scale”. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, 2020. https://doi.org/10.18653/v1/2020.acl-main.747
- B. Plank, A. Søgaard, and Y. Goldberg, “Multilingual part-of-speech tagging with bidirectional long short-term memory models and auxiliary loss”. 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Short Papers, 2016. https://doi.org/10.18653/v1/p16-2067
- G. Lample, M. Ballesteros, S. Subramanian, K. Kawakami, and C. Dyer, “Neural architectures for named entity recognition”. 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference, 2016. https://doi.org/10.18653/v1/n16-1030
- J. Lafferty, A. McCallum, and F. C. N. Pereira, “Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data”. ICML ’01: Proceedings of the Eighteenth International Conference on Machine Learning, 2001.
- E. D. Liddy, “Natural Language Processing”. In Encyclopedia of Library and Information Science, 2001.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need”. Advances in Neural Information Processing Systems, 2017.
- M. A. Hedderich, D. Adelani, D. Zhu, J. Alabi, U. Markus, and D. Klakow, “Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages”. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, 2020. https://doi.org/10.18653/v1/2020.emnlp-main.204
- R. Eiselen, “Government domain named entity recognition for South African languages”. Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016, 2016.
- T. Pires, E. Schlinger, and D. Garrette, “How multilingual is multilingual BERT?” 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Proceedings of the Conference, 2019. https://doi.org/10.18653/v1/p19-1493
- A. Conneau and G. Lample, “Cross-lingual language model pretraining”. Advances in Neural Information Processing Systems, 2019.
- M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks”. Information Processing and Management, 45(4), 2009. https://doi.org/10.1016/j.ipm.2009.03.002