Abbreviation Disambiguation in Polish Press News Using Encoder-Decoder Models

Krzysztof Wróbel; Jakub Karbowski; Paweł Lewkowicz

DOI: http://dx.doi.org/10.15439/2023F839

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 1255–1264 (2023)

Abstract. The disambiguation of abbreviations and acronyms is a longstanding problem in Natural Language Processing (NLP) that has received significant attention from researchers. Previous approaches have employed statistical methods, semantic similarity metrics, and machine learning algorithms. Various languages and document types have been explored, with English the most commonly studied language. This paper presents a comprehensive review of research on abbreviation disambiguation, covering the languages, document types, and methodologies involved. Recent studies have demonstrated the effectiveness of pre-trained encoder-decoder models, and the emergence of multilingual models has made it possible to tackle abbreviation disambiguation across multiple languages. Standardization, along with disambiguation that works across multiple languages and document types, remains an open goal in NLP. The paper summarizes current state-of-the-art approaches and identifies future research directions. The methods are evaluated in the PolEval abbreviation disambiguation competition, where the authors achieved the top ranking.
