Named Entity Recognition System for the Biomedical Domain
Raghav Sharma, Deependra Singh, Raksha Sharma
DOI: http://dx.doi.org/10.15439/2022F63
Citation: Proceedings of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 30, pages 837–840 (2022)
Abstract. Recent advancements in medical science have considerably accelerated the rate at which new information is published. The MEDLINE database grows by roughly 500,000 new citations each year. As a result of this rapid growth, it is difficult to keep up with the swell of new information manually. Thus, automatic information extraction systems are needed to retrieve and organize information in the biomedical domain. Biomedical Named Entity Recognition (NER) is one such fundamental information extraction task, supporting many downstream information management goals in the biomedical domain. Due to the complex vocabulary (e.g., mRNA) and free nomenclature (e.g., IL2), identifying named entities is more challenging in the biomedical domain than in other domains and hence requires special attention. In this paper, we deploy two bi-directional encoder-based systems, viz., BioBERT and RoBERTa, to identify named entities in biomedical text. Owing to its domain-specific pre-training, BioBERT gives reasonably good performance on the biomedical NER task; however, the architecture of RoBERTa makes it more suitable for the task, and we obtain a significant improvement in F-score with RoBERTa over BioBERT. In addition, we present a comparative study of the training loss attained with the Adam and LAMB optimizers.
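To make the setup concrete, the sketch below illustrates one way to fine-tune a bi-directional encoder such as RoBERTa for token-level biomedical NER using the Hugging Face transformers library. This is a minimal illustration under stated assumptions, not the authors' implementation: the label set, model checkpoint, learning rate, and toy sentence are hypothetical, and a BioBERT checkpoint could be substituted for the model name in the same way. LAMB is not part of core PyTorch, so the example uses Adam-style optimization and only notes where a third-party LAMB implementation would plug in.

```python
# Minimal sketch (not the authors' code): fine-tuning a RoBERTa encoder for
# biomedical NER framed as token classification with Hugging Face transformers.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical BIO label scheme for a single biomedical entity type.
labels = ["O", "B-ENTITY", "I-ENTITY"]

tokenizer = AutoTokenizer.from_pretrained("roberta-base", add_prefix_space=True)
model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base", num_labels=len(labels)
)

# Adam-family optimizer; the paper also compares against LAMB, which would
# come from a third-party package (an assumption, e.g. torch_optimizer.Lamb).
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Toy sentence with word-level gold tags (1 = B-ENTITY, 0 = O).
tokens = ["IL2", "activates", "mRNA", "expression"]
word_labels = [1, 0, 1, 0]

enc = tokenizer(tokens, is_split_into_words=True, return_tensors="pt")
# Align word-level labels to subword tokens; special tokens get -100 so the
# loss ignores them. For simplicity, every subword inherits its word's label.
word_ids = enc.word_ids(batch_index=0)
token_labels = [-100 if wid is None else word_labels[wid] for wid in word_ids]
enc["labels"] = torch.tensor([token_labels])

# One illustrative training step; the reported comparison tracks this loss
# over full training runs with the two optimizers.
model.train()
loss = model(**enc).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"training loss: {loss.item():.4f}")
```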