
Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

Adding Linguistic Information to Transformer Models Improves Biomedical Event Detection?

DOI: http://dx.doi.org/10.15439/2023F2076

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 1211–1216 (2023)

Abstract. Biomedical event detection is an essential subtask of event extraction that identifies and classifies event triggers, indicating the possible construction of events. In this work we compare BERT and four of its variants on biomedical event detection, evaluating and analyzing the differences in their performance. The models are trained on seven manually annotated corpora covering different biomedical subdomains and fine-tuned by adding either a simple linear layer or a Bi-LSTM layer on top of each model. The evaluation compares the behavior of the original models with and without the addition of a lexical and a syntactic feature. SciBERT emerged as the best-performing model when fine-tuned with a Bi-LSTM layer and without any extra feature. This result suggests that a transformer model pretrained from scratch on both biomedical and general-domain data can detect event triggers across different biomedical subdomains.
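The fine-tuning setup described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example (not the authors' implementation) of the Bi-LSTM variant: a pretrained encoder, loaded here from the Hugging Face checkpoint allenai/scibert_scivocab_uncased, feeds its contextual token embeddings through a bidirectional LSTM and a linear classifier that scores each token against the trigger-type labels. The hidden size and number of labels are illustrative placeholders.

    import torch
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    class TriggerDetector(nn.Module):
        """Transformer encoder + Bi-LSTM + linear head for token-level
        trigger classification (a sketch of the setup in the abstract)."""
        def __init__(self, model_name="allenai/scibert_scivocab_uncased",
                     lstm_hidden=256, num_labels=10):  # both values are placeholders
            super().__init__()
            self.encoder = AutoModel.from_pretrained(model_name)
            hidden = self.encoder.config.hidden_size  # 768 for BERT-base-sized models
            # Bi-LSTM over the contextual token embeddings
            self.lstm = nn.LSTM(hidden, lstm_hidden, batch_first=True,
                                bidirectional=True)
            # Linear layer maps each token to a score per trigger type
            self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

        def forward(self, input_ids, attention_mask):
            out = self.encoder(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
            seq, _ = self.lstm(out)      # (batch, seq_len, 2 * lstm_hidden)
            return self.classifier(seq)  # (batch, seq_len, num_labels)

    tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
    batch = tokenizer(["NF-kB activates transcription of target genes."],
                      return_tensors="pt")
    model = TriggerDetector()
    with torch.no_grad():
        logits = model(batch["input_ids"], batch["attention_mask"])
    print(logits.shape)  # (1, seq_len, num_labels)

The linear-layer variant would simply replace the Bi-LSTM with the classifier applied directly to the encoder output; swapping model_name for another checkpoint reproduces the comparison across BERT variants.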
