Query Specific Focused Summarization of Biomedical Journal Articles

Akshara Rai; Suyash Sangwan; Tushar Goel; Ishan Verma; Lipika Dey

DOI: http://dx.doi.org/10.15439/2021F128

Citation: Proceedings of the 16th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 25, pages 91–100 (2021)

Abstract. During the COVID-19 pandemic, the Allen Institute for AI released a large repository of relevant literature, termed "CORD-19". Because the repository is very large and growing exponentially, users struggle to retrieve only the information they need from its documents. In this paper, we present a framework for generating focused summaries of journal articles. The summary is generated using a novel optimization mechanism designed to ensure that it contains all essential scientific content. The parameters for summarization are drawn from the variables that are used for reporting scientific studies. We have evaluated our results on the CORD-19 dataset. The approach, however, is generic.
