
Proceedings of the 16th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 25

Enabling Autonomous Medical Image Data Annotation: A human-in-the-loop Reinforcement Learning Approach


DOI: http://dx.doi.org/10.15439/2021F86

Citation: Proceedings of the 16th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 25, pages 271–279


Abstract. We introduce a new approach based on Deep Reinforcement Learning for cost-effective annotation of medical image data. Our approach consists of a virtual agent that automatically labels training data and a human-in-the-loop who assists in training the agent. We implemented the Deep Q-Network algorithm to create the virtual agent and adopted an interactive method in which a human provides advice to the agent during training. Our approach was evaluated on a set of medical X-ray data in different use cases, where the agent was required to create new annotations in the form of bounding boxes from unlabeled data.
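The setup the abstract describes can be sketched as follows. This is a minimal, illustrative Python sketch, not the paper's implementation: it assumes a bounding-box environment where the agent's discrete actions translate or resize the box, the reward is derived from intersection-over-union (IoU) with a held-out ground-truth box, and human-in-the-loop advice, when available, overrides the agent's epsilon-greedy choice. The action set, step size, and advice hook are all assumptions made for the example.

```python
import numpy as np

# Hypothetical discrete action set for adjusting a bounding box (assumption,
# not taken from the paper).
ACTIONS = ["left", "right", "up", "down",
           "wider", "narrower", "taller", "shorter", "stop"]

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def apply_action(box, action, step=4):
    """Apply one discrete adjustment to the box; 'stop' leaves it unchanged."""
    x1, y1, x2, y2 = box
    if action == "left":       x1, x2 = x1 - step, x2 - step
    elif action == "right":    x1, x2 = x1 + step, x2 + step
    elif action == "up":       y1, y2 = y1 - step, y2 - step
    elif action == "down":     y1, y2 = y1 + step, y2 + step
    elif action == "wider":    x1, x2 = x1 - step, x2 + step
    elif action == "narrower": x1, x2 = x1 + step, x2 - step
    elif action == "taller":   y1, y2 = y1 - step, y2 + step
    elif action == "shorter":  y1, y2 = y1 + step, y2 - step
    return (x1, y1, x2, y2)

def select_action(q_values, epsilon, advice=None, rng=None):
    """Epsilon-greedy action selection; human advice, when given, overrides it.

    This is the human-in-the-loop hook: an annotator can name the next
    action, and the agent follows it instead of its own policy.
    """
    if advice is not None:
        return ACTIONS.index(advice)
    rng = rng or np.random.default_rng(0)
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(q_values))
```

In a full DQN, `q_values` would come from a neural network evaluated on image features plus the current box, and each transition (state, action, IoU-based reward, next state) would be stored in a replay buffer for training; the sketch only shows how advised and autonomous action choices can share one interface.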


  1. G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. Van Der Laak, B. Van Ginneken, and C. I. Sánchez, “A survey on deep learning in medical image analysis,” Medical image analysis, vol. 42, pp. 60–88, 2017. [Online]. Available: https://doi.org/10.1016/j.media.2017.07.005
  2. A. Esteva, K. Chou, S. Yeung, N. Naik, A. Madani, A. Mottaghi, Y. Liu, E. Topol, J. Dean, and R. Socher, “Deep learning-enabled medical computer vision,” NPJ digital medicine, vol. 4, no. 1, pp. 1–9, 2021. [Online]. Available: https://doi.org/10.1038/s41746-020-00376-2
  3. A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun, “Dermatologist-level classification of skin cancer with deep neural networks,” nature, vol. 542, no. 7639, pp. 115–118, 2017. [Online]. Available: https://doi.org/10.1038/nature21056
  4. J. Yang, J. Fan, Z. Wei, G. Li, T. Liu, and X. Du, “Cost-effective data annotation using game-based crowdsourcing,” Proceedings of the VLDB Endowment, vol. 12, no. 1, pp. 57–70, 2018. [Online]. Available: https://doi.org/10.14778/3275536.3275541
  5. R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT press, 2018.
  6. V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing atari with deep reinforcement learning,” arXiv preprint https://arxiv.org/abs/1312.5602, 2013.
  7. V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., “Human-level control through deep reinforcement learning,” nature, vol. 518, no. 7540, pp. 529–533, 2015. [Online]. Available: https://doi.org/10.1038/nature14236
  8. J. Ibarz, J. Tan, C. Finn, M. Kalakrishnan, P. Pastor, and S. Levine, “How to train your robot with deep reinforcement learning: lessons we have learned,” The International Journal of Robotics Research, vol. 40, no. 4-5, pp. 698–721, 2021. [Online]. Available: https://doi.org/10.1177/0278364920987859
  9. B. R. Kiran, I. Sobh, V. Talpaert, P. Mannion, A. A. Al Sallab, S. Yogamani, and P. Pérez, “Deep reinforcement learning for autonomous driving: A survey,” IEEE Transactions on Intelligent Transportation Systems, 2021. http://dx.doi.org/10.1109/TITS.2021.3054625
  10. T. Tajmajer, “Modular multi-objective deep reinforcement learning with decision values,” in 2018 Federated conference on computer science and information systems (FedCSIS). IEEE, 2018, pp. 85–93. [Online]. Available: http://dx.doi.org/10.15439/2018F231
  11. L. Sun and Y. Gong, “Active learning for image classification: A deep reinforcement learning approach,” in 2019 2nd China Symposium on Cognitive Computing and Hybrid Intelligence (CCHI). IEEE, 2019. http://dx.doi.org/10.1109/CCHI.2019.8901911 pp. 71–76.
  12. Z. Liu, J. Wang, S. Gong, H. Lu, and D. Tao, “Deep reinforcement active learning for human-in-the-loop person re-identification,” in Proceedings of the IEEE International Conference on Computer Vision, 2019. http://dx.doi.org/10.1109/ICCV.2019.00622 pp. 6122–6131.
  13. V. R. Saripalli, D. Pati, M. Potter, G. Avinash, and C. W. Anderson, “Ai-assisted annotator using reinforcement learning,” SN Computer Science, vol. 1, no. 6, pp. 1–8, 2020. [Online]. Available: https://doi.org/10.1007/s42979-020-00356-z
  14. J. Wang, Y. Yan, Y. Zhang, G. Cao, M. Yang, and M. K. Ng, “Deep reinforcement active learning for medical image classification,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2020, pp. 33–42. [Online]. Available: https://doi.org/10.1007/978-3-030-59710-8_4
  15. J. Shim, S. Kang, and S. Cho, “Active learning of convolutional neural network for cost-effective wafer map pattern classification,” vol. 33, no. 2. IEEE, 2020. http://dx.doi.org/10.1109/TSM.2020.2974867 pp. 258–266.
  16. F.-Q. Liu and Z.-Y. Wang, “Automatic “ground truth” annotation and industrial workpiece dataset generation for deep learning,” International Journal of Automation and Computing, pp. 1–12, 2020.
  17. H. Liang, L. Yang, H. Cheng, W. Tu, and M. Xu, “Human-in-the-loop reinforcement learning,” in 2017 Chinese Automation Congress (CAC), 2017. http://dx.doi.org/10.1109/CAC.2017.8243575 pp. 4511–4518.
  18. L. Torrey and M. Taylor, “Teaching on a budget: Agents advising agents in reinforcement learning,” in Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, 2013, pp. 1053–1060.
  19. Z. Lin, B. Harrison, A. Keech, and M. O. Riedl, “Explore, exploit or listen: Combining human feedback and policy model to speed up deep reinforcement learning in 3d worlds,” arXiv preprint https://arxiv.org/abs/1709.03969, 2017.
  20. S. Krening, “Humans teaching intelligent agents with verbal instruction,” Ph.D. dissertation, Georgia Institute of Technology, 2019.
  21. W. B. Knox and P. Stone, “Tamer: Training an agent manually via evaluative reinforcement,” in 2008 7th IEEE International Conference on Development and Learning. IEEE, 2008, pp. 292–297.
  22. R. Arakawa, S. Kobayashi, Y. Unno, Y. Tsuboi, and S.-i. Maeda, “Dqn-tamer: Human-in-the-loop reinforcement learning with intractable feedback,” arXiv preprint https://arxiv.org/abs/1810.11748, 2018.
  23. G. Li, B. He, R. Gomez, and K. Nakamura, “Interactive reinforcement learning from demonstration and human evaluative feedback,” in 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). IEEE, 2018. http://dx.doi.org/10.1109/ROMAN.2018.8525837 pp. 1156–1162.
  24. N. Navidi, “Human ai interaction loop training: New approach for interactive reinforcement learning,” arXiv preprint https://arxiv.org/abs/2003.04203, 2020.
  25. T. Mandel, Y.-E. Liu, E. Brunskill, and Z. Popovic, “Where to add actions in human-in-the-loop reinforcement learning.” in AAAI, 2017, pp. 2322–2328.
  26. N. Tajbakhsh, L. Jeyaseelan, Q. Li, J. N. Chiang, Z. Wu, and X. Ding, “Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation,” Medical Image Analysis, p. 101693, 2020. [Online]. Available: https://doi.org/10.1016/j.media.2020.101693
  27. J. C. Caicedo and S. Lazebnik, “Active object localization with deep reinforcement learning,” in Proceedings of the IEEE international conference on computer vision, 2015. http://dx.doi.org/10.1109/ICCV.2015.286 pp. 2488–2496.
  28. M. Otoofi, “Object localization using deep reinforcement learning Mohammad Otoofi,” Master’s thesis, University of Glasgow, Scotland, 2018.
  29. D. L. Poole and A. K. Mackworth, Artificial Intelligence: foundations of computational agents. Cambridge University Press, 2010.
  30. R. Padilla, S. L. Netto, and E. A. da Silva, “A survey on performance metrics for object-detection algorithms,” in 2020 International Conference on Systems, Signals and Image Processing (IWSSIP). IEEE, 2020. http://dx.doi.org/10.1109/IWSSIP48289.2020.9145130 pp. 237–242.
  31. H. Amin and W. J. Siddiqui, “Cardiomegaly,” StatPearls [internet], 2020.
  32. K. Monowar, “National institutes of health chest x-ray dataset,” May 2020. [Online]. Available: https://www.kaggle.com/khanfashee/nih-chest-x-ray-14-224x224-resized
  33. C. Semsarian, J. Ingles, M. S. Maron, and B. J. Maron, “New perspectives on the prevalence of hypertrophic cardiomyopathy,” Journal of the American College of Cardiology, vol. 65, no. 12, pp. 1249–1254, 2015. http://dx.doi.org/10.1016/j.jacc.2015.01.019
  34. B. J. Maron, J. M. Gardin, J. M. Flack, S. S. Gidding, T. T. Kurosaki, and D. E. Bild, “Prevalence of hypertrophic cardiomyopathy in a general population of young adults: echocardiographic analysis of 4111 subjects in the cardia study,” Circulation, vol. 92, no. 4, pp. 785–789, 1995. http://dx.doi.org/10.1161/01.cir.92.4.785
  35. M. L. Kwan, L. H. Kushi, E. Weltzien, B. Maring, S. E. Kutner, R. S. Fulton, M. M. Lee, C. B. Ambrosone, and B. J. Caan, “Epidemiology of breast cancer subtypes in two prospective cohort studies of breast cancer survivors,” Breast Cancer Research, vol. 11, no. 3, p. R31, 2009. http://dx.doi.org/10.1186/bcr2261
  36. M. Moghbel, C. Y. Ooi, N. Ismail, Y. W. Hau, and N. Memari, “A review of breast boundary and pectoral muscle segmentation methods in computer-aided detection/diagnosis of breast mammography,” Artificial Intelligence Review, pp. 1–46, 2019. [Online]. Available: https://doi.org/10.1007/s10462-019-09721-8
  37. V. Gupta, C. Taylor, S. Bonnet, L. M. Prevedello, J. Hawley, R. D. White, M. G. Flores, and B. S. Erdal, “Deep learning-based automatic detection of poorly positioned mammograms to minimize patient return visits for repeat imaging: A real-world application,” arXiv preprint https://arxiv.org/abs/2009.13580, 2020.