Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 19

Position Papers of the 2019 Federated Conference on Computer Science and Information Systems

Integrating Computer Vision and Natural Language Processing to Guide Blind Movements

DOI: http://dx.doi.org/10.15439/2019F345

Citation: Position Papers of the 2019 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 19, pages 91–96 (2019)


Abstract. Vision is the most essential human sense, and vision impairment is among the most common problems faced by the elderly. Blindness is the lack of visual perception due to physiological or neurological factors. This paper presents a detailed, systematic, and critical review of the available literature on the movements of the blind, outlines the research efforts made in this area, and proposes an integrated guidance system combining computer vision and natural language processing. A smartphone equipped with language intelligence capabilities is carried by the blind person to capture images of the surroundings; these are sent to a central server running a Faster Region-based Convolutional Neural Network (Faster R-CNN) object detection algorithm that recognizes multiple obstacles in the images. The server returns the results to the smartphone, which converts them into speech to guide the blind person.
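The client side of the pipeline described above can be sketched in a few lines: the server-side detector returns labeled bounding boxes for a captured frame, and the smartphone turns them into a sentence for a text-to-speech engine. This is a minimal illustrative sketch, not the paper's implementation; the detector here is a stub standing in for a Faster R-CNN model, and all names (`Detection`, `guidance_sentence`, the score threshold) are assumptions introduced for this example.

```python
# Sketch of turning object detections into spoken guidance.
# The Detection records stand in for Faster R-CNN server output;
# names and thresholds are illustrative, not from the paper.

from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    label: str      # object class, e.g. "person"
    x_min: float    # horizontal bounding-box extent in pixels
    x_max: float
    score: float    # detector confidence in [0, 1]

def region_of(det: Detection, frame_width: float) -> str:
    """Map the horizontal box center to a coarse direction."""
    center = (det.x_min + det.x_max) / 2.0
    if center < frame_width / 3:
        return "to your left"
    if center > 2 * frame_width / 3:
        return "to your right"
    return "ahead"

def guidance_sentence(dets: List[Detection], frame_width: float,
                      min_score: float = 0.5) -> str:
    """Build the sentence a text-to-speech engine would read aloud."""
    kept = [d for d in dets if d.score >= min_score]
    if not kept:
        return "Path appears clear."
    parts = [f"{d.label} {region_of(d, frame_width)}" for d in kept]
    return "Caution: " + ", ".join(parts) + "."

# Example: detections for a 900-pixel-wide frame; the low-confidence
# detection is filtered out before the sentence is built.
dets = [Detection("person", 50, 250, 0.9),
        Detection("chair", 400, 520, 0.8),
        Detection("dog", 600, 700, 0.3)]
print(guidance_sentence(dets, 900))
# Caution: person to your left, chair ahead.
```

In a full system, the smartphone would hand this sentence to the platform's text-to-speech service, and the detections would arrive over the network from the server-side Faster R-CNN model rather than being constructed locally.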

