Integrating Computer Vision and Natural Language Processing to Guide Blind Movements

Lenard Nkalubo

DOI: http://dx.doi.org/10.15439/2019F345

Citation: Position Papers of the 2019 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 19, pages 91–96 (2019)

Abstract. Vision is the most essential sense for humans. Vision impairment is one of the most common problems faced by the elderly. Blindness is the lack of visual perception due to physiological or neurological factors. This paper presents a detailed, systematic, and critical review of the available literature, outlines the research efforts that have been made in relation to the movements of blind people, and proposes an integrated guidance system that combines computer vision and natural language processing. An advanced smartphone equipped with language intelligence capabilities is attached to the blind person to capture images of the surroundings and is connected to a central server running a Faster Region-based Convolutional Neural Network (Faster R-CNN) image detection algorithm to recognize images and multiple obstacles. The server sends the results back to the smartphone, where they are converted into speech for the blind person's guidance.
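As an illustration of the last step of the proposed pipeline, the sketch below shows one way the detection results returned by the server might be turned into a short sentence before being passed to a text-to-speech engine. The abstract does not specify the message format, so everything here is an assumption made for illustration: the `Detection` record, the `direction` and `guidance_message` helpers, the three-way left/ahead/right split of the image, and the 0.5 confidence threshold are all hypothetical and are not taken from the paper. The detector and the speech synthesis themselves are left out.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical record for one detected object (not from the paper)."""
    label: str    # object class name reported by the detector
    score: float  # detector confidence in [0, 1]
    x_min: float  # left edge of the bounding box, in pixels
    x_max: float  # right edge of the bounding box, in pixels


def direction(det: Detection, image_width: float) -> str:
    """Map the horizontal centre of the box to a coarse direction.

    Assumption: the image is split into three equal vertical bands.
    """
    centre = (det.x_min + det.x_max) / 2.0
    if centre < image_width / 3.0:
        return "on your left"
    if centre > 2.0 * image_width / 3.0:
        return "on your right"
    return "ahead"


def guidance_message(detections, image_width, min_score=0.5):
    """Build the sentence that would be handed to a text-to-speech engine."""
    # Drop low-confidence detections (threshold is an assumed value).
    kept = [d for d in detections if d.score >= min_score]
    if not kept:
        return "No obstacles detected."
    # Announce the most confident detections first.
    kept.sort(key=lambda d: d.score, reverse=True)
    parts = [f"{d.label} {direction(d, image_width)}" for d in kept]
    return "Caution: " + ", ".join(parts) + "."


if __name__ == "__main__":
    sample = [
        Detection("chair", 0.9, 10, 100),
        Detection("person", 0.8, 300, 340),
        Detection("dog", 0.3, 500, 600),  # below threshold, ignored
    ]
    print(guidance_message(sample, image_width=640))
```

Keeping the message builder separate from detection and speech output means the wording could be adjusted, for example to add distance estimates, without touching the server-side model.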
