Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 11

Proceedings of the 2017 Federated Conference on Computer Science and Information Systems

Robust face model based approach to head pose estimation


DOI: http://dx.doi.org/10.15439/2017F425

Citation: Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 11, pages 1291-1295 (2017)

Abstract. Head pose estimation from camera images is a computational problem relevant to much sociological, cognitive, interaction and marketing research. It is especially crucial in visual gaze estimation, whose accuracy depends not only on eye region analysis but on head pose inference as well. The presented method exploits a 3D head model to estimate the user's head pose because, in terms of performance, it outperforms popular appearance-based approaches and ensures efficient head pose analysis. The novelty of the presented approach lies in refining a default head model according to the localisation of selected facial features. The new method not only achieves very high precision (about 4°), but also iteratively improves the reference head model. The head pose estimation results were verified with a professional Vicon motion tracking system, and the accuracy of the head model refinement was verified with a high-precision Artec structured light scanner.

