Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 9

Position Papers of the 2016 Federated Conference on Computer Science and Information Systems

Machine Vision in Food Recognition: Attempts to Enhance CBVIR Tools

DOI: http://dx.doi.org/10.15439/2016F579

Citation: Position Papers of the 2016 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 9, pages 57–61


Abstract. Visual identification of complex images (e.g., images of food) remains a challenging problem. In particular, content-based visual information retrieval (CBVIR) methods, which seem a natural choice for such tasks, are often constrained by specific characteristics of the images of interest and, possibly, by other practical requirements. In this paper, a novel CBVIR approach to automatic food identification is proposed, taking into account the characteristics of solutions currently available in this area. Motivated by the limitations of those solutions, we present a scheme in which the co-occurrence of MSER features extracted from three color channels is used to build a bag-of-words histogram. Food images are then matched by measuring similarities between those histograms. Preliminary tests on the recently published benchmark dataset UNICT-FD889 reveal certain advantages of the scheme and highlight its limitations. In particular, the need for a novel methodology for segmenting food images has been identified.
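The matching scheme outlined in the abstract (per-channel feature extraction, quantization into a bag-of-words histogram, and histogram-based similarity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: MSER detection itself is omitted, the per-channel descriptors and the visual-word codebook are assumed to be precomputed, all function names are hypothetical, and histogram intersection (in the spirit of Swain and Ballard's color indexing) stands in for whichever histogram distance the paper actually uses.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Assign each descriptor (row) to its nearest visual word (codebook row)."""
    # pairwise squared distances, shape (n_descriptors, n_words)
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def bow_histogram(channel_descriptors, codebook):
    """Build one L1-normalized bag-of-words histogram per color channel
    (e.g., descriptors of MSERs found in R, G, and B) and concatenate them."""
    k = len(codebook)
    hists = []
    for desc in channel_descriptors:
        words = quantize(desc, codebook)
        h = np.bincount(words, minlength=k).astype(float)
        hists.append(h / max(h.sum(), 1.0))  # normalize each channel histogram
    return np.concatenate(hists)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima, scaled by the smaller mass."""
    denom = max(min(h1.sum(), h2.sum()), 1e-12)
    return np.minimum(h1, h2).sum() / denom
```

Two food images would then be matched by computing `bow_histogram` for each and ranking candidates by `histogram_intersection`; any of the histogram distances surveyed by Cha and Srihari (ref. 22) could be substituted for the intersection measure.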


  1. G. M. Farinella, D. Allegra, and F. Stanco, “A benchmark dataset to study the representation of food images,” in Proc. ECCV 2014 Workshops, vol. III, 2015, pp. 584–599. [Online]. Available: http://dx.doi.org/10.1007/978-3-319-16199-0_41
  2. F. Kong and J. Tan, “Dietcam: Automatic dietary assessment with mobile camera phones,” Pervasive and Mobile Computing, vol. 8, no. 1, pp. 147–163, 2012. [Online]. Available: http://dx.doi.org/10.1016/j.pmcj.2011.07.003
  3. Y. Matsuda, H. Hoashi, and K. Yanai, “Recognition of multiple-food images by detecting candidate regions,” in Proc. IEEE Int. Conf. on Multimedia and Expo, 2012, pp. 25–30. [Online]. Available: http://dx.doi.org/10.1109/ICME.2012.157
  4. H. Hoashi, T. Joutou, and K. Yanai, “Image recognition of 85 food categories by feature fusion,” in Proc. IEEE Int. Symposium on Multimedia, 2010, pp. 296–301. [Online]. Available: http://dx.doi.org/10.1109/ISM.2010.51
  5. S. Yang, M. Chen, D. Pomerleau, and R. Sukthankar, “Food recognition using statistics of pairwise local features,” in Proc. IEEE Conf. CVPR 2010, 2010, pp. 2249–2256. [Online]. Available: http://dx.doi.org/10.1109/CVPR.2010.5539907
  6. Z. Zong, D. Nguyen, P. Ogunbona, and W. Li, “On the combination of local texture and global structure for food classification,” in Proc. IEEE Int. Symposium on Multimedia, 2010, pp. 204–211. [Online]. Available: http://dx.doi.org/10.1109/ISM.2010.37
  7. G. O’Loughlin, S. Cullen, A. McGoldrick, S. O’Connor, R. Blain, S. O’Malley, and G. Warrington, “Using a wearable camera to increase the accuracy of dietary analysis,” American Journal of Preventive Medicine, vol. 44, no. 3, pp. 297–301, 2013. [Online]. Available: http://dx.doi.org/10.1016/j.amepre.2012.11.007
  8. F. Zhu, M. Bosch, I. Woo, S. Kim, C. Boushey, D. Ebert, and E. Delp, “The use of mobile devices in aiding dietary assessment and evaluation,” Journal of Selected Topics in Signal Processing, vol. 4, no. 4, pp. 756–766, 2010. [Online]. Available: http://dx.doi.org/10.1109/JSTSP.2010.2051471
  9. M. Chen, K. Dhingra, W. Wu, L. Yang, R. Sukthankar, and J. Yang, “PFID: Pittsburgh fast-food image dataset,” in Proc. IEEE Conf. ICIP 2009, 2009, pp. 289–292. [Online]. Available: http://dx.doi.org/10.1109/ICIP.2009.5413511
  10. A. Jimenez, A. Jain, R. Ruz, and J. Rovira, “Automatic fruit recognition: a survey and new results using range/attenuation images,” Pattern Recognition, vol. 32, no. 10, pp. 1719–1739, 1999. [Online]. Available: http://dx.doi.org/10.1016/S0031-3203(98)00170-8
  11. F. Pla, “Recognition of partial circular shapes from segmented contours,” Comput. Vision & Image Understanding, vol. 63, no. 2, pp. 334–343, 1996. [Online]. Available: http://dx.doi.org/10.1006/cviu.1996.0023
  12. D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004. [Online]. Available: http://dx.doi.org/10.1023/B:VISI.0000029664.99615.94
  13. J. Sivic and A. Zisserman, “Video Google: A text retrieval approach to object matching in videos,” in Proc. 9th IEEE Conf. ICCV 2003, vol. 2, Nice, 2003, pp. 1470–1477. [Online]. Available: http://dx.doi.org/10.1109/ICCV.2003.1238663
  14. T. Ojala, M. Pietikainen, and D. Harwood, “A comparative study of texture measures with classification based on feature distributions,” Pattern Recognition, vol. 29, no. 1, pp. 51–59, 1996. [Online]. Available: http://dx.doi.org/10.1016/0031-3203(95)00067-4
  15. M. Varma and A. Zisserman, “A statistical approach to texture classification from single images,” International Journal of Computer Vision, vol. 62, no. 1-2, pp. 61–81, 2005. [Online]. Available: http://dx.doi.org/10.1007/s11263-005-4635-4
  16. X. Qi, R. Xiao, J. Guo, and L. Zhang, “Pairwise rotation invariant co-occurrence local binary pattern,” in Proc. ECCV 2012, 2012, pp. 158–171. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-33783-3_12
  17. J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust wide baseline stereo from maximally stable extremal regions,” Image and Vision Computing, vol. 22, pp. 761–767, 2004. [Online]. Available: http://dx.doi.org/10.1016/j.imavis.2004.02.006
  18. C. Schmid and R. Mohr, “Local grayvalue invariants for image retrieval,” IEEE Trans PAMI, vol. 19, no. 5, pp. 530–535, 1997. [Online]. Available: http://dx.doi.org/10.1109/34.589215
  19. Z. Wu, Q. Ke, M. Isard, and J. Sun, “Bundling features for large scale partial-duplicate web image search,” in Proc. IEEE Conf. CVPR 2009, 2009, pp. 25–32. [Online]. Available: http://dx.doi.org/10.1109/CVPR.2009.5206566
  20. A. Śluzek, “Extended keypoint description and the corresponding improvements in image retrieval,” LNCS (Revised Selected Papers of ACCV 2014 Workshops), vol. 9008, pp. 698–707, 2015. [Online]. Available: http://dx.doi.org/10.1007/978-3-319-16628-5_50
  21. R. Arandjelovic and A. Zisserman, “Three things everyone should know to improve object retrieval,” in Proc. IEEE Conf. CVPR 2012, 2012, pp. 2911–2918. [Online]. Available: http://dx.doi.org/10.1109/CVPR.2012.6248018
  22. S.-H. Cha and S. Srihari, “On measuring the distance between histograms,” Pattern Recognition, vol. 35, pp. 1355–1370, 2002. [Online]. Available: http://dx.doi.org/10.1016/S0031-3203(01)00118-2
  23. M. Swain and D. Ballard, “Color indexing,” International Journal of Computer Vision, vol. 7, no. 1, pp. 11–32, 1991. [Online]. Available: http://dx.doi.org/10.1007/BF00130487
  24. A. Śluzek and M. Paradowski, “Reinforcement of keypoint matching by co-segmentation in object retrieval: Face recognition case study,” LNCS (Proc. ICONIP 2012), vol. 7667, pp. 34–41, 2012. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-34500-5_5