Polish Information Processing Society

Annals of Computer Science and Information Systems, Volume 11

Proceedings of the 2017 Federated Conference on Computer Science and Information Systems

A view on the methodology of analysis and exploration of marketing data


DOI: http://dx.doi.org/10.15439/2017F442

Citation: Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 11, pages 1135-1143


Abstract. The paper proposes a methodology for developing a marketing decision support system using Big Data technology and data mining techniques. The approach was inspired by the CRISP-DM methodology, which is not oriented towards Big Data projects. We have therefore modified that methodology to fit the purpose and technological requirements of the project. The proposed methodology was tested during the development of the RTOM (Real Time Omnichannel Marketing) project, whose tasks focus on the analysis and exploration of large, heterogeneous data sets. The paper presents the phases of the project implementation according to the extended CRISP-DM methodology, taking into account the specifics of analyzing and exploring large real-time marketing databases. Examples of project steps are also provided to illustrate the approach.

