Goliath, a Programming Exercises Generator Supported by AI
Tiago Carvalho Freitas, Alvaro Costa Neto, Maria João Varanda Pereira, Pedro Rangel Henriques
DOI: http://dx.doi.org/10.15439/2024F8479
Citation: Proceedings of the 19th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 39, pages 331–342 (2024)
Abstract. The teaching-learning process is complex in nature, requiring many tasks and skills to achieve success in the construction of knowledge. Teaching and learning Computer Programming, like any other kind of cognitive development, is no different in this regard: tasks must be executed, sometimes repeatedly, and skills must be developed. Whatever the approach or methodology, exercising what has been studied has proven effective in virtually every teaching-learning process. Many tools have been developed over time to aid in this important task, sometimes approaching the problem from the students' perspective, sometimes from the teachers'. This paper presents Goliath, a semi-automatic generator of Computer Programming exercises based on Artificial Intelligence (AI) models, a Domain-Specific Language (DSL), and an online application that binds them together. Goliath is directed towards teachers (and, indirectly, students), aiming to lower the burden of repeatedly constructing exercises: templates allow variations of an exercise to be created instantly while relying on a common foundation. Goliath is meant to be a facilitator, increasing the availability of exercise lists while avoiding the repetition and common mistakes that accompany their construction.
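The template mechanism described in the abstract can be pictured with a minimal sketch: a template fixes the common foundation of an exercise statement, while placeholder pools supply the values that differ between variants. All names, the placeholder syntax, and the pools below are illustrative assumptions, not Goliath's actual DSL or implementation.

```python
import random
from string import Template

# Hypothetical sketch of template-based exercise variation: the template
# fixes the exercise structure (the "common foundation"), and placeholder
# pools supply the values that change between variants. This is NOT
# Goliath's DSL, only a minimal stand-in for the underlying idea.

EXERCISE_TEMPLATE = Template(
    "Write a $language function named '$name' that receives a list of "
    "$item_type values and returns the $aggregate of its elements."
)

PLACEHOLDER_POOLS = {
    "language": ["Python", "C", "Java"],
    "name": ["process", "compute", "summarize"],
    "item_type": ["integer", "floating-point"],
    "aggregate": ["sum", "maximum", "average"],
}

def generate_variants(template, pools, n, seed=0):
    """Instantiate n distinct exercise statements from one template."""
    rng = random.Random(seed)  # fixed seed => reproducible exercise lists
    variants = set()
    while len(variants) < n:
        filling = {key: rng.choice(values) for key, values in pools.items()}
        variants.add(template.substitute(filling))
    return sorted(variants)

if __name__ == "__main__":
    for statement in generate_variants(EXERCISE_TEMPLATE, PLACEHOLDER_POOLS, 3):
        print("-", statement)
```

Even this toy version shows why variations are cheap once the template exists: each variant shares the validated statement structure, so only the filled-in values can differ, avoiding the inconsistencies that creep in when similar exercises are written by hand.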