
Position and Communication Papers of the 16th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 26

State-of-the-Art Techniques in Artificial Intelligence for Continual Learning: A Review


DOI: http://dx.doi.org/10.15439/2021F12

Citation: Position and Communication Papers of the 16th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 26, pages 2332


Abstract. Continual learning capabilities are important for artificial neural networks (ANNs) deployed in the real world, especially as the stream of incoming data keeps growing. Continual learning remains difficult to achieve, however, because ANNs are prone to catastrophic forgetting. Solving this problem is critical so that ANNs can learn incrementally and improve once deployed in real-life situations. In this paper, we first present a taxonomy of continual learning in humans, introducing the plasticity-stability dilemma and other learning and forgetting processes in the brain. We then review the state of the art in three different approaches to continual learning that mitigate catastrophic forgetting.

