
Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

An Enhancement of Reinforcement Learning by Scheduling with Learning Effects

DOI: http://dx.doi.org/10.15439/2023F4564

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 689–697 (2023)


Abstract. This paper presents results which reveal that approaches developed for scheduling problems with learning effects can be successfully used to improve the quality of machine learning methods. This is illustrated by modelling some aspects of Q-learning agents as scheduling problems with a learning effect, and by constructing sequencing and dispatching algorithms that take the existence of learning into account. Applying them to determine the sequence of tasks processed by Q-learning agents can visibly speed up their convergence to an optimal strategy. Furthermore, we show that dispatching tasks according to the longest processing time (LPT) algorithm for parallel computing can be replaced by a more efficient procedure if agents can learn. The numerical analysis reveals that our approach is efficient, robust, and only marginally dependent on the learning model and on an accurate approximation of task processing times.
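The dispatching idea summarized in the abstract can be sketched as follows. On identical parallel machines with a position-based learning effect (under the common model in the scheduling-with-learning literature, the r-th job processed by a machine takes p · r^a time units, with a ≤ 0), the classical LPT rule loses its edge. The instance, the simple SPT-based alternative, and all function names below are illustrative assumptions, not the authors' exact procedure:

```python
def dispatch_makespan(times, machines, a, descending):
    """Greedily dispatch jobs (sorted in the given order) to the
    earliest-free machine; return the makespan when the r-th job on a
    machine takes p * r**a time units (a <= 0 models learning)."""
    loads = [0.0] * machines   # completion time accumulated on each machine
    counts = [0] * machines    # number of jobs already placed on each machine
    for p in sorted(times, reverse=descending):
        i = min(range(machines), key=lambda m: loads[m])  # earliest-free machine
        counts[i] += 1
        loads[i] += p * counts[i] ** a   # position-based learning effect
    return max(loads)

times = [9, 8, 7, 5, 4, 3, 2, 2]

# Without learning (a = 0), LPT order (descending) gives the smaller makespan...
assert dispatch_makespan(times, 2, 0.0, True) < dispatch_makespan(times, 2, 0.0, False)
# ...but with learning (a < 0), feeding long jobs into later, "learned"
# positions (ascending, SPT-like order) wins on this instance.
assert dispatch_makespan(times, 2, -0.3, False) < dispatch_makespan(times, 2, -0.3, True)
```

The assertions illustrate, on one small instance, why a learning-aware dispatch can replace LPT when agents improve with experience: late positions become cheap, so long jobs should be scheduled late rather than early.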
