An Efficient Application of Neuroevolution for Competitive Multiagent Learning
DOI: https://doi.org/10.14738/tmlai.93.10149

Keywords: Genetic Algorithm, NeuroEvolution, Neural Networks, Reinforcement Learning, Multiagent Environment

Abstract
Multiagent systems provide an ideal environment for evaluating and analysing real-world problems with reinforcement learning algorithms. Most traditional approaches to multiagent learning suffer from long training periods and high computational complexity. NEAT (NeuroEvolution of Augmenting Topologies) is a popular evolutionary algorithm that searches for the best-performing neural network architecture and is often applied to optimization problems in artificial intelligence. This paper applies the NEAT algorithm to achieve competitive multiagent learning efficiently in a modified Pong game environment. The competing agents abide by different rules while sharing similar observation-space parameters. The proposed approach exploits this property of the environment to define a single neuroevolutionary procedure that obtains the optimal policy for all the agents. The results indicate that the proposed implementation reaches the desired behaviour in a much shorter training period than existing multiagent reinforcement learning models.
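The central idea, a single NEAT population whose genomes are evaluated as paddle-controlling agents over a shared observation space, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the neat-python library, a hypothetical configuration file neat_config.txt, and a toy one-paddle rollout standing in for the actual QuadroPong environment.

```python
import random
import neat

def play_episode(net, steps=200):
    """Toy stand-in for the Pong rollout: the network observes the ball
    and its own paddle and is rewarded for staying on the ball's path.
    The paper evaluates agents in the authors' QuadroPong instead."""
    ball_y, ball_vy = random.random(), 0.02
    paddle_y, fitness = 0.5, 0.0
    for _ in range(steps):
        ball_y += ball_vy
        if ball_y <= 0.0 or ball_y >= 1.0:      # bounce off top/bottom
            ball_vy = -ball_vy
        # Shared observation space: ball position, ball velocity, paddle position.
        action = net.activate([ball_y, ball_vy, paddle_y])[0]
        paddle_y = min(1.0, max(0.0, paddle_y + 0.05 * (action - 0.5)))
        fitness -= abs(ball_y - paddle_y)       # penalise tracking error
    return fitness

def eval_genomes(genomes, config):
    """One shared population: every genome is scored as a paddle agent,
    so a single evolutionary loop yields a policy usable by all agents."""
    for _, genome in genomes:
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        genome.fitness = play_episode(net)

# "neat_config.txt" is an assumed filename; it must declare num_inputs = 3
# and num_outputs = 1 to match the observation and action above.
config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     "neat_config.txt")
population = neat.Population(config)
winner = population.run(eval_genomes, 50)       # 50 generations, assumed
```

Because every agent reads the same kind of observation vector, the champion genome returned by population.run can be instantiated once per paddle, which is what allows a single evolutionary procedure to serve all the competing agents.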