[1] V. Derhami, F. Alamian Harandi and M. B. Dowlatshahi, Reinforcement Learning. Yazd, Iran: Yazd University Press, 2017.
[2] F. Alamiyan-Harandi, V. Derhami and F. Jamshidi, “A new framework for mobile robot trajectory tracking using depth data and learning algorithms”, Journal of Intelligent & Fuzzy Systems, vol. 34, no. 6, pp. 3969-3982, 2018.
[3] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. London: The MIT Press, 2018.
[4] B. H. Abed-alguni, “Action-selection method for reinforcement learning based on cuckoo search algorithm”, Arabian Journal for Science and Engineering, vol. 43, no. 12, pp. 6771-6785, 2018.
[5] K. Morihiro, T. Isokawa, N. Matsui and H. Nishimura, “Effects of chaotic exploration on reinforcement learning in target capturing task”, International Journal of Knowledge-based and Intelligent Engineering Systems, vol. 12, no. 5-6, pp. 369-377, 2008.
[6] K. Morihiro, T. Isokawa, N. Matsui and H. Nishimura, “Reinforcement learning by chaotic exploration generator in target capturing task”, Proc. International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, Springer Berlin Heidelberg, 2005, pp. 1248-1254.
[7] K. Morihiro, N. Matsui and H. Nishimura, “Effects of chaotic exploration on reinforcement maze learning”, Proc. International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, Springer Berlin Heidelberg, 2004, pp. 833-839.
[8] K. Morihiro, N. Matsui and H. Nishimura, “Chaotic exploration effects on reinforcement learning in shortcut maze task”, International Journal of Bifurcation and Chaos, vol. 16, no. 10, pp. 3015-3022, 2006.
[9] A. B. Potapov and M. K. Ali, “Learning, exploration and chaotic policies”, International Journal of Modern Physics C, vol. 11, no. 7, pp. 1455-1464, 2000.
[10] E. Pei, J. Jiang, L. Liu, Y. Li and Z. Zhang, “A chaotic Q-learning-based licensed assisted access scheme over the unlicensed spectrum”, IEEE Transactions on Vehicular Technology, vol. 68, no. 10, pp. 9951-9962, 2019.
[11] B. Zarei and M. R. Meybodi, “Improving learning ability of learning automata using chaos theory”, The Journal of Supercomputing, vol. 77, no. 1, pp. 652-678, 2021.
[12] E. N. Lorenz, “Deterministic nonperiodic flow”, Journal of the Atmospheric Sciences, vol. 20, no. 2, pp. 130-141, 1963.
[13] G. Chen and T. Ueta, “Yet another chaotic attractor”, International Journal of Bifurcation and Chaos, vol. 9, no. 7, pp. 1465-1466, 1999.
[14] H. Khodadadi and V. Derhami, “Improving Speed and Efficiency of Dynamic Programming Methods through Chaos”, Journal of AI and Data Mining, vol. 9, no. 4, pp. 487-496, 2021.
[15] M. Mollaeefar, A. Sharif and M. Nazari, “A novel encryption scheme for colored image based on high level chaotic maps”, Multimedia Tools and Applications, vol. 76, pp. 607-629, 2017.
[16] R. Y. Chen, J. Schulman, P. Abbeel and S. Sidor, “UCB and InfoGain exploration via Q-ensembles”, arXiv preprint arXiv:1706.01502, 2017.
[17] M. Tokic, “Adaptive ε-greedy exploration in reinforcement learning based on value differences”, Proc. Annual Conference on Artificial Intelligence, Springer Berlin Heidelberg, 2010, pp. 203-210.
[18] M. Tokic and G. Palm, “Value-difference based exploration: adaptive control between epsilon-greedy and softmax”, Proc. Annual Conference on Artificial Intelligence, Berlin, 2011, pp. 335-346.
[19] V. Derhami, V. Johari Majd and M. N. Ahmadabadi, “Exploration and exploitation balance management in fuzzy reinforcement learning”, Fuzzy Sets and Systems, vol. 161, no. 4, pp. 578-595, 2010.
[20] Y. L. He, X. L. Zhang, W. Ao and J. Z. Huang, “Determining the optimal temperature parameter for Softmax function in reinforcement learning”, Applied Soft Computing, vol. 70, pp. 80-85, 2018.
[21] M. Guo, Y. Liu and J. Malec, “A new Q-learning algorithm based on the Metropolis criterion”, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 34, no. 5, pp. 2140-2143, 2004.
[22] C. Chen, D. Dong, H. X. Li, J. Chu and T. J. Tarn, “Fidelity-based probabilistic Q-learning for control of quantum systems”, IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 5, pp. 920-933, 2013.
[23] R. A. Bianchi, C. H. Ribeiro and A. H. R. Costa, “Heuristically Accelerated Reinforcement Learning: Theoretical and Experimental Results”, Proc. 20th European Conference on Artificial Intelligence (ECAI), IOS Press, 2012, pp. 169-174.
[24] A. Ecoffet, J. Huizinga, J. Lehman, K. O. Stanley and J. Clune, “First return, then explore”, Nature, vol. 590, no. 7847, pp. 580-586, 2021.
[25] T. Lin and A. Jabri, “MIMEx: intrinsic rewards from masked input modeling”, arXiv preprint arXiv:2305.08932, 2023.
[26] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang and W. Zaremba, “OpenAI Gym”, arXiv preprint arXiv:1606.01540, 2016.
[27] Z. Hua and Y. Zhou, “Exponential chaotic model for generating robust chaos”, IEEE Transactions on Systems, Man, and Cybernetics, vol. 51, no. 6, pp. 3713-3724, 2019.
[28] A. H. Gandomi and X. S. Yang, “Chaotic bat algorithm”, Journal of Computational Science, vol. 5, no. 2, pp. 224-232, 2014.
[29] I. Fister Jr., M. Perc, S. M. Kamal and I. Fister, “A review of chaos-based firefly algorithms: perspectives and research challenges”, Applied Mathematics and Computation, vol. 252, pp. 155-165, 2015.
[30] H. Lu, X. Wang, Z. Fei and M. Qiu, “The effects of using chaotic map on improving the performance of multiobjective evolutionary algorithms”, Mathematical Problems in Engineering, no. 1, Article ID 924652, 2014.
[31] X. Zhang and Y. Cao, “A novel chaotic map and an improved chaos-based image encryption scheme”, The Scientific World Journal, no. 1, Article ID 713541, 2014.
[32] C. Zhu, “A novel image encryption scheme based on improved hyperchaotic sequences”, Optics Communications, vol. 285, no. 1, pp. 29-37, 2012.
[33] A. Rezaee Jordehi, “A chaotic artificial immune system optimization algorithm for solving global continuous optimization problems”, Neural Computing and Applications, vol. 26, pp. 827-833, 2015.
[34] P. P. Singh, “A chaotic system with large Lyapunov exponent: Nonlinear observer design and circuit implementation”, Proc. 2020 3rd International Conference on Energy, Power and Environment: Towards Clean Energy Technologies, 2021, pp. 1-6.
[35] N. Nguyen, L. Pham-Nguyen, M. B. Nguyen and G. Kaddoum, “A low power circuit design for chaos-key based data encryption”, IEEE Access, vol. 8, pp. 104432-104444, 2020.
[36] K. Z. Zamli, F. Din and H. S. Alhadawi, “Exploring a Q-learning-based chaotic naked mole rat algorithm for S-box construction and optimization”, Neural Computing and Applications, vol. 35, no. 14, pp. 10449-10471, 2023.
[37] L. Moysis, A. Tutueva, C. Volos, D. Butusov, J. M. Munoz-Pacheco and H. Nistazakis, “A two-parameter modified logistic map and its application to random bit generation”, Symmetry, vol. 12, no. 5, Art. no. 829, 2020.
[38] L. Skanderova and I. Zelinka, “Arnold cat map and Sinai as chaotic numbers generators in evolutionary algorithms”, Proc. AETA 2013: Recent Advances in Electrical Engineering and Related Sciences, 2014, pp. 381-389.