H.3. Artificial Intelligence
Habib Khodadadi; Vali Derhami
Abstract
The exploration-exploitation trade-off poses a significant challenge in reinforcement learning. For this reason, action selection methods such as ε-greedy and Soft-Max are used instead of the purely greedy method. These methods use random numbers to select an action in a way that balances exploration and exploitation. Chaos is widely used across scientific disciplines because of features such as non-periodicity, unpredictability, ergodicity, and pseudorandom behavior. In this paper, we employ numbers generated by different chaotic systems to select actions, and we identify which chaotic maps perform better for different numbers of states and actions. In experiments on several environments, including the Multi-Armed Bandit (MAB), taxi-domain, and cliff-walking, we found that many of the chaotic methods speed up learning and achieve higher rewards.
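To make the idea concrete, the following is a minimal sketch of ε-greedy action selection driven by a chaotic sequence instead of a pseudorandom generator, applied to a small Gaussian multi-armed bandit. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the chaotic systems used, so the logistic map with r = 4 is assumed here, and the names `ChaoticSource`, `epsilon_greedy`, and `run_bandit`, the seed 0.37, and the arm means are all hypothetical.

```python
import random


def logistic_map(x, r=4.0):
    """One step of the logistic map x -> r * x * (1 - x); chaotic on (0, 1) for r = 4."""
    return r * x * (1.0 - x)


class ChaoticSource:
    """Stream of numbers in [0, 1] produced by iterating the logistic map (assumed choice)."""

    def __init__(self, seed=0.37):
        # Avoid seeds such as 0, 0.25, 0.5, 0.75 or 1, which fall onto fixed points.
        self.x = seed

    def next(self):
        self.x = logistic_map(self.x)
        return self.x


def epsilon_greedy(q_values, epsilon, source):
    """Pick an action index, using chaotic numbers where a uniform draw would normally go.

    Logistic-map values are not uniformly distributed (they cluster near 0 and 1),
    so the effective exploration rate differs from epsilon.
    """
    if source.next() < epsilon:
        idx = int(source.next() * len(q_values))
        return min(idx, len(q_values) - 1)  # guard against a value of exactly 1.0
    return max(range(len(q_values)), key=q_values.__getitem__)


def run_bandit(arm_means, steps=2000, epsilon=0.1, seed=0.37):
    """Sample-average value estimation on a Gaussian bandit with chaotic action selection."""
    noise = random.Random(0)  # ordinary PRNG, used only for the reward noise
    source = ChaoticSource(seed)
    q = [0.0] * len(arm_means)
    counts = [0] * len(arm_means)
    total = 0.0
    for _ in range(steps):
        a = epsilon_greedy(q, epsilon, source)
        reward = noise.gauss(arm_means[a], 1.0)
        counts[a] += 1
        q[a] += (reward - q[a]) / counts[a]  # incremental mean update
        total += reward
    return q, counts, total


if __name__ == "__main__":
    q, counts, total = run_bandit([0.1, 0.5, 0.9])
    print("estimates:", q)
    print("pulls per arm:", counts)
```

Swapping `ChaoticSource` for another map with the same `next()` interface would be one way to compare chaotic systems, which is the kind of comparison the abstract describes.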
F.2.7. Optimization
M. Mohammadpour; H. Parvin; M. Sina
Abstract
Many of the problems considered in optimization and learning assume that solutions exist in a dynamic environment. Hence, algorithms are required that adapt dynamically to the problem’s conditions and search the new conditions. In most cases, using information from the past allows an algorithm to adapt quickly after a change. This is the idea underlying the use of memory in this field, which involves key design issues concerning the memory content, the update process, and the retrieval process. In this article, we use a chaotic genetic algorithm (GA) with memory to solve dynamic optimization problems. A chaotic system allows much more accurate prediction of its future behavior than a random system does. The proposed method uses a new memory with diversity maximization, together with a new strategy for updating the memory and retrieving from it. An experimental study on the Moving Peaks Benchmark compares the performance of the proposed method with several state-of-the-art algorithms from the literature. The experimental results show that the proposed algorithm is superior to, and more effective than, the compared algorithms in dynamic environments.
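The abstract does not specify the memory update and retrieval rules, so the following is only a hedged sketch of one common way to keep a fixed-size memory diverse. When the memory is full, the two closest solutions in the pool are found and the less fit of the pair is dropped. On retrieval, the entries are re-scored with the current fitness function, since stored fitness values may be stale after the environment changes. The function names, the Euclidean distance, and the maximization convention are assumptions made for illustration; the chaotic part of the GA is not shown.

```python
import math
from itertools import combinations


def update_memory(memory, candidate, fitness, capacity):
    """Return a new memory list containing at most `capacity` solutions.

    Assumed rule (not taken from the paper): if adding the candidate overflows
    the memory, find the two closest solutions and discard the less fit one,
    which tends to keep the stored solutions spread out.
    """
    pool = memory + [candidate]
    if len(pool) <= capacity:
        return pool
    i, j = min(
        combinations(range(len(pool)), 2),
        key=lambda pair: math.dist(pool[pair[0]], pool[pair[1]]),
    )
    drop = i if fitness(pool[i]) < fitness(pool[j]) else j
    return [s for k, s in enumerate(pool) if k != drop]


def retrieve(memory, fitness, count):
    """Re-evaluate the stored solutions under the current fitness and return the best `count`."""
    return sorted(memory, key=fitness, reverse=True)[:count]


if __name__ == "__main__":
    # Hypothetical fitness: a single peak at the origin (higher is better).
    def peak(s):
        return -sum(x * x for x in s)

    mem = [(0.1, 0.1), (5.0, 5.0)]
    mem = update_memory(mem, (0.0, 0.0), peak, capacity=2)
    print(mem)
    print(retrieve(mem, peak, 1))
```

The retrieved solutions would typically be injected into the GA population right after a change is detected, which is where a memory is expected to help on a benchmark such as Moving Peaks.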