Abstract
Effective exploration is key to a successful search process. The recently proposed negatively correlated search (NCS) tries to achieve this through coordinated parallel exploration, where a set of search processes are driven to be negatively correlated so that different promising areas of the search space can be visited simultaneously. Despite successful applications of NCS, its negatively correlated search behaviors were mostly devised by intuition, and a deeper (e.g., mathematical) understanding is missing. In this paper, a more principled NCS, namely NCNES, is presented, showing that the parallel exploration is equivalent to a process of seeking probabilistic models that both lead to solutions of high quality and are distant from previously obtained probabilistic models. Reinforcement learning, for which exploration is of particular importance, is considered for empirical assessment. The proposed NCNES is applied to directly train a deep convolutional network with 1.7 million connection weights for playing Atari games. Empirical results show that the significant advantages of NCNES, especially on games with uncertain and delayed rewards, can be largely attributed to its effective parallel exploration ability.
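The core idea summarized above (several search distributions updated in parallel, each pulled toward high-quality solutions and pushed away from the other distributions) can be sketched in a few lines. The Python snippet below is an illustrative toy, not the paper's method: the function `ncnes_step`, the isotropic Gaussian models with fixed variance, the simple repulsion gradient, and all hyperparameters are assumptions standing in for the natural-gradient updates and distribution distance used by the actual NCNES.

```python
import numpy as np

# Toy sketch of negatively correlated parallel search (illustrative only).
# Each search distribution is an isotropic Gaussian N(mu, sigma2 * I); one step
# combines (i) a Monte Carlo fitness gradient pulling the model toward good
# solutions and (ii) a diversity gradient pushing it away from the other models.

def repulsion_grad(mu_i, mu_j, sigma2):
    """Gradient (w.r.t. mu_i) of a simple squared distance between two
    isotropic Gaussians; a stand-in for the paper's distribution distance."""
    return (mu_i - mu_j) / sigma2

def ncnes_step(mus, sigma2, fitness, n_samples=50, lr=0.05, phi=1.0):
    """One parallel update of all search distributions (assumed interface)."""
    new_mus = []
    for i, mu in enumerate(mus):
        # NES-style Monte Carlo estimate of the fitness gradient w.r.t. mu.
        eps = np.random.randn(n_samples, mu.size)
        xs = mu + np.sqrt(sigma2) * eps
        fs = np.array([fitness(x) for x in xs])
        fs = (fs - fs.mean()) / (fs.std() + 1e-8)  # whiten for stability
        grad_fit = (fs[:, None] * eps).mean(axis=0) / np.sqrt(sigma2)
        # Diversity gradient: move away from every other distribution.
        grad_div = sum(repulsion_grad(mu, mus[j], sigma2)
                       for j in range(len(mus)) if j != i)
        new_mus.append(mu + lr * (grad_fit + phi * grad_div))
    return new_mus

# Usage: maximize a toy objective with two symmetric optima.
if __name__ == "__main__":
    f = lambda x: -np.sum((x - 1) ** 2) if x[0] > 0 else -np.sum((x + 1) ** 2)
    mus = [np.random.randn(2) for _ in range(3)]
    for _ in range(200):
        mus = ncnes_step(mus, sigma2=0.1, fitness=f)
```

With the repulsion term enabled (`phi > 0`), the three models tend to settle in different basins of the toy objective, which is the kind of parallel-exploration behavior the abstract describes.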
| Original language | English |
| --- | --- |
| Article number | 155333 |
| Journal | Frontiers of Computer Science |
| Volume | 15 |
| Issue number | 5 |
| Early online date | 16 Jul 2021 |
| DOIs | |
| Publication status | Published - Oct 2021 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2021, The Author(s).
Funding
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61806090 and 61672478), Guangdong Provincial Key Laboratory (2020B121201001), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2017ZT07X386), the Science and Technology Commission of Shanghai Municipality (19511120600), and the Shenzhen Science and Technology Program (KQTD2016112514355531).
Keywords
- evolutionary computation
- exploration
- reinforcement learning