TY - GEN
T1 - On evolving robust strategies for iterated prisoner’s dilemma
AU - DARWEN, P.J.
AU - YAO, X.
PY - 1995
Y1 - 1995
N2 - Evolution is a fundamental form of adaptation in a dynamic and complex environment. Genetic algorithms are an effective tool in the empirical study of evolution. This paper follows Axelrod’s work [2] in using the genetic algorithm to evolve strategies for playing the game of Iterated Prisoner’s Dilemma, using co-evolution, where each member of the population (each strategy) is evaluated by how it performs against the other members of the current population. This creates a dynamic environment in which the algorithm is optimising to a moving target instead of the usual evaluation against some fixed set of strategies. The hope is that this will stimulate an “arms race” of innovation [3]. We conduct two sets of experiments. The first set investigates what conditions evolve the best strategies. The second set studies the robustness of the strategies thus evolved, that is, are the strategies useful only in the round robin of its population or are they effective against a wide variety of opponents? Our results indicate that the population has nearly always converged by about 250 generations, by which time the bias in the population has almost always stabilised at 85%. Our results confirm that cooperation almost always becomes the dominant strategy [1, 2]. We can also confirm that seeding the population with expert strategies is best done in small amounts so as to leave the initial population with plenty of genetic diversity [7]. The lack of robustness in strategies produced in the round robin evaluation is demonstrated by some examples of a population of naive cooperators being exploited by a defect-first strategy. This causes a sudden but ephemeral decline in the population's average score, but it recovers when Iess naive cooperators emerge and do well against the exploiting strategies. This example of runaway evo|ution is brought back to reality by a suitable mutation, reminiscent of punctuated equilibria [1,2]. We find that a way to reduce such na'ivity is to make the GA population play against an extra, static, high-quality strategy (not part of the GA population), as well as all the rest of the population. The strategies thus produced perform better against opponents that were included in the round robin (as expected) and, more significantly, better against opponents that were not included. That is, robustness is improved. © Springer-Verlag Berlin Heidelberg 1995.
AB - Evolution is a fundamental form of adaptation in a dynamic and complex environment. Genetic algorithms are an effective tool in the empirical study of evolution. This paper follows Axelrod’s work [2] in using the genetic algorithm to evolve strategies for playing the game of Iterated Prisoner’s Dilemma, using co-evolution, where each member of the population (each strategy) is evaluated by how it performs against the other members of the current population. This creates a dynamic environment in which the algorithm is optimising to a moving target instead of the usual evaluation against some fixed set of strategies. The hope is that this will stimulate an “arms race” of innovation [3]. We conduct two sets of experiments. The first set investigates what conditions evolve the best strategies. The second set studies the robustness of the strategies thus evolved, that is, are the strategies useful only in the round robin of its population or are they effective against a wide variety of opponents? Our results indicate that the population has nearly always converged by about 250 generations, by which time the bias in the population has almost always stabilised at 85%. Our results confirm that cooperation almost always becomes the dominant strategy [1, 2]. We can also confirm that seeding the population with expert strategies is best done in small amounts so as to leave the initial population with plenty of genetic diversity [7]. The lack of robustness in strategies produced in the round robin evaluation is demonstrated by some examples of a population of naive cooperators being exploited by a defect-first strategy. This causes a sudden but ephemeral decline in the population's average score, but it recovers when Iess naive cooperators emerge and do well against the exploiting strategies. This example of runaway evo|ution is brought back to reality by a suitable mutation, reminiscent of punctuated equilibria [1,2]. We find that a way to reduce such na'ivity is to make the GA population play against an extra, static, high-quality strategy (not part of the GA population), as well as all the rest of the population. The strategies thus produced perform better against opponents that were included in the round robin (as expected) and, more significantly, better against opponents that were not included. That is, robustness is improved. © Springer-Verlag Berlin Heidelberg 1995.
UR - http://www.scopus.com/inward/record.url?scp=84958976798&partnerID=8YFLogxK
U2 - 10.1007/3-540-60154-6_61
DO - 10.1007/3-540-60154-6_61
M3 - Conference paper (refereed)
SN - 9783540601548
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 276
EP - 292
BT - Progress in Evolutionary Computation : AI '93 and AI '94 Workshops on Evolutionary Computation, Melbourne, Victoria, Australia, November 16, 1993, Armidale, NSW, Australia, November 21-22, 1994. Selected Papers
A2 - YAO, Xin
PB - Springer
ER -