State-Space Closure: Revisiting Endless Online Level Generation via Reinforcement Learning

Ziqi WANG, Tianye SHU, Jialin LIU*

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

Abstract

In this letter, we revisit endless online level generation with the recently proposed experience-driven procedural content generation via reinforcement learning (EDRL) framework. Motivated by the observation that EDRL tends to generate recurrent patterns, we formulate the notion of state-space closure, which ensures that any state that may appear in an infinite-horizon online generation process can also be found within a finite horizon. Through theoretical analysis, we find that although state-space closure raises a concern about diversity, it allows EDRL trained with a finite horizon to generalize to the infinite-horizon scenario without deterioration of content quality. Moreover, we verify the quality and diversity of the content generated by EDRL via empirical studies on the widely used Super Mario Bros. benchmark. Experimental results reveal that the diversity of levels generated by EDRL is limited due to state-space closure, whereas their quality does not deteriorate over a horizon longer than the one specified during training. Based on these outcomes and our analysis, future work on endless online level generation via reinforcement learning should address the issue of diversity while preserving state-space closure and content quality.
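For intuition, the state-space closure property can be read as the following condition (an illustrative paraphrase of the abstract; the symbols \(\mathcal{S}_t\) and the horizon \(T\) are our notation, not necessarily the letter's own formulation):

\[
\exists\, T < \infty \quad \text{such that} \quad \bigcup_{t=0}^{\infty} \mathcal{S}_t \;=\; \bigcup_{t=0}^{T} \mathcal{S}_t,
\]

where \(\mathcal{S}_t\) denotes the set of generator states that can occur at step \(t\) of the online generation process. Under such a condition, behaviour verified over the finite horizon \(T\) carries over to arbitrarily long generation runs, which is consistent with the quality result reported above, while also explaining why the variety of generated patterns is bounded.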
Original language: English
Pages (from-to): 489-492
Number of pages: 4
Journal: IEEE Transactions on Games
Volume: 16
Issue number: 2
Early online date: 28 Mar 2023
DOIs
Publication status: Published - Jun 2024
Externally published: Yes

Keywords

  • Content diversity
  • Online level generation (OLG)
  • PCG via reinforcement learning (RL)
  • Platformer games
  • Procedural content generation (PCG)
