Abstract
We propose a new adversarial training framework, generative adversarial ranking networks (GARNet), to learn from user preferences among a list of samples so as to generate data meeting user-specific criteria. Concretely, GARNet consists of two modules: a ranker and a generator. The generator tries to fool the ranker into ranking generated samples at the top, while the ranker learns to rank generated samples at the bottom. Meanwhile, the ranker learns to rank samples with respect to the property of interest by training on preferences collected over real samples. The adversarial ranking game between the ranker and the generator aligns the generated data distribution with the user-preferred data distribution, with both theoretical guarantees and empirical verification. Specifically, we first prove that when training with full preferences on a discrete property, the learned distribution of GARNet rigorously coincides with the distribution specified by the given score vector based on user preferences. The theoretical results are then extended to partial preferences on a discrete property and further generalized to preferences on a continuous property. Extensive experiments show that GARNet can recover the distribution of user-desired data from full or partial preferences over various properties of interest (i.e., discrete or continuous, single or multiple). Code is available at https://github.com/EvaFlower/GARNet.
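To make the adversarial ranking game concrete, below is a minimal, hypothetical sketch using pairwise preferences and a logistic ranking loss. The toy networks, the `sample_preference_pairs` oracle, and the loss function are illustrative assumptions only, not the paper's actual listwise objectives or architectures.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins on 2-D data; GARNet's real architectures and listwise
# objectives are not reproduced here.
G = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))   # generator
R = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))   # ranker: scalar score

g_opt = torch.optim.Adam(G.parameters(), lr=1e-4)
r_opt = torch.optim.Adam(R.parameters(), lr=1e-4)

def rank_loss(hi, lo):
    """Pairwise logistic ranking loss: push score(hi) above score(lo)."""
    return F.softplus(lo - hi).mean()

def sample_preference_pairs(n):
    """Hypothetical preference oracle: the user prefers points nearer the origin."""
    a, b = torch.randn(n, 2), torch.randn(n, 2)
    a_wins = (a.norm(dim=1) < b.norm(dim=1)).unsqueeze(1)
    return torch.where(a_wins, a, b), torch.where(a_wins, b, a)

for step in range(10_000):
    preferred, rejected = sample_preference_pairs(64)

    # Ranker update: respect user preferences on real pairs and
    # rank generated samples below real ones.
    fake = G(torch.randn(64, 8)).detach()
    r_loss = rank_loss(R(preferred), R(rejected)) + rank_loss(R(preferred), R(fake))
    r_opt.zero_grad(); r_loss.backward(); r_opt.step()

    # Generator update: fool the ranker into placing fakes at the top.
    g_loss = rank_loss(R(G(torch.randn(64, 8))), R(preferred))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

In this toy game the ranker's two loss terms mirror the abstract's description: one fits the user's preferences on real pairs and one pushes generated samples to the bottom, while the generator's term reverses the latter to pull its samples toward the top.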
| Original language | English |
| --- | --- |
| Number of pages | 35 |
| Journal | Journal of Machine Learning Research |
| Volume | 25 |
| Issue number | 119 |
| Publication status | Published - Apr 2024 |
Bibliographical note
The first version of this work was done when the author was a PhD student at Southern University of Science and Technology (SUSTech), China, and the University of Technology Sydney (UTS), Australia.

Keywords
- Generative Adversarial Network
- Controllable Generation
- User Preferences
- Adversarial Ranking
- Relativistic f-Divergence