Abstract
We propose a new adversarial training framework, generative adversarial ranking networks (GARNet), which learns from user preferences among a list of samples so as to generate data meeting user-specific criteria. Concretely, GARNet consists of two modules: a ranker and a generator. The generator fools the ranker into placing generated samples at the top of the ranking, while the ranker learns to rank generated samples at the bottom. Meanwhile, the ranker learns to rank samples with respect to the property of interest by training on preferences collected over real samples. The adversarial ranking game between the ranker and the generator aligns the generated data distribution with the user-preferred data distribution, with both theoretical guarantees and empirical verification. Specifically, we first prove that when training with full preferences on a discrete property, the learned distribution of GARNet rigorously coincides with the distribution specified by the given score vector derived from user preferences.
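The adversarial ranking game lends itself to a compact two-player training loop. The following is a minimal PyTorch sketch of the idea, not the paper's actual method: the pairwise logistic ranking loss, the toy architectures, and the random stand-in preference pairs are all illustrative assumptions, whereas GARNet's objective is built on a relativistic f-divergence and differs in its details.

```python
import torch
import torch.nn as nn

# Hypothetical toy architectures; GARNet's actual networks differ.
class Ranker(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, 1))
    def forward(self, x):
        return self.net(x).squeeze(-1)  # scalar ranking score per sample

class Generator(nn.Module):
    def __init__(self, z_dim, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, dim))
    def forward(self, z):
        return self.net(z)

def pairwise_rank_loss(high, low):
    # Logistic pairwise loss encouraging score(high) > score(low);
    # a stand-in for the paper's relativistic f-divergence objective.
    return nn.functional.softplus(low - high).mean()

ranker, gen = Ranker(dim=32), Generator(z_dim=8, dim=32)
opt_r = torch.optim.Adam(ranker.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-4)

for step in range(1000):
    # Random tensors stand in for user-preferred / less-preferred real samples.
    real_pref, real_nonpref = torch.randn(64, 32), torch.randn(64, 32)
    z = torch.randn(64, 8)

    # Ranker update: rank generated samples at the bottom and
    # fit user preferences collected on real sample pairs.
    fake = gen(z).detach()
    loss_r = (pairwise_rank_loss(ranker(real_pref), ranker(fake))
              + pairwise_rank_loss(ranker(real_pref), ranker(real_nonpref)))
    opt_r.zero_grad(); loss_r.backward(); opt_r.step()

    # Generator update: fool the ranker into raising generated samples to the top.
    fake = gen(z)
    loss_g = pairwise_rank_loss(ranker(fake), ranker(real_pref))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Detaching the generated batch during the ranker update keeps the two players' gradients separate, mirroring standard alternating GAN training.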
| Field | Value |
|---|---|
| Original language | English |
| Number of pages | 35 |
| Journal | Journal of Machine Learning Research |
| Volume | 25 |
| Issue number | 119 |
| Publication status | Published - Apr 2024 |
Bibliographical note
The first version of this work was done while the author was a PhD student at Southern University of Science and Technology (SUSTech), China, and the University of Technology Sydney (UTS), Australia. Publisher Copyright: ©2024 Yinghua Yao, Yuangang Pan, Jing Li, Ivor W. Tsang and Xin Yao.
Keywords
- Generative Adversarial Network
- Controllable Generation
- User Preferences
- Adversarial Ranking
- Relativistic f-Divergence