Abstract
Multi-label active learning (MLAL) reduces the cost of manual annotation for multi-label problems by selecting high-quality unlabeled data. Existing MLAL methods usually perform a one-way selection based on the informativeness, representativeness, or diversity of the examples. These methods account only for the importance of the examples to the labels, not vice versa. Because multi-label data are inherently imbalanced, the selected dataset may also be highly imbalanced, which degrades learning performance. In this paper, we treat the selection of example-label pairs in MLAL as a two-way matching problem rather than a one-way selection problem. First, both the label's preference for an example, defined as the informativeness of the example with respect to the label, and the example's preference for a label, defined as the probability that the example belongs to the positive class, are considered. Then, a simple and effective stable matching (STM) model is adopted to realize the two-way selection. In addition, to provide reasonable candidates for the STM model, a roulette algorithm is used to allocate the number of annotations among the sub-classifiers. Comprehensive experiments demonstrate the competitiveness of the proposed approach and its effectiveness in selecting a relatively balanced dataset.
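To make the two-way selection concrete, the sketch below illustrates one possible deferred-acceptance (Gale-Shapley-style) matching between examples and labels, where each side ranks the other by a preference score and each label has a per-round annotation budget. This is a minimal, hypothetical illustration, not the paper's exact STM model; the array names, scoring inputs, and budget handling are assumptions.

```python
import numpy as np

def stable_select(label_pref, example_pref, budget):
    """Deferred-acceptance matching between examples and labels (illustrative sketch).

    label_pref[l, i]   : label l's preference for example i (e.g. informativeness).
    example_pref[i, l] : example i's preference for label l (e.g. positive-class probability).
    budget[l]          : how many examples label l may take (e.g. set by a roulette allocation).
    Returns a list of selected (example, label) pairs.
    """
    n_examples, n_labels = example_pref.shape
    # Each example proposes to labels from most to least preferred.
    order = np.argsort(-example_pref, axis=1)
    next_choice = [0] * n_examples            # index into each example's proposal order
    held = {l: [] for l in range(n_labels)}   # examples tentatively held by each label
    free = list(range(n_examples))            # examples not yet tentatively matched

    while free:
        i = free.pop()
        if next_choice[i] >= n_labels:        # example i has proposed to every label
            continue
        l = order[i, next_choice[i]]
        next_choice[i] += 1
        held[l].append(i)
        if len(held[l]) > budget[l]:
            # Label l is over capacity: release the example it prefers least.
            worst = min(held[l], key=lambda j: label_pref[l, j])
            held[l].remove(worst)
            free.append(worst)

    return [(i, l) for l, members in held.items() for i in members]

# Toy usage: 5 unlabeled examples, 2 labels; label 0 may annotate two examples, label 1 one.
rng = np.random.default_rng(0)
label_pref = rng.random((2, 5))
example_pref = rng.random((5, 2))
print(stable_select(label_pref, example_pref, budget=[2, 1]))
```

The resulting matching is stable in the usual sense: no example-label pair would both prefer each other over their current assignment, which is the property the abstract's two-way selection appeals to.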
| Original language | English |
|---|---|
| Pages (from-to) | 281-299 |
| Number of pages | 19 |
| Journal | Information Sciences |
| Volume | 610 |
| Early online date | 3 Aug 2022 |
| DOIs | |
| Publication status | Published - Sept 2022 |
| Externally published | Yes |
Keywords
- Active learning
- Example-label pairs
- Imbalanced data
- Multi-label
- Stable matching