TY - JOUR
T1 - Incorporating Diversity and Informativeness in Multiple-Instance Active Learning
AU - WANG, Ran
AU - WANG, Xi-Zhao
AU - KWONG, Sam
AU - XU, Chen
PY - 2017/12
Y1 - 2017/12
N2 - Multiple-instance active learning (MIAL) is a paradigm to collect sufficient training bags for a multiple-instance learning (MIL) problem, by selecting and querying the most valuable unlabeled bags iteratively. Existing works on MIAL evaluate an unlabeled bag by its informativeness with regard to the current classifier, but neglect the internal distribution of its instances, which can reflect the diversity of the bag. In this paper, two diversity criteria, i.e., clustering-based diversity and fuzzy rough set based diversity, are proposed for MIAL by utilizing a support vector machine (SVM) based MIL classifier. In the first criterion, a kernel k-means clustering algorithm is used to explore the hidden structure of the instances in the feature space of the SVM, and the diversity degree of an unlabeled bag is measured by the number of unique clusters covered by the bag. In the second criterion, the lower approximations in fuzzy rough sets are used to define a new concept named dissimilarity degree, which depicts the uniqueness of an instance so as to measure the diversity degree of a bag. By incorporating the proposed diversity criteria with existing informativeness measurements, new MIAL algorithms are developed, which can select bags with both high informativeness and diversity. Experimental comparisons demonstrate the feasibility and effectiveness of the proposed methods.
KW - Clustering
KW - diversity
KW - fuzzy rough set
KW - multiple-instance active learning (MIAL)
UR - http://www.scopus.com/inward/record.url?scp=85021813482&partnerID=8YFLogxK
U2 - 10.1109/TFUZZ.2017.2717803
DO - 10.1109/TFUZZ.2017.2717803
M3 - Journal Article (refereed)
SN - 1063-6706
VL - 25
SP - 1460
EP - 1475
JO - IEEE Transactions on Fuzzy Systems
JF - IEEE Transactions on Fuzzy Systems
IS - 6
ER -