In the world of big data, large numbers of images are available in social media, corporate, and even personal collections. A collection may grow quickly as new images are generated at high rates. The new images may change the distribution of existing classes or introduce new classes, making the collection dynamic and subject to concept drift. For efficient retrieval from an image collection using a query, a hash table consisting of a set of hash functions is needed to transform images into binary hash codes, which serve as the basis for finding images similar to the query. If the image collection is dynamic, the hash table built at one time step may not work well at the next, because new images have changed the collection. Therefore, the hash table needs to be rebuilt or updated at successive time steps. Incremental hashing (ICH) is the first effective method to deal with the concept drift problem in image retrieval from dynamic collections. In ICH, a new hash table is learned from newly emerging images only, which represent the data distribution of the current environment. The new hash table is then used to generate hash codes for all images, both old and new. Because the collection is dynamic, new images of a class may not be similar to old images of the same class. To learn a new hash table that preserves within-class similarity across both old and new images, this paper proposes incremental hashing with sample selection using dominant sets (ICHDS), which selects representative samples from each class for training the new hash table. Experimental results show that ICHDS yields better retrieval performance than existing dynamic and static hashing methods.
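The two ideas in the abstract, ranking images by binary hash codes and picking representative samples with dominant sets, can be illustrated with a minimal sketch. It is not the paper's implementation. Random hyperplanes stand in for the learned hash functions, the dominant-set step uses standard replicator dynamics on a precomputed similarity matrix, and every function name below is hypothetical.

```python
import random

# --- Binary hashing and Hamming-ranked retrieval ---------------------------
def make_hash_table(dim, n_bits, seed=0):
    """Assumption: random hyperplanes stand in for learned hash functions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def encode(table, x):
    """One bit per hash function: the sign of the projection of x."""
    return tuple(int(sum(w * v for w, v in zip(plane, x)) >= 0) for plane in table)

def hamming(a, b):
    """Number of bit positions in which two hash codes differ."""
    return sum(u != v for u, v in zip(a, b))

def retrieve(table, collection, query, k=3):
    """Indices of the k items whose codes are closest to the query code."""
    q = encode(table, query)
    codes = [encode(table, x) for x in collection]  # all images, old and new
    return sorted(range(len(collection)), key=lambda i: hamming(codes[i], q))[:k]

# --- Dominant-set style sample selection -----------------------------------
def dominant_set_weights(sim, iters=200):
    """Replicator dynamics on a symmetric, non-negative similarity matrix
    with a zero diagonal. Samples that are mutually similar keep high weight."""
    n = len(sim)
    x = [1.0 / n] * n
    for _ in range(iters):
        ax = [sum(sim[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(x[i] * ax[i] for i in range(n))
        if total == 0:  # no similarity mass left to redistribute
            break
        x = [x[i] * ax[i] / total for i in range(n)]
    return x

def select_representatives(sim, m):
    """Indices of the m samples with the largest dominant-set weights."""
    weights = dominant_set_weights(sim)
    return sorted(range(len(sim)), key=lambda i: -weights[i])[:m]
```

In this sketch, `select_representatives` would be called once per class on a similarity matrix computed over that class's old and new images, and the chosen samples would then be used to train the next hash table. How ICHDS builds the similarity matrix and trains the table is not stated in the abstract, so both steps are left out here.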
| Number of pages | 14 |
| Journal | International Journal of Machine Learning and Cybernetics |
| Early online date | 24 Jun 2020 |
| Publication status | Published - Dec 2020 |
Bibliographical note
This work was supported in part by the National Natural Science Foundation of China under Grants 61876066, 61772344, and 61672443, the Guangzhou Science and Technology Plan Project under Grant 201804010245, Guangdong Province Science and Technology Plan Project (Collaborative Innovation and Platform Environment Construction) 2019A050510006, EU Horizon 2020 Programme (700381, ASGARD), and the Hong Kong RGC General Research Funds under Grants 9042489 (CityU 11206317), 9042816 (CityU 11209819) and 9042322 (CityU 11200116).
Keywords
- Concept drift
- Dominant sets
- Image retrieval
- Incremental hashing
- Semi-supervised hashing