TY - GEN
T1 - Hashing-based Undersampling for Large Scale Histopathology Image Classification
AU - TIAN, Xing
AU - QIU, Lin
AU - LI, Qihua
AU - NG, Wing W. Y.
AU - ZHANG, Jianjun
AU - KWONG, Sam
AU - WANG, Hui
AU - DONG, Xinran
AU - LIU, Baoyi
AU - HU, Yijun
AU - YU, Honghua
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022/12
Y1 - 2022/12
N2 - The early diagnosis of cancer based on histopathology images plays an important role in medical science. Existing techniques generally partition the original histopathology image into small pieces for further classification. However, because the number of benign (majority) samples is much larger than that of malignant (minority) samples, the classification problem is significantly imbalanced, which adversely affects classification performance. Undersampling is commonly used to address the class-imbalance problem. However, existing methods are typically time-consuming, so they are not suitable for large-scale and high-dimensional data. In this paper, we propose a fast and scalable undersampling method, hashing-based undersampling (HBU), to address class imbalance in large-scale medical image classification. Benign images are hashed and then placed into different buckets according to their locations in the input space. Undersampling is achieved by proportionally selecting benign images from the hash buckets. The HBU method is experimentally evaluated on two real histopathology image datasets, CAMELYON16 and ACDC@LUNGHP, by comparison with existing methods. Experimental results show that the HBU method outperforms six state-of-the-art methods and is scalable and fast.
AB - The early diagnosis of cancer based on histopathology images plays an important role in medical science. Existing techniques generally partition the original histopathology image into small pieces for further classification. However, because the number of benign (majority) samples is much larger than that of malignant (minority) samples, the classification problem is significantly imbalanced, which adversely affects classification performance. Undersampling is commonly used to address the class-imbalance problem. However, existing methods are typically time-consuming, so they are not suitable for large-scale and high-dimensional data. In this paper, we propose a fast and scalable undersampling method, hashing-based undersampling (HBU), to address class imbalance in large-scale medical image classification. Benign images are hashed and then placed into different buckets according to their locations in the input space. Undersampling is achieved by proportionally selecting benign images from the hash buckets. The HBU method is experimentally evaluated on two real histopathology image datasets, CAMELYON16 and ACDC@LUNGHP, by comparison with existing methods. Experimental results show that the HBU method outperforms six state-of-the-art methods and is scalable and fast.
KW - Cancer diagnosis
KW - Class-imbalance
KW - Histopathology image
KW - Undersampling
UR - http://www.scopus.com/inward/record.url?scp=85158893327&partnerID=8YFLogxK
U2 - 10.1109/ICCICC57084.2022.10101547
DO - 10.1109/ICCICC57084.2022.10101547
M3 - Conference paper (refereed)
SN - 9781665490849
SP - 221
EP - 228
BT - Proceedings of the 2022 IEEE 21st International Conference on Cognitive Informatics and Cognitive Computing
A2 - WANG, Yingxu
A2 - PLATANIOTIS, Konstantin N.
A2 - WIDROW, Bernard
A2 - PEDRYCZ, Witold
A2 - KINSNER, Witold
A2 - SPACHOS, Petros
A2 - KWONG, Sam
PB - IEEE
T2 - 2022 IEEE 21st International Conference on Cognitive Informatics & Cognitive Computing
Y2 - 8 December 2022 through 10 December 2022
ER -