Improving the accuracy of transformer dissolved gas analysis is a persistent need for power companies. However, the large number of fault samples that this requires is an obstacle. This article uses a large amount of health data, which power companies can obtain much more easily, to improve diagnostic accuracy. The problem is investigated comprehensively from the perspectives of both the data set and the methodology. A data set consisting of 9595 health samples and 993 fault samples is used for analysis. The characteristics of the data set and the influence of the health data on diagnostic accuracy are discussed. The performance of many state-of-the-art algorithms that handle the class imbalance problem is evaluated. In addition, an efficient fault diagnosis algorithm named self-paced ensemble (SPE) is presented. In SPE, classification hardness is introduced to incorporate the characteristics of the data into the classification. This method preserves the diversity of the data set while maintaining high performance. The experimental results confirm the superiority of SPE and show that including more health samples can improve transformer diagnosis when fault data are limited. © 2020 The Authors. High Voltage published by John Wiley & Sons Ltd on behalf of the Institution of Engineering and Technology and China Electric Power Research Institute.