TY - GEN
T1 - An Initial Study on the Relationship between Meta Features of Dataset and the Initialization of NNRW
AU - CAO, Weipeng
AU - PATWARY, Muhammed J.A.
AU - YANG, Pengfei
AU - WANG, Xizhao
AU - MING, Zhong
N1 - We really thank the editor and anonymous reviewers for their invaluable suggestions to help us improve this paper. This work was supported by the National Natural Science Foundation of China under Grant nos. 61672358 and 61836005.
PY - 2019/7
Y1 - 2019/7
N2 - The initialization of neural networks with random weights (NNRW) has a significant impact on model performance. However, there is so far no suitable way to address this problem. In this paper, the relationship between the meta features of a dataset and the initialization of NNRW is studied. Specifically, we construct seven regression datasets with known attribute distributions, then initialize NNRW with different distributions and train them on the datasets to obtain the corresponding models. The relationship between the attribute distributions of the datasets and the initialization of NNRW is analyzed through the performance of the models. Several interesting phenomena are observed. First, initializing NNRW with the Gaussian distribution can help the model converge faster than initializing it with the Gamma or Uniform distribution. Second, if one or more attributes in a dataset follow the Gamma distribution, using the Gamma distribution to initialize NNRW may result in a slower convergence rate and a tendency to overfit. Third, initializing NNRW with a specific distribution with smaller variance can always achieve a faster convergence rate and better generalization performance than the same distribution with larger variance. The above experimental results are not sensitive to the activation function or the type of NNRW. Some theoretical analyses of the above observations are also given in the study.
AB - The initialization of neural networks with random weights (NNRW) has a significant impact on model performance. However, there is so far no suitable way to address this problem. In this paper, the relationship between the meta features of a dataset and the initialization of NNRW is studied. Specifically, we construct seven regression datasets with known attribute distributions, then initialize NNRW with different distributions and train them on the datasets to obtain the corresponding models. The relationship between the attribute distributions of the datasets and the initialization of NNRW is analyzed through the performance of the models. Several interesting phenomena are observed. First, initializing NNRW with the Gaussian distribution can help the model converge faster than initializing it with the Gamma or Uniform distribution. Second, if one or more attributes in a dataset follow the Gamma distribution, using the Gamma distribution to initialize NNRW may result in a slower convergence rate and a tendency to overfit. Third, initializing NNRW with a specific distribution with smaller variance can always achieve a faster convergence rate and better generalization performance than the same distribution with larger variance. The above experimental results are not sensitive to the activation function or the type of NNRW. Some theoretical analyses of the above observations are also given in the study.
KW - extreme learning machine
KW - meta feature
KW - neural networks with random weights
KW - random vector functional link network
UR - http://www.scopus.com/inward/record.url?scp=85073263682&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2019.8852219
DO - 10.1109/IJCNN.2019.8852219
M3 - Conference paper (refereed)
AN - SCOPUS:85073263682
T3 - International Joint Conference on Neural Networks (IJCNN)
SP - 2092
EP - 2099
BT - Proceedings of 2019 International Joint Conference on Neural Networks, IJCNN 2019
PB - IEEE
T2 - 2019 International Joint Conference on Neural Networks, IJCNN 2019
Y2 - 14 July 2019 through 19 July 2019
ER -