Abstract
A proper initialization of parameters in a neural network can facilitate its training. The Xavier initialization introduced by Glorot and Bengio, which was later generalized to the Kaiming initialization by He, Zhang, Ren and Sun, is now widely used. However, our experiments show that networks with heavy weight sharing are difficult to train even with the Xavier or the Kaiming initialization. We also observe that a certain simple network can be decomposed in two ways, where one is difficult to train while the other is easy to train, even when both are properly initialized by the Xavier or the Kaiming initialization. In this paper we propose a new initialization method that increases the training speed and training stability of neural networks with heavy weight sharing. We also propose a simple yet efficient method to adjust learning rates layer by layer, which is indispensable to our initialization.
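For reference, below is a minimal NumPy sketch of the two standard initializations the abstract contrasts (Xavier: variance 2/(fan_in + fan_out); Kaiming: variance 2/fan_in). The function names and toy layer sizes are illustrative only; the paper's own initialization for weight-shared networks and its layer-wise learning-rate scheme are not reproduced here.

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng):
    # Glorot & Bengio: Var(W) = 2 / (fan_in + fan_out), chosen to keep
    # activation and gradient variance roughly constant across layers.
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_out, fan_in))

def kaiming_init(fan_in, fan_out, rng):
    # He et al.: Var(W) = 2 / fan_in, compensating for the halving of
    # variance caused by ReLU activations.
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_out, fan_in))

rng = np.random.default_rng(0)
W_xavier = xavier_init(256, 128, rng)    # std ~ sqrt(2/384) ~ 0.072
W_kaiming = kaiming_init(256, 128, rng)  # std ~ sqrt(2/256) ~ 0.088
print(W_xavier.std(), W_kaiming.std())
```

Note that both schemes set the weight scale from the fan-in/fan-out of a single layer; with heavy weight sharing the same weights contribute to many paths, which is the situation the abstract identifies as problematic.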
Original language | English |
---|---|
Title of host publication | Mathematical Methods in Image Processing and Inverse Problems, IPIP 2018 |
Editors | Xue-Cheng TAI, Suhua WEI, Haiguang LIU |
Publisher | Springer |
Pages | 165-179 |
Number of pages | 15 |
ISBN (Electronic) | 9789811627019 |
ISBN (Print) | 9789811627002 |
DOIs | |
Publication status | Published - 2021 |
Externally published | Yes |
Event | International Workshop on Image Processing and Inverse Problems, IPIP 2018 - Beijing, China (21 Apr 2018 → 24 Apr 2018) |
Publication series
Name | Springer Proceedings in Mathematics and Statistics |
---|---|
Volume | 360 |
ISSN (Print) | 2194-1009 |
ISSN (Electronic) | 2194-1017 |
Conference
Conference | International Workshop on Image Processing and Inverse Problems, IPIP 2018 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 21/04/18 → 24/04/18 |
Bibliographical note
Publisher Copyright: © 2021, Springer Nature Singapore Pte Ltd.
Keywords
- Learning rate
- Neural networks
- Weight sharing
- Xavier initialization