The Mean-Field Limit for Shallow Neural Networks: Implications for Trainability and Generalization (Part 1)
Grant Rotskoff, Stanford University
Neural networks with large numbers of parameters have remarkable empirical properties from the perspective of numerical analysis: these parametric models can be optimized reliably without regularization or guarantees of convexity, and they accurately regress very high-dimensional data. In these lectures, I will explore one theoretical explanation of these properties. First, I will introduce the mean-field limit for neural networks and discuss a corresponding law of large numbers, which ensures convergence of the “training” dynamics to a global optimum. In addition, I will discuss fluctuations and a central-limit-theorem-type result. Subsequently, I will describe modifications of the gradient flow that improve convergence and detail some applications.