Redundancy and Implicit Regularization in Neural Network Models
Neural networks can represent a wide variety of functions, and typically they do so redundantly: many distinct parameter settings correspond to the same function. This raises a natural question: why does training select one particular representation rather than another equivalent one? I will address this question in the context of two families of models: deep linear networks (DLNs) and feedforward ReLU networks. The key object of study is the fiber of a function—the set of all parameter configurations that realize it. I will describe the geometry of these fibers in both settings, and in the case of DLNs, I will present joint work in progress with Govind Menon on the characterization of balanced representations.
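As a concrete illustration of these notions, here is a minimal sketch using the standard conventions of the deep linear network literature (the notation and the balancedness convention below are assumptions, not fixed in the abstract). A depth-$N$ DLN with parameters $\theta = (W_1, \dots, W_N)$ represents the linear map $f_\theta(x) = W_N \cdots W_1 x$, so the fiber over a linear map $W$ is the set of factorizations
\[
\mathcal{F}(W) = \{(W_1, \dots, W_N) : W_N \cdots W_1 = W\},
\]
which contains, for example, the orbit $(W_1, W_2) \mapsto (A W_1, W_2 A^{-1})$ for any invertible $A$ in the two-layer case. In the usual sense, a point of the fiber is called balanced if
\[
W_{k+1}^\top W_{k+1} = W_k W_k^\top, \qquad k = 1, \dots, N-1,
\]
a condition preserved by gradient flow on DLNs that singles out particular representations on each fiber; the precise characterization studied in the joint work may differ from this standard convention.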