Paper ID: 2409.12446
Neural Networks Generalize on Low Complexity Data
Sourav Chatterjee, Timothy Sudijono
We show that feedforward neural networks with ReLU activation generalize on low complexity data, suitably defined. Given i.i.d. data generated from a simple programming language, the minimum description length (MDL) feedforward neural network which interpolates the data generalizes with high probability. We define this simple programming language, along with a notion of description length of such networks. We provide several examples of basic computational tasks, such as checking the primality of a natural number. For primality testing, our theorem shows the following. Suppose that we draw an i.i.d. sample of $\Theta(N^{\delta}\ln N)$ numbers uniformly at random from $1$ to $N$, where $\delta\in (0,1)$. For each number $x_i$, let $y_i = 1$ if $x_i$ is a prime and $0$ if it is not. Then with high probability, the MDL network fitted to this data accurately answers whether a newly drawn number between $1$ and $N$ is a prime or not, with test error $\leq O(N^{-\delta})$. Note that the network is not designed to detect primes; minimum description length learning discovers a network which does so.
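As a concrete illustration of the data-generating setup in the primality example, the following Python sketch draws the labeled sample described in the abstract. It does not implement the paper's MDL training procedure (which the paper defines via its programming language and description-length notion); the values of `N` and `delta` are illustrative assumptions, not values fixed by the paper.

```python
import math
import random

def is_prime(n: int) -> bool:
    """Trial-division primality check (sufficient for illustration)."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

# Illustrative constants (assumptions, not values from the paper).
N = 10**6      # labels are defined for integers 1..N
delta = 0.5    # exponent delta in (0, 1)

# Sample size on the order of Theta(N^delta * ln N), up to an unspecified constant.
n_samples = int(N**delta * math.log(N))

# Draw x_i i.i.d. uniform on {1, ..., N}; set y_i = 1 if x_i is prime, else 0.
xs = [random.randint(1, N) for _ in range(n_samples)]
ys = [1 if is_prime(x) else 0 for x in xs]

print(f"Drew {n_samples} labeled points; {sum(ys)} are prime.")
```

The theorem's guarantee is then that the MDL network interpolating `(xs, ys)` misclassifies a fresh uniform draw from $\{1,\dots,N\}$ with probability at most $O(N^{-\delta})$.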
Submitted: Sep 19, 2024