Paper ID: 2310.07190
Neural networks: deep, shallow, or in between?
Guergana Petrova, Przemyslaw Wojtaszczyk
We give estimates from below for the error of approximation of a compact subset of a Banach space by the outputs of feed-forward neural networks with width W, depth l and Lipschitz activation functions. We show that, modulo logarithmic factors, rates better than entropy numbers' rates are possibly attainable only for neural networks for which the depth l goes to infinity, and that there is no gain if we fix the depth and let the width W go to infinity.
Submitted: Oct 11, 2023