Paper ID: 2411.08267
Least Squares Training of Quadratic Convolutional Neural Networks with Applications to System Theory
Zachary Yetman Van Egmond, Luis Rodrigues
This paper provides a least squares formulation for the training of a 2-layer convolutional neural network using quadratic activation functions, a 2-norm loss function, and no regularization term. Using this method, an analytic expression for the globally optimal weights is obtained alongside a quadratic input-output equation for the network. These properties make the network a viable tool in system theory by enabling further analysis, such as the sensitivity of the output to perturbations in the input, which is crucial for safety-critical systems such as aircraft or autonomous vehicles.The least squares method is compared to previously proposed strategies for training quadratic networks and to a back-propagation-trained ReLU network. The proposed method is applied to a system identification problem and a GPS position estimation problem. The least squares network is shown to have a significantly reduced training time with minimal compromises on prediction accuracy alongside the advantages of having an analytic input-output equation. Although these results only apply to 2-layer networks, this paper motivates the exploration of deeper quadratic networks in the context of system theory.
Submitted: Nov 13, 2024