Paper ID: 2312.08257
\emph{Lifted} RDT based capacity analysis of the 1-hidden layer treelike \emph{sign} perceptrons neural networks
Mihailo Stojnic
We consider the memorization capabilities of multilayered \emph{sign} perceptrons neural networks (SPNNs). A recent rigorous upper-bounding capacity characterization, obtained in \cite{Stojnictcmspnncaprdt23} utilizing the Random Duality Theory (RDT), demonstrated that adding neurons in a network configuration may indeed be very beneficial. Moreover, for particular \emph{treelike committee machines} (TCM) architectures with $d\leq 5$ neurons in the hidden layer, \cite{Stojnictcmspnncaprdt23} made a very first mathematically rigorous progress in over 30 years by lowering the previously best known capacity bounds of \cite{MitchDurb89}. Here, we first establish that the RDT bounds from \cite{Stojnictcmspnncaprdt23} scale as $\sim \sqrt{d}$ and can not on their own \emph{universally} (over the entire range of $d$) beat the best known $\sim \log(d)$ scaling of the bounds from \cite{MitchDurb89}. After recognizing that the progress from \cite{Stojnictcmspnncaprdt23} is therefore promising, but yet without a complete concretization, we then proceed by considering the recently developed fully lifted RDT (fl RDT) as an alternative. While the fl RDT is indeed a powerful juggernaut, it typically relies on heavy numerical evaluations. To avoid such heavy numerics, we here focus on a simplified, \emph{partially lifted}, variant and show that it allows for very neat, closed form, analytical capacity characterizations. Moreover, we obtain the concrete capacity bounds that \emph{universally} improve for \emph{any} $d$ over the best known ones of \cite{MitchDurb89}.
Submitted: Dec 13, 2023