Architectural Bias
Architectural bias in deep learning refers to the inherent properties of a model's structure that shape its learning dynamics and generalization behavior. Current research focuses on understanding and manipulating these biases to improve performance on long sequences, graph data, and relational reasoning, employing techniques such as Hamiltonian dynamics in graph networks and structured initialization in vision transformers. This work matters for improving the robustness and efficiency of deep learning models, particularly for out-of-distribution generalization and data-efficient learning across diverse applications.
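A concrete example of an architectural bias is weight sharing in convolutional layers, which makes the layer translation-equivariant, whereas a fully connected layer has no such constraint. The minimal NumPy sketch below (an illustration, not drawn from any of the papers above) checks this directly with a circular 1-D convolution:

```python
import numpy as np

def circular_conv1d(x, w):
    """1-D convolution with circular padding: out[i] = sum_j w[j] * x[(i + j) % n].
    The same weights w are applied at every position -- this weight sharing is
    the architectural bias that produces translation equivariance."""
    n, k = len(x), len(w)
    return np.array([sum(w[j] * x[(i + j) % n] for j in range(k))
                     for i in range(n)])

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # toy input signal
w = rng.normal(size=3)   # shared convolution kernel

shift = 2
lhs = circular_conv1d(np.roll(x, shift), w)   # convolve the shifted input
rhs = np.roll(circular_conv1d(x, w), shift)   # shift the convolved output
assert np.allclose(lhs, rhs)                  # equivariance holds exactly

# A dense layer with unconstrained weights lacks this bias:
W = rng.normal(size=(8, 8))
dense = lambda v: W @ v
assert not np.allclose(dense(np.roll(x, shift)), np.roll(dense(x), shift))
```

The point of the sketch is that the bias is a property of the architecture itself: equivariance holds for any kernel `w` and any input, before any training, while the dense map satisfies it only by coincidence.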