Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures, including CNNs and RNNs; and developing methods that enhance generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and improved training strategies (e.g., few-shot learning, meta-learning). These advances are crucial for building reliable, adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
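Of the techniques named above, consistency regularization is perhaps the simplest to illustrate concretely. Below is a minimal sketch in PyTorch, assuming a generic classifier and simple Gaussian input noise as the augmentation; the names `model`, `consistency_loss`, and `noise_std` are illustrative and are not drawn from any of the papers listed on this page.

```python
# A minimal sketch of consistency regularization (illustrative, not the
# method of any specific paper below). The idea: penalize divergence
# between predictions on an input and a perturbed copy of it, which
# encourages locally smooth decision boundaries and tends to improve
# generalization, especially in semi-supervised settings.
import torch
import torch.nn.functional as F

def consistency_loss(model, x, noise_std=0.1):
    """KL divergence between predictions on clean and augmented inputs.

    The clean prediction is detached and treated as the target, so the
    gradient only pushes the augmented prediction toward it.
    """
    logits_clean = model(x)
    x_aug = x + noise_std * torch.randn_like(x)  # simple Gaussian augmentation
    logits_aug = model(x_aug)
    return F.kl_div(
        F.log_softmax(logits_aug, dim=-1),
        F.softmax(logits_clean.detach(), dim=-1),
        reduction="batchmean",
    )

if __name__ == "__main__":
    # Toy classifier and batch, purely for demonstration.
    model = torch.nn.Sequential(
        torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 4)
    )
    x = torch.randn(32, 8)
    y = torch.randint(0, 4, (32,))
    # Standard supervised loss plus a weighted consistency term.
    total = F.cross_entropy(model(x), y) + 0.5 * consistency_loss(model, x)
    total.backward()
```

In practice the augmentation (noise here) would be replaced by whatever transformation the task supports, e.g. crops or flips for images, and the consistency term can be applied to unlabeled data as well, since it requires no labels.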
Papers
Classifying States of the Hopfield Network with Improved Accuracy, Generalization, and Interpretability
Weak-to-Strong Generalization Even in Random Feature Networks, Provably
REAct: Rational Exponential Activation for Better Learning and Generalization in PINNs
CGMatch: A Different Perspective of Semi-supervised Learning
Frankenstein Optimizer: Harnessing the Potential by Revisiting Optimization Tricks
LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM
Partition Tree Weighting for Non-Stationary Stochastic Bandits
ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration
GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation
Investigating Generalization of One-shot LLM Steering Vectors