Paper ID: 2410.19195

Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models

Yue Li, Zhixue Zhao, Carolina Scarton

In-context learning (ICL) performance is known to be sensitive to the prompt design, yet the impact of class label options in zero-shot classification has been largely overlooked. This study presents the first comprehensive empirical study investigating how label option (e.g., lexical choice, order, and elaboration) influences zero-shot ICL classification performance. Our findings reveal that lexical choices for label names (e.g., agree vs.support in stance classification) play an important role, with effects also linked to label orders. An analysis of the model internal states further shows that optimal label names tend to activate fewer outlier neurons in the feed forward network. Based on this observation, we propose Label set Optimization via Activation Distribution kurtosiS (LOADS), a post-hoc approach requiring no gradient propagation. LOADS not only demonstrates effectiveness with only 100 unlabelled samples across different model types and sizes, but also shows cross-lingual transferability.

Submitted: Oct 24, 2024