Capability Evolution
Capability evolution in artificial intelligence concerns understanding and extending what AI models, particularly large language models (LLMs), can do across diverse tasks. Current research emphasizes measuring these capabilities with new benchmarks and evaluation frameworks, often probing performance under incomplete information or with limited data, and examining how factors such as data quality and model architecture (e.g., transformers, state space models) shape them. This work supports responsible AI development by informing the creation of more robust and reliable systems, with applications ranging from robotics and software engineering to education and scientific research.
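Benchmark-style capability evaluations of this kind ultimately reduce to scoring model outputs against labeled references. As a minimal, hypothetical sketch (not code from any of the papers below), the snippet computes micro- and macro-averaged F1 for multi-label emotion predictions, assuming gold labels and model predictions are encoded as binary indicator matrices; the emotion label set and the use of scikit-learn here are illustrative assumptions.

```python
# Illustrative sketch only: not taken from the papers listed below.
# Assumes multi-label emotion outputs are encoded as binary indicator
# matrices (one column per emotion), a common setup for such benchmarks.
import numpy as np
from sklearn.metrics import f1_score

EMOTIONS = ["anger", "fear", "joy", "sadness", "surprise"]  # hypothetical label set

# y_true / y_pred: shape (n_examples, n_labels); 1 = emotion present.
y_true = np.array([[1, 0, 0, 1, 0],
                   [0, 0, 1, 0, 0],
                   [0, 1, 0, 1, 1]])
y_pred = np.array([[1, 0, 0, 0, 0],
                   [0, 0, 1, 0, 0],
                   [0, 1, 0, 1, 0]])

# Micro-F1 pools all (example, label) decisions, so frequent emotions dominate;
# macro-F1 averages per-label F1, so rare emotions count equally.
print("micro-F1:", f1_score(y_true, y_pred, average="micro"))
print("macro-F1:", f1_score(y_true, y_pred, average="macro"))
```

Reporting both averages is a common design choice for multi-label benchmarks, since the two metrics diverge precisely when label frequencies are imbalanced.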
Papers
Evaluating the Capabilities of Large Language Models for Multi-label Emotion Understanding
Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Grigori Sidorov, Dietrich Klakow, Philipp Slusallek, Olga Kolesnikova, Seid Muhie Yimam
Subversion Strategy Eval: Evaluating AI's stateless strategic capabilities against control protocols
Alex Mallen, Charlie Griffin, Alessandro Abate, Buck Shlegeris
Post-Hoc MOTS: Exploring the Capabilities of Time-Symmetric Multi-Object Tracking
Gergely Szabó, Zsófia Molnár, András Horváth
Unseen Horizons: Unveiling the Real Capability of LLM Code Generation Beyond the Familiar
Yuanliang Zhang, Yifan Xie, Shanshan Li, Ke Liu, Chong Wang, Zhouyang Jia, Xiangbing Huang, Jie Song, Chaopeng Luo, Zhizheng Zheng, Rulin Xu, Yitong Liu, Si Zheng, Xiangke Liao