Capability Evolution
Capability evolution in artificial intelligence focuses on understanding and enhancing the abilities of AI models, particularly large language models (LLMs), across diverse tasks. Current research emphasizes evaluating these capabilities through novel benchmarks and frameworks, often analyzing model performance under incomplete information or with limited data, and exploring the role of factors such as data quality and model architecture (e.g., transformers, state space models). This work is crucial for responsible AI development, informing the creation of more robust and reliable systems with applications ranging from robotics and software engineering to education and scientific research.
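To make the benchmark-driven evaluation mentioned above concrete, the minimal sketch below scores a model against a small task suite. The `model_answer` function and the toy benchmark items are hypothetical stand-ins for a real LLM API call and a real capability benchmark; the papers listed here each define their own tasks and metrics.

```python
# Minimal sketch of a capability-evaluation loop (illustrative only).
# `model_answer` is a hypothetical stand-in for any LLM call.
from typing import Callable

# Toy benchmark: each item pairs a task prompt with its expected answer.
BENCHMARK = [
    {"prompt": "What is 12 * 7?", "expected": "84"},
    {"prompt": "Name the chemical symbol for sodium.", "expected": "Na"},
]

def model_answer(prompt: str) -> str:
    """Placeholder for a real model call (e.g., an LLM API request)."""
    return "84" if "12 * 7" in prompt else "Na"

def evaluate(model: Callable[[str], str], benchmark: list) -> float:
    """Return the fraction of benchmark items the model answers correctly."""
    correct = sum(
        1 for item in benchmark
        if model(item["prompt"]).strip() == item["expected"]
    )
    return correct / len(benchmark)

if __name__ == "__main__":
    print(f"Accuracy: {evaluate(model_answer, BENCHMARK):.2f}")
```

In practice, the same loop structure is extended with larger task suites, partial-information settings, and per-category breakdowns rather than a single accuracy score.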
Papers
Evaluating the capability of large language models to personalize science texts for diverse middle-school-age learners
Michael Vaccaro, Mikayla Friday, Arash Zaghi
Exploring Capability-Based Control Distributions of Human-Robot Teams Through Capability Deltas: Formalization and Implications
Nils Mandischer, Marcel Usai, Frank Flemisch, Lars Mikelsons
CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models
Shengye Wan, Cyrus Nikolaidis, Daniel Song, David Molnar, James Crnkovich, Jayson Grace, Manish Bhatt, Sahana Chennabasappa, Spencer Whitman, Stephanie Ding, Vlad Ionescu, Yue Li, Joshua Saxe
Using LLMs to Establish Implicit User Sentiment of Software Desirability
Sherri Weitl-Harms, John D. Hastings, Jonah Lum