Direct Assessment
Direct assessment encompasses a broad range of techniques for evaluating diverse systems and phenomena, from the psychological traits of language models to the precision of 3D models and the performance of autonomous vehicles. Current research focuses on developing robust and reliable assessment methods, often using machine learning models such as VQ-VAEs, vision transformers, graph neural networks, and large language models (LLMs) for automated analysis and evaluation. These advances aim to improve the trustworthiness and reliability of AI systems, enhance diagnostic capabilities in healthcare, and optimize performance in engineering and scientific domains.
Papers
ER2Score: LLM-based Explainable and Customizable Metric for Assessing Radiology Reports with Reward-Control Loss
Yunyi Liu, Yingshu Li, Zhanyu Wang, Xinyu Liang, Lingqiao Liu, Lei Wang, Luping Zhou
Grounding-IQA: Multimodal Language Grounding Model for Image Quality Assessment
Zheng Chen, Xun Zhang, Wenbo Li, Renjing Pei, Fenglong Song, Xiongkuo Min, Xiaohong Liu, Xin Yuan, Yong Guo, Yulun Zhang
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment
Dongping Chen, Ruoxi Chen, Shu Pu, Zhaoyi Liu, Yanru Wu, Caixi Chen, Benlin Liu, Yue Huang, Yao Wan, Pan Zhou, Ranjay Krishna