Direct Assessment

Direct assessment encompasses a broad range of techniques for evaluating diverse systems and phenomena, from the psychological traits of language models to the precision of 3D models and the performance of autonomous vehicles. Current research focuses on developing robust and reliable assessment methods, often employing machine learning models like VQ-VAEs, various neural networks (including vision transformers and graph neural networks), and large language models (LLMs) for automated analysis and evaluation. These advancements are crucial for improving the trustworthiness and reliability of AI systems, enhancing diagnostic capabilities in healthcare, and optimizing performance in various engineering and scientific domains.

Papers

April 8, 2024

360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System
Shen Gao, Hao Li, Chengrui Huang, Quan Tu, Zhiliang Tian, Minlie Huang, Shuo Shang
Large Language Model Fine Grained Multi Agent System Direct Assessment Experience Pool

April 4, 2024

Uncertainty in Language Models: Assessment through Rank-Calibration
Xinmeng Huang, Shuo Li, Mengxin Yu, Matteo Sesia, Hamed Hassani, Insup Lee, Osbert Bastani, Edgar Dobriban
Language Model Language Generation High Uncertainty Anticipation Direct Assessment Confidence Measure Semantic Entropy

March 26, 2024

Assessment of Multimodal Large Language Models in Alignment with Human Values
Zhelun Shi, Zhipin Wang, Hongxing Fan, Zaibin Zhang, Lijun Li, Yongting Zhang, Zhenfei Yin, Lu Sheng, Yu Qiao, Jing Shao
Multimodal Large Language Model Alignment Problem Direct Assessment Human Annotated Human Value Evaluation Datasets

March 15, 2024

The AI Assessment Scale (AIAS) in action: A pilot implementation of GenAI supported assessment
Leon Furze, Mike Perkins, Jasper Roe, Jason MacVaugh
Artificial Intelligence Generative Artificial Intelligence Direct Assessment Pilot Study GenAI Integration Educational Assessment Assessment Scale Assessment Design

February 22, 2024

SpanSeq: Similarity-based sequence data splitting method for improved development and assessment of deep learning projects
Alfred Ferrer Florensa, Jose Juan Almagro Armenteros, Henrik Nielsen, Frank Møller Aarestrup, Philip Thomas Lanken Conradsen Clausen
Development Activity Direct Assessment Biological Sequence Computational Biology Microbial Genome Data Splitting Span Identification

February 15, 2024

February 5, 2024

A Computational Model for the Assessment of Mutual Intelligibility Among Closely Related Languages
Jessica Nieder, Johann-Mattis List
Direct Assessment Language Pair Computational Model Different Language Language Similarity Psycholinguistic Experiment Mutual Intelligibility

January 28, 2024

Assessment of Autism and ADHD: A Comparative Analysis of Drawing Velocity Profiles and the NEPSY Test
S. Fortea-Sevilla, A. Garcia-Sosa, P. Morales-Almeida, C. Carmona-Duarte
Direct Assessment Autism Spectrum Disorder Attention Deficit Hyperactivity Disorder

January 23, 2024

Assessment of Sports Concussion in Female Athletes: A Role for Neuroinformatics?
Rachel Edelstein, Sterling Gutterman, Benjamin Newman, John Darrell Van Horn
Integral Role Direct Assessment Computational Neuroscience Brain Structure Gender Difference Sport Related Concussion

January 9, 2024

An Assessment on Comprehending Mental Health through Large Language Models
Mihael Arcan, David-Paul Niland, Fionn Delahunty
Large Language Model Deep Learning Model Natural Language Mental Health Direct Assessment

December 21, 2023

D-STGCNT: A Dense Spatio-Temporal Graph Conv-GRU Network based on transformer for assessment of patient physical rehabilitation
Youssef Mourchid, Rim Slama
Transformer Based Direct Assessment Physical Rehabilitation Rehabilitation Exercise Skeleton Dataset

December 19, 2023

Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment
Lingling Xu, Haoran Xie, Si-Zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
Large Language Model Language Model Parameter Efficient Fine Tuning Pretrained Language Model Direct Assessment Critical Review Many Natural Language Processing

December 13, 2023

ConFormer: A Novel Collection of Deep Learning Models to Assist Cardiologists in the Assessment of Cardiac Function
Ethan Thomas, Salman Aslam
Deep Learning Model Direct Assessment Heart Rate One Pas Multiple Conformer Ejection Fraction Cardiac Function

December 8, 2023

Development and Assessment of Autonomous Vehicles in Both Fully Automated and Mixed Traffic Conditions
Ahmed Abdelrahman
Autonomous Vehicle Development Activity Direct Assessment Driver Behavior Mixed Traffic Vehicle 2 Vehicle

December 2, 2023

Kattis vs. ChatGPT: Assessment and Evaluation of Programming Tasks in the Age of Artificial Intelligence
Nora Dunder, Saga Lundborg, Olga Viberg, Jacqueline Wong
Artificial Intelligence Global Evaluation Generative AI ChatGPT Generated Conversation Direct Assessment Programming Task Computer Science Programming Education Computer Science Education

November 27, 2023

chatGPT for generating questions and assessments based on accreditations
Rania Anwar Aboalela
Generative AI ChatGPT Generated Conversation Generative Artificial Intelligence Yes No Question Direct Assessment Artificial Intelligence Technique Exam Paper Generation

November 23, 2023

Assessment of Deep Learning Segmentation for Real-Time Free-Breathing Cardiac Magnetic Resonance Imaging at Rest and Under Exercise Stress
Martin Schilling, Christina Unterberg-Buchwald, Joachim Lotz, Martin Uecker
Direct Assessment Cardiac Magnetic Resonance Human Stress Rest RESTAD NAP

November 15, 2023

Additional titles (listed without dates or authors)

ViGEO: an Assessment of Vision GNNs in Earth Observation

GPT-4's assessment of its performance in a USMLE-based case study

How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities

DLAS: An Exploration and Assessment of the Deep Learning Acceleration Stack