Recognition Rate
Recognition rate, the proportion of objects or patterns that a system identifies correctly, is a central metric across diverse fields, from biometric security to image analysis. Current research aims to improve recognition rates with deep learning architectures such as Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), and recurrent models. These are often combined with transfer learning, multi-modal fusion, and generative models, particularly for challenging inputs such as low-resolution images or noisy data. More reliable and efficient recognition matters for applications including automated surveillance, medical diagnosis, and human-computer interaction.
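As a rough illustration of the metric itself, recognition rate can be computed as the number of correct identifications divided by the total number of samples. The sketch below assumes a simple single-label setting. The function name `recognition_rate` and the sample labels are hypothetical and are not taken from any of the papers listed here.

```python
from typing import Sequence


def recognition_rate(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the fraction of samples whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute a rate over zero samples")
    # Count positions where the prediction matches the ground truth.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels: three of the four predictions are correct.
predicted = ["cat", "dog", "cat", "bird"]
actual = ["cat", "dog", "bird", "bird"]
print(recognition_rate(predicted, actual))  # 0.75
```

Individual papers may report task-specific variants (for example per-class or character-level rates), so this plain ratio is only a baseline definition.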
Papers
Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents
Gavin Greif, Niclas Griesshaber, Robin Greif
University of Oxford ● University of Mannheim
Spatiotemporal Attention Learning Framework for Event-Driven Object Recognition
Tiantian Xie, Pengpai Wang, Rosa H. M. Chan
City University of Hong Kong
BeMERC: Behavior-Aware MLLM-based Framework for Multimodal Emotion Recognition in Conversation
Yumeng Fu, Junjie Wu, Zhongjie Wang, Meishan Zhang, Yulin Wu, Bingquan Liu
Texture or Semantics? Vision-Language Models Get Lost in Font Recognition
Zhecheng Li, Guoxian Song, Yujun Cai, Zhen Xiong, Junsong Yuan, Yiwei Wang
University of California ● ByteDance ● The University of Queensland ● University of Southern California ● University at Buffalo
GatedxLSTM: A Multimodal Affective Computing Approach for Emotion Recognition in Conversations
Yupei Li, Qiyang Sun, Sunil Munthumoduku Krishna Murthy, Emran Alturki, Björn W. Schuller
Imperial College London ● Technical University of Munich ● relAI – the Konrad Zuse School of Excellence in Reliable AI ● Munich Data Science Institute ...+2
Disentangled Source-Free Personalization for Facial Expression Recognition with Neutral Target Data
Masoumeh Sharafi, Emma Ollivier, Muhammad Osama Zeeshan, Soufiane Belharbi, Marco Pedersoli, Alessandro Lameiras Koerich, Simon Bacon +1
ETS Montreal ● Concordia University ● CIUSSS Nord-de-l’Ile-de-Montréal
Synthetic Data Augmentation for Cross-domain Implicit Discourse Relation Recognition
Frances Yung, Varsha Suresh, Zaynab Reza, Mansoor Ahmad, Vera Demberg
Saarland University
Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition
Yifei Zhang, Chang Liu, Jin Wei, Xiaomeng Yang, Yu Zhou, Can Ma, Xiangyang Ji
Chinese Academy of Sciences ● Nankai University ● University of Chinese Academy of Sciences ● Tsinghua University ● Lenovo Research ● Nor...+1
PP-FormulaNet: Bridging Accuracy and Efficiency in Advanced Formula Recognition
Hongen Liu, Cheng Cui, Yuning Du, Yi Liu, Gang Pan
Baidu Inc. ● Tianjin University
Neural Edge Histogram Descriptors for Underwater Acoustic Target Recognition
Atharva Agashe, Davelle Carreiro, Alexandra Van Dine, Joshua Peeples
Texas A&M University ● Massachusetts Institute of Technology Lincoln Laboratory
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition
Shristi Das Biswas, Efstathia Soufleri, Arani Roy, Kaushik Roy
Purdue University