Pre-Trained Models
Pre-trained models are a cornerstone of modern machine learning: they transfer knowledge learned from massive datasets to improve efficiency and performance on downstream tasks. Current research focuses on adapting these models to diverse modalities (e.g., vision, language, audio) and tasks, typically with transformer-based architectures combined with techniques such as transfer learning, parameter-efficient fine-tuning, and contrastive learning. This approach significantly reduces the need for large task-specific datasets and compute, accelerating progress in fields such as medical image analysis, speech recognition, and natural language processing. The resulting gains in accuracy, efficiency, and generalizability have broad implications for both scientific discovery and practical applications.
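As a rough illustration of the transfer-learning recipe described above, the sketch below freezes an ImageNet-pretrained ResNet-18 backbone and trains only a new task-specific head. The model choice, class count, and hyperparameters are illustrative assumptions, not taken from any paper listed here.

```python
# Minimal transfer-learning sketch: frozen pre-trained backbone + new head.
# Assumptions: torchvision ResNet-18 backbone, a hypothetical 10-class task.
import torch
import torch.nn as nn
import torchvision.models as models

# Load a backbone pre-trained on ImageNet.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Parameter-efficient adaptation: freeze all pre-trained weights...
for param in backbone.parameters():
    param.requires_grad = False

# ...and replace the classifier with a trainable head for the downstream task.
backbone.fc = nn.Linear(backbone.fc.in_features, 10)

optimizer = torch.optim.AdamW(
    (p for p in backbone.parameters() if p.requires_grad), lr=1e-3
)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 10, (8,))
loss = criterion(backbone(images), labels)
loss.backward()
optimizer.step()
```

Only the new head's parameters receive gradients, which keeps the memory and compute cost of adaptation small compared with full fine-tuning.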
Papers
AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features
Ruochen Zhang, Hyeung-Sik Choi, Dongwook Jung, Phan Huy Nam Anh, Sang-Ki Jeong, Zihao Zhu
LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging
Shubhr Singh, Emmanouil Benetos, Huy Phan, Dan Stowell
Enhancing Transfer Learning for Medical Image Classification with SMOTE: A Comparative Study
Md. Zehan Alam, Tonmoy Roy, H.M. Nahid Kawsar, Iffat Rimi
YAD: Leveraging T5 for Improved Automatic Diacritization of Yorùbá Text
Akindele Michael Olawole, Jesujoba O. Alabi, Aderonke Busayo Sakpere, David I. Adelani
VisTabNet: Adapting Vision Transformers for Tabular Data
Witold Wydmański, Ulvi Movsum-zada, Jacek Tabor, Marek Śmieja
Analysis of Transferred Pre-Trained Deep Convolution Neural Networks in Breast Masses Recognition
Qusay Shihab Hamad, Hussein Samma, Shahrel Azmin Suandi
Personalized Large Vision-Language Models
Chau Pham, Hoang Phan, David Doermann, Yunjie Tian
Towards Foundation Models on Graphs: An Analysis on Cross-Dataset Transfer of Pretrained GNNs
Fabrizio Frasca, Fabian Jogl, Moshe Eliasof, Matan Ostrovsky, Carola-Bibiane Schönlieb, Thomas Gärtner, Haggai Maron
NILE: Internal Consistency Alignment in Large Language Models
Minda Hu, Qiyuan Zhang, Yufei Wang, Bowei He, Hongru Wang, Jingyan Zhou, Liangyou Li, Yasheng Wang, Chen Ma, Irwin King
Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech Recognition
Keqi Deng, Jinxi Guo, Yingyi Ma, Niko Moritz, Philip C. Woodland, Ozlem Kalinli, Mike Seltzer
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao, Shiqian Su, Xizhou Zhu, Chenyu Zhang, Zhe Chen, Jiawen Liu, Wenhai Wang, Lewei Lu, Gao Huang, Yu Qiao, Jifeng Dai
On the Suitability of pre-trained foundational LLMs for Analysis in German Legal Education
Lorenz Wendlinger, Christian Braun, Abdullah Al Zubaer, Simon Alexander Nonn, Sarah Großkopf, Christofer Fellicious, Michael Granitzer