Pre-Trained Models
Pre-trained models are a cornerstone of modern machine learning: they leverage knowledge learned from massive datasets to improve efficiency and performance on downstream tasks. Current research focuses on adapting these models to diverse modalities (e.g., vision, language, audio) and tasks, often using transformer-based architectures together with techniques such as transfer learning, parameter-efficient fine-tuning, and contrastive learning. This approach substantially reduces the need for large task-specific datasets and computational resources, accelerating progress in fields including medical image analysis, speech recognition, and natural language processing. The resulting gains in accuracy, efficiency, and generalizability have broad implications for both scientific discovery and practical applications.
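To make the parameter-efficient fine-tuning idea concrete, here is a minimal NumPy sketch of a LoRA-style low-rank adapter (the technique behind several papers below, e.g. the low-rank adaptation work on anomalous sound detection). The base weight `W` stands in for one frozen layer of a pre-trained model; the dimensions, learning rate, and toy loss are illustrative assumptions, not taken from any of the listed papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pre-trained weight: stands in for one layer of a large model.
d_in, d_out, rank = 8, 6, 2
W = rng.normal(size=(d_out, d_in))

# LoRA-style low-rank adapter: only A and B are trained.
# B starts at zero, so the adapted layer initially equals the frozen one.
A = rng.normal(size=(rank, d_in)) * 0.01
B = np.zeros((d_out, rank))

def adapted_forward(x):
    """Forward pass through the frozen weight plus the low-rank update B @ A."""
    return (W + B @ A) @ x

x = rng.normal(size=d_in)
# With B = 0 the adapter is a no-op: output matches the pre-trained layer.
assert np.allclose(adapted_forward(x), W @ x)

# One toy gradient step on the adapter alone (squared-error loss against y);
# W is never updated, which is the whole point of the method.
y = rng.normal(size=d_out)
err = adapted_forward(x) - y
B -= 0.1 * np.outer(err, A @ x)    # dL/dB
A -= 0.1 * np.outer(B.T @ err, x)  # dL/dA

# Trainable parameters: rank * (d_in + d_out), far fewer than d_in * d_out.
print(A.size + B.size, "trainable vs", W.size, "frozen")
```

Because only `rank * (d_in + d_out)` parameters are updated, the same frozen backbone can serve many downstream tasks, each with its own small adapter.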
Papers
Transfer Learning for Passive Sonar Classification using Pre-trained Audio and ImageNet Models
Amirmohammad Mohammadi, Tejashri Kelhe, Davelle Carreiro, Alexandra Van Dine, Joshua Peeples
Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Tensorflow Pretrained Models
Keyu Chen, Ziqian Bi, Qian Niu, Junyu Liu, Benji Peng, Sen Zhang, Ming Liu, Ming Li, Xuanhe Pan, Jiawei Xu, Jinlang Wang, Pohsun Feng
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching
Yi Yuan, Xubo Liu, Haohe Liu, Mark D. Plumbley, Wenwu Wang
Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models
Xinhu Zheng, Anbai Jiang, Bing Han, Yanmin Qian, Pingyi Fan, Jia Liu, Wei-Qiang Zhang
Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training
Xingliang Lei, Yiwen Ye, Ziyang Chen, Minglei Shu, Yong Xia
NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework
Shuangchen Zhao, Changde Du, Hui Li, Huiguang He
RAW-Adapter: Adapting Pre-trained Visual Model to Camera RAW Images
Ziteng Cui, Tatsuya Harada
SeA: Semantic Adversarial Augmentation for Last Layer Features from Unsupervised Representation Learning
Qi Qian, Yuanhong Xu, Juhua Hu
ParGo: Bridging Vision-Language with Partial and Global Views
An-Lan Wang, Bin Shan, Wei Shi, Kun-Yu Lin, Xiang Fei, Guozhi Tang, Lei Liao, Jingqun Tang, Can Huang, Wei-Shi Zheng
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models
Le Xue, Manli Shu, Anas Awadalla, Jun Wang, An Yan, Senthil Purushwalkam, Honglu Zhou, Viraj Prabhu, Yutong Dai, Michael S Ryoo, Shrikant Kendre, Jieyu Zhang, Can Qin, Shu Zhang, Chia-Chih Chen, Ning Yu, Juntao Tan, Tulika Manoj Awalgaonkar, Shelby Heinecke, Huan Wang, Yejin Choi, Ludwig Schmidt, Zeyuan Chen, Silvio Savarese, Juan Carlos Niebles, Caiming Xiong, Ran Xu
FourierKAN outperforms MLP on Text Classification Head Fine-tuning
Abdullah Al Imran, Md Farhan Ishmam
PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
Xiangdong Zhang, Shaofeng Zhang, Junchi Yan