Large Pre Trained Model

Large pre-trained models (LPMs) are massive neural networks trained on enormous datasets, aiming to achieve strong generalization across diverse downstream tasks with minimal further training. Current research emphasizes efficient fine-tuning techniques, such as prompt engineering, low-rank adaptation (e.g., LoRA, SVFit), and sparse parameter updates, to reduce computational costs and improve model adaptability while addressing issues like overfitting and catastrophic forgetting. This field is significant due to LPMs' transformative impact on various applications, from natural language processing and computer vision to robotics and education, driving advancements in both theoretical understanding and practical deployment of AI systems.
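
As a rough illustration of the low-rank adaptation idea mentioned above, the sketch below keeps a pre-trained weight matrix W frozen and adds a trainable low-rank update, giving an effective weight of W + (alpha / r) * B A. It is a minimal pure-Python sketch, not the implementation from any listed paper: the helper names (matmul, lora_weight, trainable_params), the toy dimensions, and the alpha value are all assumptions chosen for illustration. B starts at zero and A at small random values, following the initialization commonly described for LoRA, so the adapted model initially matches the pre-trained one.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha):
    """Return the effective weight W + (alpha / r) * (B @ A)."""
    r = len(a)                    # rank r = number of rows of A
    delta = matmul(b, a)          # (d_out x r) @ (r x d_in) -> (d_out x d_in)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Number of trainable values in the low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


# Toy sizes (hypothetical); real models use far larger d_out and d_in.
d_out, d_in, r = 6, 4, 2
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen pre-trained weight
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # trainable, small random init
b = [[0.0] * r for _ in range(d_out)]                                # trainable, zero init

w_eff = lora_weight(w, a, b, alpha=4.0)

full_count = d_out * d_in
lora_count = trainable_params(d_out, d_in, r)
print(f"full fine-tuning updates {full_count} values; the low-rank update trains {lora_count}")
```

At these toy sizes the saving is small, but the trainable count grows as r * (d_in + d_out) rather than d_in * d_out, which is where the reduction in fine-tuning cost comes from when r is much smaller than either dimension.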

Papers

March 28, 2024

Model Stock: All we need is just a few fine-tuned models
Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han
Full Model Pre Trained Model Large Pre Trained Model Fine Tuned Model Fine Tuned Weight

March 21, 2024

Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han, Chao Gao, Jinyang Liu, Jeff Zhang, Sai Qian Zhang
Comprehensive Survey Parameter Efficient Fine Tuning Large Model Large Pre Trained Model Computational Cost Computational Overhead

March 19, 2024

BiLoRA: A Bi-level Optimization Framework for Overfitting-Resilient Low-Rank Adaptation of Large Pre-trained Models
Rushi Qiang, Ruiyi Zhang, Pengtao Xie
Fine Tuning Adaptation Concern Large Pre Trained Model Large Scale Pre Trained Model Bi Level Optimization Fine Tuning Approach

March 11, 2024

March 7, 2024

A Survey on Human-AI Teaming with Large Pre-Trained Models
Vanshika Vats, Marzia Binta Nizam, Minghao Liu, Ziyuan Wang, Richard Ho, Mohnish Sai Prasad, Vincent Titterton, Sai Venkat Malreddy, Riya Aggarwal, Yanwen Xu, Lei Ding, Jay Mehta, Nathan Grinnell, Li Liu, Sijia Zhong, Devanathan Nallur Gandamani, Xinyi Tang, Rohan Ghosalkar, Celeste Shen, Rachel Shen, Nafisa Hussain, Kesav Ravichandran, James Davis
Artificial Intelligence Timely Survey Artificial Intelligence System Large Pre Trained Model Human AI Collaborative Intelligence

March 1, 2024

Fine-tuning with Very Large Dropout
Jianyu Zhang, Léon Bottou
Fine Tuning Deep Network Large Pre Trained Model Structured Dropout High Quality Representation Model Soup Distribution Performance

February 29, 2024

WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset
Jiantao Qiu, Haijun Lv, Zhenjiang Jin, Rui Wang, Wenchang Ning, Jia Yu, ChaoBin Zhang, Zhenxiang Li, Pei Chu, Yuan Qu, Jin Shi, Lindong Lu, Runyu Peng, Zhiyuan Zeng, Huanze Tang, Zhikai Lei, Jiawei Hong, Keyu Chen, Zhaoye Fei, Ruiliang Xu, Wei Li, Zhongying Tu, Lin Dahua, Yu Qiao, Hang Yan, Conghui He
Language Model Large Pre Trained Model High Quality Data

February 28, 2024

HOP to the Next Tasks and Domains for Continual Learning in NLP
Umberto Michieli, Mete Ozay
Continual Learning NLP Field Natural Language Processing Task New Task Large Pre Trained Model Domain Name NLP Application Basin Hopping Different Approach

February 6, 2024

OVOR: OnePrompt with Virtual Outlier Regularization for Rehearsal-Free Class-Incremental Learning
Wei-Cheng Huang, Chun-Fu Chen, Hsiang Hsu
Class Incremental Learning Large Pre Trained Model Rehearsal Based Outlier Synthesis Rehearsal Free Class Incremental Learning

January 8, 2024

January 2, 2024

Freeze the backbones: A Parameter-Efficient Contrastive Approach to Robust Medical Vision-Language Pre-training
Jiuming Qin, Che Liu, Sibo Cheng, Yike Guo, Rossella Arcucci
Vision Language Model Pre Trained Large Pre Trained Model Lightweight Backbone Medical Image Representation

December 23, 2023

INFAMOUS-NeRF: ImproviNg FAce MOdeling Using Semantically-Aligned Hypernetworks with Neural Radiance Fields
Andrew Hou, Feng Liu, Zhiyuan Ren, Michel Sarkis, Ning Bi, Yiying Tong, Xiaoming Liu
Neural Radiance Field Large Pre Trained Model NeRF SLAM Facial Prior Morphable Face Model

November 29, 2023

Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines
Hamed Damirchi, Cristian Rodríguez-Opazo, Ehsan Abbasnejad, Damien Teney, Javen Qinfeng Shi, Stephen Gould, Anton van den Hengel
Pre Trained Model Retrieval Augmented Large Pre Trained Model Multi Modal Learning Search Engine Zero Shot Retrieval Robust Retrieval

November 21, 2023

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks
Samyak Jain, Robert Kirk, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Edward Grefenstette, Tim Rocktäschel, David Scott Krueger
Fine Tuning Mixed Effect New Task Large Pre Trained Model Hidden Knowledge Model Capability

November 19, 2023

Large Pre-trained time series models for cross-domain Time series analysis tasks
Harshavardhan Kamarthi, B. Aditya Prakash
Time Series Large Pre Trained Model Sequential Model General Time Series

November 15, 2023

End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions
Libo Qin, Wenbo Pan, Qiguang Chen, Lizi Liao, Zhou Yu, Yue Zhang, Wanxiang Che, Min Li
Deep Neural Network Timely Survey NCD Method Future Direction Dialogue Utterance New Task Task Oriented Large Pre Trained Model Downstream Dialogue Task

November 9, 2023

Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks
Haoyi Duan, Yan Xia, Mingze Zhou, Li Tang, Jieming Zhu, Zhou Zhao
Audio Visual Large Pre Trained Model Cross Modal Interaction Single Modality Multi Modal Prompt Audio Visual Task

Semantic Residual Prompts for Continual Learning

Learning with Noisy Foundation Models

A Segmentation Foundation Model for Diverse-type Tumors

Empirical Analysis of Efficient Fine-Tuning Methods for Large Pre-Trained Language Models

Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series
