Pre-Trained Large Language Models
Pre-trained large language models (LLMs) are large neural networks trained on vast text corpora to generate human-quality text and perform a wide range of language tasks. Current research focuses on improving their efficiency (e.g., through parameter-efficient fine-tuning, low-rank decomposition, and federated learning), addressing privacy concerns (e.g., via federated learning and proxy tuning), and enhancing their capabilities in specific domains (e.g., via knowledge injection and prompt engineering). These advances matter because they broaden the accessibility and applicability of LLMs across diverse fields, from biomedical question answering to time series analysis, while mitigating computational and ethical challenges.
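To make the idea of parameter-efficient fine-tuning via low-rank decomposition concrete, the sketch below shows a minimal LoRA-style adapter in PyTorch: the pre-trained linear layer is frozen and only a small low-rank update is trained. This is an illustrative assumption for exposition, not the implementation used in the papers listed below; the class name LoRALinear and the rank/alpha values are hypothetical choices.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update:
    y = W x + scale * B(A x), with rank(A), rank(B) << in/out dims."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep pre-trained weights fixed
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Usage: wrap a projection from a pre-trained model, then fine-tune only the
# adapter parameters (self.lora_a and self.lora_b).
layer = LoRALinear(nn.Linear(768, 768), rank=8)
x = torch.randn(2, 768)
print(layer(x).shape)  # torch.Size([2, 768])

Because only the two small adapter matrices receive gradients, the number of trainable parameters drops from in_features * out_features to roughly rank * (in_features + out_features), which is the source of the efficiency gains referenced above.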
Papers
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen, Shengju Qian, Haotian Tang, Xin Lai, Zhijian Liu, Song Han, Jiaya Jia
On the Relationship between Skill Neurons and Robustness in Prompt Tuning
Leon Ackermann, Xenia Ohmer
Focal Inferential Infusion Coupled with Tractable Density Discrimination for Implicit Hate Speech Detection
Sarah Masud, Ashutosh Bajpai, Tanmoy Chakraborty