Knowledge Distillation
Knowledge distillation is a machine learning technique that transfers knowledge from a large, complex "teacher" model to a smaller, more efficient "student" model, aiming to improve the student's performance and reduce computational costs. Current research focuses on improving distillation methods for various model architectures, including convolutional neural networks, transformers, and large language models, often incorporating techniques like parameter-efficient fine-tuning, multi-task learning, and data augmentation to enhance knowledge transfer. This approach is significant because it enables the deployment of high-performing models on resource-constrained devices and addresses challenges related to model size, training time, and privacy in diverse applications such as image captioning, speech processing, and medical diagnosis.
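At its core, the standard soft-target formulation trains the student on a weighted combination of the usual cross-entropy loss on ground-truth labels and a KL-divergence term that matches the student's temperature-softened output distribution to the teacher's. The snippet below is a minimal sketch of that loss under an assumed PyTorch setup; the function name, temperature, and weighting are illustrative choices, not taken from any of the papers listed here.

```python
# Minimal sketch of soft-target knowledge distillation (assumed PyTorch setup;
# the teacher/student models and hyperparameters are placeholders).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Weighted mix of hard-label cross-entropy and soft-target KL divergence."""
    # Standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    # KL divergence between temperature-softened teacher and student distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients are comparable to the hard loss
    return alpha * hard_loss + (1.0 - alpha) * soft_loss

# Typical use in a training step (teacher frozen, student trainable):
# with torch.no_grad():
#     teacher_logits = teacher(inputs)
# student_logits = student(inputs)
# loss = distillation_loss(student_logits, teacher_logits, labels)
```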
Papers - Page 49
Letz Translate: Low-Resource Machine Translation for Luxembourgish
Yewei Song, Saad Ezzini, Jacques Klein, Tegawende Bissyande, Clément Lefebvre, Anne Goujon
Distillation from Heterogeneous Models for Top-K Recommendation
SeongKu Kang, Wonbin Kweon, Dongha Lee, Jianxun Lian, Xing Xie, Hwanjo Yu
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification
Xuechen Liu, Md Sahidullah, Tomi Kinnunen
Towards domain generalisation in ASR with elitist sampling and ensemble knowledge distillation
Rehan Ahmad, Md Asif Jalal, Muhammad Umar Farooq, Anna Ollerenshaw, Thomas Hain
Distilled Reverse Attention Network for Open-world Compositional Zero-Shot Learning
Yun Li, Zhe Liu, Saurav Jha, Sally Cripps, Lina Yao
Practical Knowledge Distillation: Using DNNs to Beat DNNs
Chung-Wei Lee, Pavlos Athanasios Apostolopulos, Igor L. Markov
A Neural Span-Based Continual Named Entity Recognition Model
Yunan Zhang, Qingcai Chen
Personalized Decentralized Federated Learning with Knowledge Distillation
Eunjeong Jeong, Marios Kountouris