Token-Level Gradient

Token-level gradient analysis examines the gradients associated with individual tokens in a neural network, with the broad goals of improving model efficiency, fairness, and robustness. Current research applies the idea across diverse areas, including multilingual language modeling (optimizing subword tokenization), large vision-language models (mitigating optimization conflicts within Mixture-of-Experts architectures), and the adversarial robustness of vision transformers (via gradient regularization). Together, these investigations aim to enhance performance, reduce computational cost, and address biases in downstream tasks, contributing to more efficient and reliable AI systems.
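
As a concrete illustration, the sketch below computes per-token gradient norms in PyTorch: gradients are retained on the token embeddings so each input token's influence on the loss can be inspected individually. The toy model, vocabulary size, and random token IDs are assumptions for demonstration only, not drawn from any of the papers below.

```python
# Minimal sketch of token-level gradient analysis, assuming a toy
# PyTorch setup (model, vocabulary size, and token IDs below are
# illustrative placeholders).
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, embed_dim, seq_len = 100, 16, 8

embedding = nn.Embedding(vocab_size, embed_dim)
model = nn.Sequential(
    nn.Linear(embed_dim, 32),
    nn.ReLU(),
    nn.Linear(32, vocab_size),
)

token_ids = torch.randint(0, vocab_size, (1, seq_len))
targets = torch.randint(0, vocab_size, (1, seq_len))

# Retain gradients on the embedded tokens so each token's gradient
# can be read off after the backward pass.
embeds = embedding(token_ids)
embeds.retain_grad()

logits = model(embeds)
loss = nn.functional.cross_entropy(
    logits.view(-1, vocab_size), targets.view(-1)
)
loss.backward()

# Per-token gradient norm: one scalar per input token, a common rough
# measure of how strongly each token influences the loss.
per_token_grad_norm = embeds.grad.norm(dim=-1)  # shape: (1, seq_len)
print(per_token_grad_norm)
```

In practice these norms are computed on a full model and dataset, and tokens with unusually large or small gradient contributions are then candidates for reweighting, regularization, or tokenizer adjustments of the kind surveyed above.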

Papers