Gumbel-Max Trick

The Gumbel-Max trick is a technique for sampling from a categorical distribution using the Gumbel distribution: independent Gumbel(0, 1) noise is added to each category's log-probability (or unnormalized logit), and the index of the largest perturbed value is returned as the sample. Because this reduces sampling to an argmax and requires no normalization of the probabilities, it is particularly useful when dealing with high-dimensional data or drawing many samples. Current research applies the trick to a range of machine learning models, including concept bottleneck models for explainable AI and large language models, where it is used for watermarking and improved decoding. These applications aim to improve model accuracy, robustness, and interpretability in fields such as natural language processing and computer vision.
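As a rough illustration of the sampling step, the sketch below draws from a categorical distribution by perturbing each logit with Gumbel(0, 1) noise and taking the argmax. It is a minimal standard-library version under stated assumptions, not code from any of the papers listed here; the function names `gumbel_max_sample` and `empirical_frequencies` are hypothetical and chosen only for this example.

```python
import math
import random
from collections import Counter


def gumbel_max_sample(logits, rng=random):
    """Draw one index distributed as softmax(logits) via the Gumbel-Max trick."""
    best_index, best_value = -1, -math.inf
    for i, logit in enumerate(logits):
        # Gumbel(0, 1) noise is -log(-log(U)) with U uniform on (0, 1).
        u = rng.random()
        while u == 0.0:  # random() can return exactly 0.0; avoid log(0)
            u = rng.random()
        perturbed = logit - math.log(-math.log(u))
        # Keep a running argmax over the perturbed logits.
        if perturbed > best_value:
            best_index, best_value = i, perturbed
    return best_index


def empirical_frequencies(probs, n_samples, seed=0):
    """Hypothetical helper: sample repeatedly and return observed frequencies."""
    rng = random.Random(seed)
    logits = [math.log(p) for p in probs]  # assumes every probability is > 0
    counts = Counter(gumbel_max_sample(logits, rng) for _ in range(n_samples))
    return [counts[i] / n_samples for i in range(len(probs))]


if __name__ == "__main__":
    target = [0.1, 0.2, 0.7]
    print(empirical_frequencies(target, 20_000))
```

With enough samples, the printed frequencies should approach the target probabilities. Since adding a constant to every logit does not change the argmax, the same function works on unnormalized scores without computing a softmax first.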

Papers