CUDA Kernel
CUDA kernels are functions executed in parallel across many threads on NVIDIA GPUs, written to maximize the performance of computationally intensive tasks. Current research focuses on improving kernel efficiency for large language models (LLMs), neural network inference, and other demanding workloads, often employing techniques such as memory access optimization, instruction scheduling, and quantization. These advances yield significant speedups and reduced resource consumption, benefiting AI, scientific computing, and graphics rendering by enabling faster training and inference of complex models.
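For readers unfamiliar with what a CUDA kernel looks like, the sketch below shows a minimal grid-stride SAXPY kernel, a common introductory pattern rather than code from any specific paper discussed here. The grid-stride loop lets a single launch configuration cover arrays of any size while keeping memory accesses coalesced, one of the memory-optimization patterns mentioned above. All names (`saxpy`, `block`, `grid`) are illustrative.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative sketch: grid-stride SAXPY (y = a*x + y).
// Consecutive threads touch consecutive elements, so global memory
// accesses are coalesced.
__global__ void saxpy(int n, float a, const float* __restrict__ x,
                      float* __restrict__ y) {
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        y[i] = a * x[i] + y[i];
    }
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));  // unified memory for brevity
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    const int block = 256;                     // threads per block
    const int grid  = (n + block - 1) / block; // enough blocks to cover n
    saxpy<<<grid, block>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);               // expected: 4.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

Real research kernels build on this skeleton with shared-memory tiling, warp-level primitives, and low-precision (quantized) data types to push closer to the hardware's memory and compute limits.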