Audio Captioning
Audio captioning automatically generates natural language descriptions of audio content, bridging the audio and text modalities. Current research focuses on improving caption quality, diversity, and efficiency through advances in model architectures such as diffusion models and transformers, often incorporating large language models for better semantic understanding and for evaluation. The field matters for audio understanding and multimedia applications, and ongoing work addresses data scarcity, the limitations of existing evaluation metrics, and the need for more robust and generalizable models.