Fine Grained
Fine-grained analysis focuses on achieving high precision and detail in various domains, moving beyond coarse-grained classifications. Current research emphasizes developing models capable of handling nuanced distinctions, often employing techniques like multi-modal learning, transformer architectures, and diffusion models to achieve this fine-grained understanding in tasks ranging from image captioning and object detection to legal analysis and speech processing. This detailed level of analysis is crucial for advancing fields like medical diagnosis, legal technology, and scientific discovery, enabling more accurate and insightful interpretations of complex data. The development of robust and efficient fine-grained models is driving progress across numerous scientific and practical applications.
Papers
Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François Germain, Jonathan Le Roux, Shinji Watanabe
Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering
Weizhe Lin, Jinghong Chen, Jingbiao Mei, Alexandru Coca, Bill Byrne
Tactile Estimation of Extrinsic Contact Patch for Stable Placement
Kei Ota, Devesh K. Jha, Krishna Murthy Jatavallabhula, Asako Kanezaki, Joshua B. Tenenbaum
Species196: A One-Million Semi-supervised Dataset for Fine-grained Species Recognition
Wei He, Kai Han, Ying Nie, Chengcheng Wang, Yunhe Wang
LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch
Pucheng Zhai, Kailing Guo, Fang Liu, Xiaofen Xing, Xiangmin Xu
IEBins: Iterative Elastic Bins for Monocular Depth Estimation
Shuwei Shao, Zhongcai Pei, Xingming Wu, Zhong Liu, Weihai Chen, Zhengguo Li
PARTICLE: Part Discovery and Contrastive Learning for Fine-grained Recognition
Oindrila Saha, Subhransu Maji
SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs
Guangyao Zhai, Xiaoni Cai, Dianye Huang, Yan Di, Fabian Manhardt, Federico Tombari, Nassir Navab, Benjamin Busam
Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task
Ricardo Rei, Nuno M. Guerreiro, José Pombal, Daan van Stigt, Marcos Treviso, Luisa Coheur, José G. C. de Souza, André F. T. Martins
InstructERC: Reforming Emotion Recognition in Conversation with a Retrieval Multi-task LLMs Framework
Shanglin Lei, Guanting Dong, Xiaoping Wang, Keheng Wang, Sirui Wang
FGFusion: Fine-Grained Lidar-Camera Fusion for 3D Object Detection
Zixuan Yin, Han Sun, Ningzhong Liu, Huiyu Zhou, Jiaquan Shen