Fine Grained
Fine-grained analysis focuses on achieving high precision and detail in various domains, moving beyond coarse-grained classifications. Current research emphasizes developing models capable of handling nuanced distinctions, often employing techniques like multi-modal learning, transformer architectures, and diffusion models to achieve this fine-grained understanding in tasks ranging from image captioning and object detection to legal analysis and speech processing. This detailed level of analysis is crucial for advancing fields like medical diagnosis, legal technology, and scientific discovery, enabling more accurate and insightful interpretations of complex data. The development of robust and efficient fine-grained models is driving progress across numerous scientific and practical applications.
Papers
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
Hongyu Li, Jinyu Chen, Ziyu Wei, Shaofei Huang, Tianrui Hui, Jialin Gao, Xiaoming Wei, Si Liu
Benchmarking Multimodal Models for Fine-Grained Image Analysis: A Comparative Study Across Diverse Visual Features
Evgenii Evstafev
Audio-visual Deepfake Detection With Local Temporal Inconsistencies
Marcella Astrid, Enjie Ghorbel, Djamila Aouada
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding
Liping Yuan, Jiawei Wang, Haomiao Sun, Yuchen Zhang, Yuan Lin
CULTURE3D: Cultural Landmarks and Terrain Dataset for 3D Applications
Xinyi Zheng, Steve Zhang, Weizhe Lin, Aaron Zhang, Walterio W. Mayol-Cuevas, Junxiao Shen
Local Foreground Selection aware Attentive Feature Reconstruction for few-shot fine-grained plant species classification
Aisha Zulfiqar, Ebroul Izquiedro
RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models
Keyan Chen, Jiafan Zhang, Chenyang Liu, Zhengxia Zou, Zhenwei Shi
VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning
Ji Soo Lee, Jongha Kim, Jeehye Na, Jinyoung Park, Hyunwoo J. Kim
Static Segmentation by Tracking: A Frustratingly Label-Efficient Approach to Fine-Grained Segmentation
Zhenyang Feng, Zihe Wang, Saul Ibaven Bueno, Tomasz Frelek, Advikaa Ramesh, Jingyan Bai, Lemeng Wang, Zanming Huang, Jianyang Gu, Jinsu Yoo, Tai-Yu Pan, Arpita Chowdhury, Michelle Ramirez, Elizabeth G. Campolongo, Matthew J. Thompson, Christopher G. Lawrence, Sydne Record, Neil Rosser, Anuj Karpatne, Daniel Rubenstein, Hilmar Lapp, Charles V. Stewart, Tanya Berger-Wolf, Yu Su, Wei-Lun Chao
Hierarchical Divide-and-Conquer for Fine-Grained Alignment in LLM-Based Medical Evaluation
Shunfan Zheng, Xiechi Zhang, Gerard de Melo, Xiaoling Wang, Linlin Wang
Natural Language Supervision for Low-light Image Enhancement
Jiahui Tang, Kaihua Zhou, Zhijian Luo, Yueen Hou
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs
Shan Zhang, Aotian Chen, Yanpeng Sun, Jindong Gu, Yi-Yu Zheng, Piotr Koniusz, Kai Zou, Anton van den Hengel, Yuan Xue