Paper ID: 2408.01897
CAF-YOLO: A Robust Framework for Multi-Scale Lesion Detection in Biomedical Imagery
Zilin Chen, Shengnan Lu
Object detection is of paramount importance in biomedical image analysis, particularly for lesion identification. While current methodologies are proficient in identifying and pinpointing lesions, they often lack the precision needed to detect minute biomedical entities (e.g., abnormal cells, lung nodules smaller than 3 mm), which are critical in blood and lung pathology. To address this challenge, we propose CAF-YOLO, based on the YOLOv8 architecture, a nimble yet robust method for medical object detection that leverages the strengths of convolutional neural networks (CNNs) and transformers. To overcome the limitation of convolutional kernels, which have a constrained capacity to interact with distant information, we introduce an attention and convolution fusion module (ACFM). This module enhances the modeling of both global and local features, enabling the capture of long-term feature dependencies and spatial autocorrelation. Additionally, to improve the restricted single-scale feature aggregation inherent in feed-forward networks (FFN) within transformer architectures, we design a multi-scale neural network (MSNN). This network improves multi-scale information aggregation by extracting features across diverse scales. Experimental evaluations on widely used datasets, such as BCCD and LUNA16, validate the rationale and efficacy of CAF-YOLO. This methodology excels in detecting and precisely locating diverse and intricate micro-lesions within biomedical imagery. Our codes are available at https://github.com/xiaochen925/CAF-YOLO.
Submitted: Aug 4, 2024