Paper ID: 2407.17181

Trans2Unet: Neural fusion for Nuclei Semantic Segmentation

Dinh-Phu Tran, Quoc-Anh Nguyen, Van-Truong Pham, Thi-Thao Tran

Nuclei segmentation, despite its fundamental role in histopathological image analysis, remains a challenging task. The main difficulty is the existence of overlapping regions, which makes separating independent nuclei more complicated. In this paper, we propose a new two-branch architecture, namely Trans2Unet, which combines the Unet and TransUnet networks for the nuclei segmentation task. In the proposed architecture, the input image is first fed into the Unet branch, whose last convolution layer is removed. This branch lets the network combine features from different spatial regions of the input image and localize the regions of interest more precisely. The input image is also fed into the second branch, called the TransUnet branch, where it is divided into image patches. With a Vision Transformer (ViT) in its architecture, TransUnet can serve as a powerful encoder for medical image segmentation tasks and enhance image details by recovering localized spatial information. To boost the efficiency and performance of Trans2Unet, we propose infusing TransUnet with a computationally efficient variation of the "Waterfall" Atrous Spatial Pooling (WASP) module, called WASP with Skip Connection (WASP-KC). Experimental results on the 2018 Data Science Bowl benchmark demonstrate the effectiveness and performance of the proposed architecture in comparison with previous segmentation models.
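To make the "waterfall" idea behind the WASP-KC module concrete, below is a minimal PyTorch-style sketch of a waterfall atrous pooling block with skip connections. It is an illustration only, assuming the dilation rates (6, 12, 18, 24) of the original WASP paper; the channel widths, the exact placement of the skip connections, and the module name WASPKC are assumptions, not the paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class WASPKC(nn.Module):
    """Sketch of a waterfall atrous pooling block with skip connections."""
    def __init__(self, in_ch, mid_ch=256, rates=(6, 12, 18, 24)):
        super().__init__()
        # Cascaded atrous branches: each branch consumes the previous
        # branch's output ("waterfall" flow) instead of the raw input.
        self.branches = nn.ModuleList()
        ch = in_ch
        for r in rates:
            self.branches.append(nn.Sequential(
                nn.Conv2d(ch, mid_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(mid_ch),
                nn.ReLU(inplace=True),
            ))
            ch = mid_ch
        # Global-context branch, as in ASPP/WASP.
        self.pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.ReLU(inplace=True),
        )
        # Fuse the concatenated branch outputs back to one feature map.
        self.project = nn.Sequential(
            nn.Conv2d(mid_ch * (len(rates) + 1), mid_ch, 1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        outs, h = [], x
        for branch in self.branches:
            h = branch(h)   # waterfall: feed the previous output forward
            outs.append(h)  # skip connection: keep every stage's output
        g = self.pool(x)
        g = F.interpolate(g, size=x.shape[-2:], mode="bilinear",
                          align_corners=False)
        outs.append(g)
        return self.project(torch.cat(outs, dim=1))

# Usage on a hypothetical 512-channel encoder feature map:
# y = WASPKC(512)(torch.randn(1, 512, 32, 32))  # -> (1, 256, 32, 32)

Because each atrous branch reuses the previous branch's output, the block covers a large effective receptive field with fewer parallel computations than a standard ASPP, which is the computational-efficiency argument made for WASP-style designs.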

Submitted: Jul 24, 2024