Paper ID: 2404.16852

A Disease Labeler for Chinese Chest X-Ray Report Generation

Mengwei Wang, Ruixin Yan, Zeyi Hou, Ning Lang, Xiuzhuang Zhou

In medical image analysis, the scarcity of Chinese chest X-ray report datasets has hindered the development of Chinese chest X-ray report generation. On one hand, constructing such a dataset is limited by the time-consuming and costly process of accurate expert disease annotation. On the other hand, generated reports are commonly evaluated with a single natural language generation metric that measures their similarity to ground-truth reports, whereas assessing their clinical accuracy and effectiveness requires an accurate disease labeler (classifier). To address these issues, this study proposes a disease labeler tailored to Chinese chest X-ray report generation. The labeler uses a dual BERT architecture to process diagnostic reports and clinical information separately, and it introduces a hierarchical label learning algorithm based on the affiliation between diseases and body parts to improve text classification performance. Using this disease labeler, a Chinese chest X-ray report dataset comprising 51,262 report samples was established. Finally, experiments and analyses on a subset of expert-annotated Chinese chest X-ray reports validate the effectiveness of the proposed disease labeler.
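The affiliation between diseases and body parts mentioned above can be illustrated with a minimal sketch. This is not the authors' algorithm: the disease names, the `AFFILIATION` mapping, and both helper functions are hypothetical and only show one plausible way a two-level label hierarchy could be derived and checked.

```python
from typing import Dict, List

# Hypothetical disease -> body-part affiliation (illustrative only,
# not the label set used in the paper).
AFFILIATION: Dict[str, str] = {
    "pneumonia": "lung",
    "atelectasis": "lung",
    "pleural_effusion": "pleura",
    "cardiomegaly": "heart",
}


def body_part_labels(disease_labels: Dict[str, int]) -> Dict[str, int]:
    """Derive coarse body-part labels from fine disease labels.

    A body part is marked 1 if at least one affiliated disease is present.
    """
    parts = {part: 0 for part in AFFILIATION.values()}
    for disease, present in disease_labels.items():
        if present:
            parts[AFFILIATION[disease]] = 1
    return parts


def hierarchy_violations(
    disease_probs: Dict[str, float],
    part_probs: Dict[str, float],
    threshold: float = 0.5,
) -> List[str]:
    """List diseases predicted present whose body part is predicted normal."""
    return [
        disease
        for disease, prob in disease_probs.items()
        if prob >= threshold and part_probs[AFFILIATION[disease]] < threshold
    ]
```

Under these assumptions, a classifier could be trained on both label levels and penalized (or post-corrected) whenever `hierarchy_violations` returns a non-empty list, which is one way a parent-child constraint might help text classification.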

Submitted: Mar 18, 2024