Paper ID: 2305.10116

Can Deep Learning Reliably Recognize Abnormality Patterns on Chest X-rays? A Multi-Reader Study Examining One Month of AI Implementation in Everyday Radiology Clinical Practice

Daniel Kvak, Anna Chromcová, Petra Ovesná, Jakub Dandár, Marek Biroš, Robert Hrubý, Daniel Dufek, Marija Pajdaković

In this study, we developed a deep-learning-based automatic detection algorithm (DLAD, Carebot AI CXR) to detect and localize seven specific radiological findings (atelectasis (ATE), consolidation (CON), pleural effusion (EFF), pulmonary lesion (LES), subcutaneous emphysema (SCE), cardiomegaly (CMG), pneumothorax (PNO)) on chest X-rays (CXR). We collected 956 CXRs and compared the performance of the DLAD with that of six individual radiologists who assessed the images in a hospital setting. The proposed DLAD achieved high sensitivity (ATE 1.000 (0.624-1.000), CON 0.864 (0.671-0.956), EFF 0.953 (0.887-0.983), LES 0.905 (0.715-0.978), SCE 1.000 (0.366-1.000), CMG 0.837 (0.711-0.917), PNO 0.875 (0.538-0.986)), even when compared to the radiologists (LOWEST: ATE 0.000 (0.000-0.376), CON 0.182 (0.070-0.382), EFF 0.400 (0.302-0.506), LES 0.238 (0.103-0.448), SCE 0.000 (0.000-0.634), CMG 0.347 (0.228-0.486), PNO 0.375 (0.134-0.691), HIGHEST: ATE 1.000 (0.624-1.000), CON 0.864 (0.671-0.956), EFF 0.953 (0.887-0.983), LES 0.667 (0.456-0.830), SCE 1.000 (0.366-1.000), CMG 0.980 (0.896-0.999), PNO 0.875 (0.538-0.986)). The findings of the study demonstrate that the suggested DLAD holds potential for integration into everyday clinical practice as a decision support system, effectively mitigating the false negative rate associated with junior and intermediate radiologists.

Submitted: May 17, 2023