  • 2026-01-15

Comparison of AI Tool Performance Against Human Radiologists in 2D Mammography: A Look into Key Error Categories

Objectives:

To evaluate and compare the performance of an AI tool against human radiologists in detecting cancer on 2D mammography, focusing on key error categories such as false positives (FP) and false negatives (FN) using a curated dataset.

Materials and Methods:
A curated dataset was used to offset two limitations of real-world data: the under-representation of key error cases and the bias toward benign cases. Two groups, Human Reader and AI Reader, were evaluated against a predefined ground truth: “benign” (based on histology, or no cancer detected for more than 2 years) and “malignant” (confirmed by histology). The dataset consisted of 240 mammography cases, 120 benign and 120 malignant, selected so that the human readers' results, judged against ground truth, contained equal proportions of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
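The error categories used throughout follow from comparing a reader's call with the ground truth. The sketch below shows that mapping; the function name `outcome` and its boolean arguments are illustrative assumptions, not part of the study's tooling.

```python
def outcome(truth_malignant: bool, called_positive: bool) -> str:
    """Classify one reader call against ground truth (hypothetical helper)."""
    if called_positive:
        # A positive call is right only when the case is truly malignant.
        return "TP" if truth_malignant else "FP"
    # A negative call is right only when the case is truly benign.
    return "FN" if truth_malignant else "TN"


# Example: a malignant case that the human reader missed and the AI flagged.
human = outcome(truth_malignant=True, called_positive=False)
ai = outcome(truth_malignant=True, called_positive=True)
print(human, ai)
```

Applying the same function to the Human Reader and the AI Reader for each case gives the pair of labels used to separate concordant from discordant cases.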

Results:
The human radiologists and the AI tool agreed in 60% of cases (144 concordant cases); all percentages are of the 240 cases. Both were correct in 45% of cases: 22.9% (55) TP and 22.1% (53) TN. Both were incorrect in 15% of cases: 7.9% (19) FN and 7.1% (17) FP. Among discordant cases, 17.1% (41) were malignant cases missed by human radiologists but identified by the AI, and 17.9% (43) were benign cases that human readers overcalled as positive while the AI correctly classified them as benign. Conversely, the AI missed malignant cases that human readers detected (2.1%) and misclassified as malignant benign cases that human readers called correctly (2.9%). Overall, the AI tool outperformed humans in 35% of cases, while humans performed better in only 5%.
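The reported figures can be cross-checked by laying them out as a table of ground truth, human call, and AI call. The sketch below is a reader-side consistency check, not the authors' analysis. Six of the counts are taken from the text; the last two (5 and 7) are not stated and are inferred here from the reported 2.1% and 2.9% of 240 cases, so treat them as assumptions.

```python
# Case counts keyed by (ground truth, human call, AI call).
COUNTS = {
    ("malignant", "positive", "positive"): 55,  # both TP
    ("benign", "negative", "negative"): 53,     # both TN
    ("malignant", "negative", "negative"): 19,  # both FN
    ("benign", "positive", "positive"): 17,     # both FP
    ("malignant", "negative", "positive"): 41,  # human FN, AI TP
    ("benign", "positive", "negative"): 43,     # human FP, AI TN
    ("malignant", "positive", "negative"): 5,   # human TP, AI FN (inferred)
    ("benign", "negative", "positive"): 7,      # human TN, AI FP (inferred)
}


def count(where):
    """Sum counts over every cell for which where(truth, human, ai) holds."""
    return sum(n for (t, h, a), n in COUNTS.items() if where(t, h, a))


def is_correct(truth, call):
    """A call is correct when 'positive' coincides with 'malignant'."""
    return (call == "positive") == (truth == "malignant")


n_cases = count(lambda t, h, a: True)
concordant = count(lambda t, h, a: h == a)
ai_better = count(lambda t, h, a: is_correct(t, a) and not is_correct(t, h))
human_better = count(lambda t, h, a: is_correct(t, h) and not is_correct(t, a))


def sens_spec(reader):
    """Return (sensitivity, specificity) for reader 'human' or 'ai'."""
    pick = (lambda h, a: h) if reader == "human" else (lambda h, a: a)
    tp = count(lambda t, h, a: t == "malignant" and pick(h, a) == "positive")
    tn = count(lambda t, h, a: t == "benign" and pick(h, a) == "negative")
    n_malignant = count(lambda t, h, a: t == "malignant")
    n_benign = count(lambda t, h, a: t == "benign")
    return tp / n_malignant, tn / n_benign


print(f"cases={n_cases}, concordant={concordant / n_cases:.0%}")
print(f"AI better={ai_better / n_cases:.0%}, human better={human_better / n_cases:.0%}")
print("human (sens, spec):", sens_spec("human"))
print("AI    (sens, spec):", sens_spec("ai"))
```

With the inferred counts, the cells sum to 240 cases and reproduce the reported 60%, 35%, and 5% splits. The human figures come out at 50% sensitivity and 50% specificity, which reflects the balanced dataset design rather than real-world reader performance.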

Conclusion:
The AI tool performs as well as or better than human radiologists in key error categories, including false positives and false negatives, demonstrating its potential for supporting mammography diagnosis.
