How to Lie with Statistics: Things To Keep in Mind While Evaluating a Deep Learning Claim
2019-08-14
TEACHING POINTS
In today's age of deep learning and artificial intelligence, a radiologist must know what to watch out for when evaluating a deep learning algorithm's claims
What is the ground truth, and how was it established?
Specific points to keep in mind while evaluating:
What is the medium of communication? Is it a video, a pre-print or a reputed peer-reviewed journal article?
What is the performance metric? Accuracy alone is a poor metric, especially on imbalanced datasets.
What data was the algorithm developed on? Generally, algorithms developed on poor ground truth have poor performance
What data was the algorithm validated on? Algorithms validated on data from the same institution that supplied the training data tend to show inflated performance
How much data was it tested on? Test data should not only be independent, but also adequate, both in number and disease heterogeneity.
What are the implications of the algorithm failing - for example, what if a chest X-ray algorithm misses a critical finding?
Try to get access to the actual algorithm and run it in your department
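The warning about accuracy can be made concrete with a minimal numeric sketch. The counts below are invented for illustration (not from the exhibit): a test set with heavy class imbalance, and a hypothetical algorithm that labels every study "normal".

```python
# Hypothetical counts for illustration: a test set of 1,000 chest X-rays,
# 950 normal and 50 abnormal. The "algorithm" predicts normal for every study.
tp, fn = 0, 50    # all 50 abnormal studies are missed
tn, fp = 950, 0   # all 950 normal studies are labelled correctly

accuracy = (tp + tn) / (tp + tn + fp + fn)   # looks impressive
sensitivity = tp / (tp + fn)                 # clinically useless
specificity = tn / (tn + fp)

print(f"accuracy={accuracy:.2f}, sensitivity={sensitivity:.2f}, "
      f"specificity={specificity:.2f}")
```

Here accuracy is 0.95 even though the algorithm detects no disease at all, which is why sensitivity, specificity, and ROC-based metrics matter more than headline accuracy.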
TABLE OF CONTENTS/OUTLINE
Why should a radiologist know how to evaluate a deep learning algorithm?
Performance metrics for evaluating algorithms
Data - training and testing
When an algorithm fails - implications
Run AI in your department!