Speeding up pediatric AI CXR Validation at DASA, Brazil

About the Customer

DASA, headquartered in São Paulo, Brazil, is the fourth-largest diagnostics company in the world and the largest in Brazil. It is unique in being both the largest in-vitro diagnostics and the largest radiology diagnostics company in Brazil. DASA invests heavily in cutting-edge technologies and has research collaborations with institutions such as Harvard and Stanford Universities.

The Pain Point

Being one of the world’s largest medical imaging service providers, and given the innovation DNA of the organisation, radiologists at DASA are constantly evaluating newer tools to help them in their day-to-day clinical practice. Many of these tools are Artificial Intelligence (AI) based products which automate parts of the radiology workflow.

One such tool is QXR from Qure.ai (India). Qure.ai is one of the world’s leading medical imaging AI companies, building algorithms that automate the reporting of Chest X-rays and Head CT scans. QXR is an AI system that automatically reports Chest X-rays and discerns normal from abnormal. In early 2020, Qure.ai developed a new version of QXR that could perform normal-vs-abnormal classification for pediatric Chest X-rays, something that has long been a challenge for AI in general.

How does DASA evaluate Qure.ai’s pediatric X-ray algorithm in a simple yet statistically and clinically meaningful manner, without allocating significant resources or budget?

CARPL Impact

DASA runs the radiology department at the Sabara Children’s Hospital in São Paulo, Brazil, one of the leading pediatric hospitals in the country and one of the few with a dedicated pediatric radiology department. Dr Marcelo Straus Takahashi, a radiologist at Sabara Children’s Hospital, used CARPL to test Qure.ai’s pediatric X-ray algorithm on ~3,300 Chest X-rays for its ability to discern normal from abnormal.

CARPL makes this process very easy:

  • Load the X-rays onto CARPL using CARPL’s front-end
  • Run Qure.ai’s algorithm on those X-rays (in this case, to speed up the process, the X-rays were sent to Qure.ai’s API). Qure.ai’s algorithm returned one of three outputs:
    • Normal
    • Abnormal
    • To be read (where the AI is ‘not sure’)
  • Load the normal-vs-abnormal ground-truth status of each X-ray onto CARPL (a simple CSV upload; a sketch of such a file follows below)
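For illustration, here is a minimal sketch of what such a ground-truth file might look like. The column names, study UIDs, and label values are assumptions made for this example, not CARPL’s actual upload schema.

```python
import pandas as pd

# Hypothetical ground-truth table, one row per study.
# Column names, UIDs, and label values are illustrative assumptions,
# not CARPL's actual CSV schema.
ground_truth = pd.DataFrame(
    {
        "study_uid": ["1.2.840.0001", "1.2.840.0002", "1.2.840.0003"],
        "label": ["normal", "abnormal", "normal"],
    }
)
ground_truth.to_csv("pediatric_cxr_ground_truth.csv", index=False)
```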

On CARPL, the following analysis was obtained for the cases labeled normal or abnormal by the AI:

An astoundingly high AUC of 0.98 was obtained, and at a threshold of 50 there was only one false negative and 50 false positives. Note that the false positives would have been read by a radiologist anyway, so they introduce no additional error.
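For readers who want to see the underlying arithmetic (CARPL computes these metrics automatically), a minimal sketch using scikit-learn is shown below. The labels and scores are synthetic placeholders, and the threshold of 50 assumes the AI emits abnormality scores on a 0–100 scale.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Synthetic placeholders: 1 = abnormal, 0 = normal ground truth,
# with AI abnormality scores assumed to lie on a 0-100 scale.
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
ai_scores = np.array([12.0, 55.0, 88.0, 91.0, 40.0, 72.0, 47.0, 9.0])

auc = roc_auc_score(y_true, ai_scores)   # area under the ROC curve
y_pred = (ai_scores >= 50).astype(int)   # binarise at a threshold of 50
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"AUC: {auc:.2f} | false negatives: {fn} | false positives: {fp}")
```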

Upon digging deeper into the one false negative case, it was noted that the finding was clinically insignificant, and hence the AI call was, in effect, correct.

Outcome

CARPL allowed Dr Takahashi to quickly and effectively test Qure.ai’s QXR solution on a pediatric population without requiring additional assistance from a statistics or data science team. Additionally, he was able to deep-dive into the false positive and false negative cases to see whether the errors were true errors, and to take a more informed decision on the performance of the algorithm. From Qure.ai’s point of view, they were able to run a retrospective validation study of a new version of their algorithm, in a truly independent test setting, without any effort whatsoever.

This work was presented at RSNA 2020 by Dr Takahashi as an example of successful validation of an AI algorithm on pediatric Chest X-rays.

Real-Time Validation of AI in Production

Note: The images used below are for representational purposes only

About the Customer

Qure.ai is one of the world’s leading developers of Artificial Intelligence solutions for medical imaging and radiology applications. Pioneers in the field, they are amongst the most published AI research groups in the world, with more than 30 publications and presentations in leading journals and conferences, including the first paper on AI published in the prestigious journal The Lancet.

The Pain Point

With tens of thousands of Chest X-rays passing through Qure.ai’s algorithms every day, it is critical for the Qure.ai data science leadership to know the real-time performance of their algorithms across their user base. It is well known that the performance of AI can vary dramatically with patient ethnicity and equipment vendor characteristics; as an AI developer’s user base scales, the likelihood of an error creeping through the system increases. The challenge is to orchestrate a mechanism in which randomly picked Chest X-rays are double-read by a team of radiologists, the labels established during these reads are compared against the AI outputs, and a dashboard presents real-time performance metrics (Area Under the Curve, Sensitivity, Specificity, etc.) with the ability to deep-dive into the false positives and false negatives.
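A minimal sketch of the comparison step at the heart of such a mechanism is shown below. The sampling rate, the study stream, and the double-read stand-in are all synthetic assumptions for illustration, not Qure.ai’s or CARPL’s actual logic.

```python
import random

from sklearn.metrics import roc_auc_score

random.seed(0)

# Synthetic study stream: (study_id, AI abnormality score in [0, 1]).
studies = [(f"study-{i}", random.random()) for i in range(1000)]

SAMPLE_RATE = 0.05  # assumed fraction of studies routed for double-reads

def radiologist_double_read(study_id: str) -> int:
    """Stand-in for the radiologist double-read; 1 = abnormal, 0 = normal."""
    return random.randint(0, 1)  # synthetic label, for illustration only

# Randomly pick studies, collect ground-truth labels, compare to AI scores.
sampled = [(sid, score) for sid, score in studies if random.random() < SAMPLE_RATE]
labels = [radiologist_double_read(sid) for sid, _ in sampled]
scores = [score for _, score in sampled]

print(f"{len(sampled)} studies double-read; "
      f"AUC vs radiologists: {roc_auc_score(labels, scores):.2f}")
```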

How does the leadership team at Qure.ai create such a system without investing significant engineering effort?

CARPL Impact

CARPL’s Real-Time AI validation workflow allows AI developers to monitor the performance of their algorithms in real-time. Reshma Suresh, the Chief Operating Officer at Qure.ai, uses CARPL to collect real-time ground-truth inputs from radiologists and compare the radiologist reads to the AI outputs, creating a real-time performance dashboard for QXR, Qure.ai’s Chest X-ray AI algorithm.

CARPL makes the process very easy:

  • Create a Dataset on CARPL
  • Create a “Classification” Testing project on CARPL → choose the appropriate algorithm, i.e. QXR, and the Dataset created above
  • Create an Annotation Template on CARPL → define the fields that need to be annotated
  • Create an Annotation Project on CARPL → select the Dataset and the Annotation Template created above
    • Link the Annotation Project to the Testing project
    • Select “Auto-Assign” to assign cases to radiologist(s)
  • Now, as data keeps getting added to the Dataset, either manually or through CARPL’s APIs (sketched below), it is automatically inferred on by the AI and assigned to the radiologist(s); as the radiologist(s) read the scans, the real-time dashboard keeps getting populated!
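As an illustration of the API path mentioned above, here is a sketch of pushing one study into a dataset over HTTP. The endpoint URL, payload shape, and authentication scheme are hypothetical placeholders invented for this example; CARPL’s actual API will differ.

```python
import requests

# Hypothetical endpoint and auth, for illustration only;
# these are NOT CARPL's actual API details.
CARPL_URL = "https://carpl.example.internal/api/datasets/{dataset_id}/studies"
API_TOKEN = "replace-with-real-token"

def push_study(dataset_id: str, dicom_path: str) -> None:
    """Upload one DICOM study so it is inferred on and auto-assigned downstream."""
    with open(dicom_path, "rb") as f:
        response = requests.post(
            CARPL_URL.format(dataset_id=dataset_id),
            headers={"Authorization": f"Bearer {API_TOKEN}"},
            files={"file": f},
            timeout=30,
        )
    response.raise_for_status()

push_study("qxr-monitoring", "chest_xray_001.dcm")
```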

CARPL is deployed on Qure.ai’s infrastructure, allowing Qure.ai to retain control of all the data that comes onto CARPL!

Example of a case which is otherwise normal but was wrongly classified by the AI as abnormal, possibly due to poor image quality and a coiled nasogastric tube in the oesophagus

Example of a case where the radiologist identified cardiomegaly

Representational Image of a Real-Time Validation Project on CARPL

Impact

While much has been said about monitoring AI in clinical practice at the hospital level, it is even more important for AI developers themselves to monitor AI in real-time, so that they can detect shifts in model performance and intervene as and when needed. This moves AI monitoring and the consequent improvement from a retrospective, post-facto process to a proactive one. As we build on our vision to make CARPL the singular platform behind all clinically useful and successful AI, working with AI developers to help them establish robust and seamless processes for monitoring AI is key.
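A minimal sketch of what such proactive detection could look like, a rolling-window AUC with an alert floor, is shown below. The window size, alert threshold, and synthetic drift are all illustrative assumptions, not a description of CARPL’s internals.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

WINDOW = 200      # assumed number of recent double-read studies per window
AUC_FLOOR = 0.90  # assumed level below which the developer is alerted

def rolling_auc_alerts(labels: np.ndarray, scores: np.ndarray):
    """Yield (study_index, auc) whenever the rolling AUC drops below the floor."""
    for end in range(WINDOW, len(labels) + 1):
        window_labels = labels[end - WINDOW:end]
        if window_labels.min() == window_labels.max():
            continue  # AUC is undefined when only one class is present
        auc = roc_auc_score(window_labels, scores[end - WINDOW:end])
        if auc < AUC_FLOOR:
            yield end, auc

# Synthetic stream in which performance degrades halfway through.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 1000)
scores = np.where(labels == 1, rng.normal(0.8, 0.2, 1000), rng.normal(0.2, 0.2, 1000))
scores[500:] = rng.random(500)  # simulate drift: scores become uninformative

for index, auc in rolling_auc_alerts(labels, scores):
    print(f"alert at study {index}: rolling AUC fell to {auc:.2f}")
    break  # report only the first alert for brevity
```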


Anonymised chart showing a fall in AUC detected by real-time monitoring of AI by an AI developer. Image for representational purposes only; not reflective of real-world data.

We stay true to our mission of Bringing AI from Bench to Clinic.