Estimating AI-generated Bias in Radiology Reporting by Measuring the Change in the Kellgren-Lawrence Grades of Knee Arthritis Before and After Knowledge of AI Results—A Multi-reader Retrospective Study


PURPOSE:

To estimate the extent of bias generated by AI in radiologists’ grading of osteoarthritis on knee X-rays by observing the change in grading after the radiologists see the predictions of a deep learning algorithm.

METHOD AND MATERIALS:

Anteroposterior views of 271 knee X-rays (542 joints) were randomly extracted from PACS and anonymized. These X-rays were analyzed using DeepKnee, an open-source algorithm based on the Deep Siamese CNN architecture that automatically grades osteoarthritis on knee X-rays using the five-grade Kellgren and Lawrence (KL) system and produces an attention map. The X-rays were independently read by three sub-specialist MSK radiologists on the CARPL AI research platform (CARING Research, India). Each radiologist recorded a KL grade for each X-ray, after which the AI grade was shown and the radiologist was given the option to change the result. Both the pre-AI and post-AI results were recorded. The change in the scores of all three readers was calculated, and the modulus (absolute value) of the change was summarized using the incongruence rate. The shift in consensus before and after knowledge of the AI results was also estimated.
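The reader-level measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual analysis code: it assumes that the incongruence rate is the fraction of joints whose grade changed after the AI result was shown, and that consensus is the number of readers sharing the most common grade. The function names and the toy grades are hypothetical.

```python
from collections import Counter


def incongruence_rate(pre, post):
    """Fraction of joints whose KL grade changed after the AI grade was shown.

    Assumed definition: changed readings divided by all readings of one reader.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    changed = sum(1 for a, b in zip(pre, post) if a != b)
    return changed / len(pre)


def shift_magnitudes(pre, post):
    """Count the changes by the modulus (absolute size) of the grade shift."""
    return Counter(abs(b - a) for a, b in zip(pre, post) if a != b)


def consensus_level(grades):
    """Number of readers sharing the most common grade for one joint.

    With three readers: 3 = three-reader consensus, 2 = two-reader
    consensus, 1 = no consensus.
    """
    return Counter(grades).most_common(1)[0][1]


# Hypothetical toy data for one reader and five joints (not study data).
pre_ai = [0, 2, 3, 1, 4]
post_ai = [0, 3, 3, 2, 4]

print(incongruence_rate(pre_ai, post_ai))
print(shift_magnitudes(pre_ai, post_ai))
print(consensus_level([2, 2, 3]))
```

Applying `consensus_level` to every joint once with the pre-AI grades and once with the post-AI grades would give the two consensus distributions that the study compares.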

RESULTS:

A total of 542 knee joints were analyzed by the algorithm and read by the three radiologists, giving 1,626 “instances”. Readers changed their results in 139 instances (8.5%). The number of shifts was 13, 44, 31, 32, and 19 for grades 0 to 4, respectively. Reader 1 changed the grade in 52 instances (all single-grade shifts), reader 2 in 34 instances (all single-grade shifts), and reader 3 in 53 instances (50 single-grade shifts, 2 two-grade shifts, and 1 three-grade shift). The intra-reader incongruence rates were 9.6%, 6.3%, and 9.8%, respectively. Krippendorff’s alpha among the readers was 0.84 before and 0.87 after knowledge of the AI results, implying minimal convergence towards the AI results. Three-reader, two-reader, and no consensus were found in 219, 296, and 27 cases before and in 248, 279, and 15 cases after knowledge of the AI results (see Figure 1).
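As a quick arithmetic check, the reported percentages can be reproduced from the counts given in this abstract. The sketch below uses only those counts and assumes that each incongruence rate is a reader's number of changes divided by the 542 joints; the variable names are illustrative.

```python
N_JOINTS = 542
N_READERS = 3

# Counts reported in the abstract.
changes_per_reader = {"reader1": 52, "reader2": 34, "reader3": 53}
consensus_before = {"three": 219, "two": 296, "none": 27}
consensus_after = {"three": 248, "two": 279, "none": 15}

instances = N_JOINTS * N_READERS
total_changes = sum(changes_per_reader.values())

# Overall share of instances in which a reader changed the grade.
overall_pct = round(100 * total_changes / instances, 1)

# Assumed definition: per-reader changes divided by joints read.
per_reader_pct = {
    reader: round(100 * n / N_JOINTS, 1)
    for reader, n in changes_per_reader.items()
}

# Share of joints with full three-reader agreement, before and after.
full_before_pct = round(100 * consensus_before["three"] / N_JOINTS, 1)
full_after_pct = round(100 * consensus_after["three"] / N_JOINTS, 1)

print(instances, total_changes, overall_pct)
print(per_reader_pct)
print(full_before_pct, full_after_pct)
```

Both consensus distributions sum to 542 joints, and the share of joints with three-reader agreement rises from about 40% to about 46% after the AI results are shown.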


Figure 1

CONCLUSION:

We demonstrate that there is a tendency of readers to converge towards AI results which, as expected, occurs more often in the ‘middle’ or ‘median’ grades rather than the extremes of grade.

CLINICAL RELEVANCE/APPLICATION:

With an increase in the number and variety of AI applications in radiology, it is important to consider the extent and relevance of the behavior-modifying effect of AI algorithms on radiologists.

Can AI Help Read Pediatric Chest X-rays? An independent Evaluation on 3,000+ Scans


PURPOSE:

To evaluate the performance of a commercially available deep learning-based AI algorithm on pediatric chest X-rays (CXRs).

METHOD AND MATERIALS:

A total of 3,319 frontal (PA and AP) CXRs of patients aged 6 to 18 years were pulled from PACS and anonymised at a tertiary care pediatric hospital in Brazil. Labels (normal, abnormal) were ascertained from the radiology reports. The data was loaded onto the CARPL AI research platform (CARING Research, India) for AI inference and validation-related statistical analysis. The algorithm under test was QXR Version 3.0 (Qure.ai, India). The algorithmic output consisted of three categories: “normal”, “abnormal”, and “to be read”. The “to be read” scans, which the algorithm refers directly to a radiologist, were excluded from the calculation of summary statistics. False negative scans were re-read by a specialized pediatric radiologist with 6 years of experience.

RESULTS:

Of the 3,319 cases, 1,802 were labeled “to be read” and excluded from analysis. On the remaining 1,517 cases, the algorithm achieved a sensitivity of 91% and a specificity of 96%. On review of the 38 false negatives, only 9 had truly missed findings: 7 cases had consolidation, 1 had atelectasis, and 1 had vascular engorgement.
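For readers less familiar with the summary statistics, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts, together with the case arithmetic reported above. The abstract does not give the full confusion matrix, so the counts passed to `sensitivity` and `specificity` are hypothetical and serve only to illustrate the formulas.

```python
def sensitivity(tp, fn):
    """True positive rate: abnormal scans correctly flagged as abnormal."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: normal scans correctly called normal."""
    return tn / (tn + fp)


# Case counts reported in the abstract.
total_cases = 3319
to_be_read = 1802
false_negatives = 38
true_misses = 9

analysed = total_cases - to_be_read
excluded_pct = round(100 * to_be_read / total_cases, 1)
# False negatives that did not hold a truly missed finding on re-read.
not_true_misses = false_negatives - true_misses

print(analysed, excluded_pct, not_true_misses)

# Hypothetical counts, NOT study data, chosen only to show the formulas.
print(sensitivity(tp=90, fn=10))
print(specificity(tn=96, fp=4))
```

Because the “to be read” scans are excluded, the reported sensitivity and specificity describe only the roughly 46% of cases for which the algorithm returned a definite “normal” or “abnormal” output.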


Figure 1


CONCLUSION:

Our independent evaluation provides evidence of AI’s ability to accurately read and triage normal pediatric CXRs, thereby saving significant time and effort on the part of radiologists.

CLINICAL RELEVANCE/APPLICATION:

Most AI algorithms are trained on adult data and hence have poor performance on pediatric cases where lack of trained radiologists is a constant problem, especially in the developing and underdeveloped world.

Initial Clinical Experiences with EPIMIX Sequence in Multiple Brain Pathologies


TEACHING POINTS:

A new multi-contrast echo planar imaging sequence called EPIMIX has been described. In a single 72-75 second acquisition it provides a range of contrasts: T1-FLAIR, T2-weighted, T2-FLAIR, GRE T2*, diffusion, and ADC images. We share our experience in a variety of brain conditions in which we employed EPIMIX in addition to standard-of-care imaging. The best indications for EPIMIX are sick or uncooperative patients who need faster scan acquisition. The sequence runs out of the box, without any modifications, and allows the number of slices to be increased. Inbuilt motion correction (MOCO) helps improve image quality in uncooperative patients. Longer processing times are needed, ranging from 6 to 10 minutes after the scan. The lower signal-to-noise ratio leads to increased image grain and poorer visualization of interfaces between lesions and normal brain parenchyma.



TABLE OF CONTENTS/OUTLINE:


1. Basic physics behind the sequence
2. Contrasts generated from the sequence
3. Pros and Cons of the sequence
4. Clinical experience in different indications: infarcts, neoplasms, headache with normal scans, white matter lesions, and infections such as tuberculomas or cysticercosis.



The presentation can be viewed here: Initial Clinical Experiences with EPIMIX

Assessment of Brain Tissue Microstructure by Diffusion Tensor Distribution MRI: An Initial Survey of Various Pathologies


PURPOSE:

To explore the potential of the novel diffusion tensor distribution (DTD) MRI method for assessment of brain tissue microstructure in terms of nonparametric DTDs and derived parameter maps reporting on cell densities, shapes, orientations, and heterogeneity through a pilot study with single cases of neurocysticercosis, hydrocephalus, stroke, and radiation damage.

METHOD AND MATERIALS:

Four patients were scanned with a <5 min prototype diffusion-weighted (DW) sequence in conjunction with their regular MRI protocol on a GE MR750w 3T. DW images were acquired with spin echo-prepared EPI using TE = 121 ms, TR = 3298 ms, and an in-plane resolution of 3 mm. Diffusion weighting was applied with four b-values up to 2000 s/mm² for 37 isotropic and 43 directional encodings. Raw images were converted to per-voxel DTDs and to metrics including means and (co)variances of tensor “size” (inversely related to cell density), shape, and orientation, as well as signal fractions from elongated cells (bin1, including WM), nearly isotropic cells (bin2, including GM), and free water (bin3, including CSF).
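To make the derived metrics concrete, the sketch below computes weighted means, (co)variances, and bin fractions for one voxel whose DTD is represented as a small set of weighted components. It is a simplified illustration, not the prototype's processing pipeline: the scalar “size” and “shape” values, the thresholds `size_cut` and `shape_cut`, and the example numbers are all hypothetical, and orientation is omitted.

```python
def weighted_mean(values, weights):
    """Mean of a DTD property, weighted by each component's signal weight."""
    total = sum(weights)
    return sum(w * v for v, w in zip(values, weights)) / total


def weighted_cov(x, y, weights):
    """Weighted covariance of two properties (variance when x is y)."""
    mx = weighted_mean(x, weights)
    my = weighted_mean(y, weights)
    total = sum(weights)
    return sum(w * (a - mx) * (b - my) for a, b, w in zip(x, y, weights)) / total


def bin_fractions(sizes, shapes, weights, size_cut=2.0, shape_cut=0.5):
    """Signal fractions for three bins, using hypothetical thresholds.

    bin3: large size (free water); bin1: small size, high anisotropy
    (elongated cells); bin2: small size, low anisotropy (nearly isotropic).
    """
    fractions = {"bin1": 0.0, "bin2": 0.0, "bin3": 0.0}
    total = sum(weights)
    for size, shape, w in zip(sizes, shapes, weights):
        if size >= size_cut:
            fractions["bin3"] += w / total
        elif shape >= shape_cut:
            fractions["bin1"] += w / total
        else:
            fractions["bin2"] += w / total
    return fractions


# Hypothetical voxel with three components (illustrative values only).
sizes = [0.8, 0.9, 3.0]    # isotropic diffusivity, arbitrary units
shapes = [0.9, 0.1, 0.0]   # anisotropy index between 0 and 1
weights = [0.5, 0.3, 0.2]  # signal weights

print(weighted_mean(sizes, weights))
print(weighted_cov(sizes, sizes, weights))   # variance of size
print(weighted_cov(sizes, shapes, weights))  # size-shape covariance
print(bin_fractions(sizes, shapes, weights))
```

Repeating this for every voxel would yield parameter maps of the kind described, such as bin fractions and a size-shape covariance map.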

RESULTS:

Inspection of the parameter maps revealed the following conspicuous features: 1) neurocysticercosis: site of parasite (high bin3_fraction) enclosed by cyst (high bin2_fraction) and edema (high bin2_fraction and bin2_size); 2) radiation: damaged area (high bin1_fraction and bin1_size) surrounded by edema (high bin2_fraction and bin2_size); 3) recurrent tumor: site of removed tumor filled by fluid (high bin3_fraction) lined with a rim of tumor (high bin2_fraction and elevated bin2_size); 4) hydrocephalus: enlarged ventricles rimmed by thin intact WM (high bin1_fraction with bin1_orientation consistent with WM tracts); 5) acute stroke: ischemic tissue (high bin1_fraction, low bin1_size) surrounded by penumbra (high cov_size_shape) (see Figure 1).


Figure 1. Diffusion Tensor Distribution (DTD) parameter maps for a case of acute stroke (arrows).


CONCLUSION:

The custom sequence for DTD can be applied as a minor addition to a clinical MRI protocol and provides novel microstructural parameter maps with conspicuous features for a range of brain pathologies, thereby encouraging studies with larger patient groups and comparison with current gold standards.

CLINICAL RELEVANCE/APPLICATION:

The DTD method may enable detailed characterization of tissue microstructure in a wide range of brain pathologies.