Is AI technology in radiology racist?



In a prepublication article posted on July 21, a group of researchers led by Dr. Judy Gichoya of Emory University in Atlanta shared the results of dozens of experiments, performed at several institutions around the world, that found extremely high accuracy – up to an area under the curve (AUC) of 0.99 – for deep learning models in recognizing a patient’s self-reported racial identity from x-ray images.

The models demonstrated consistently high performance across multiple imaging modalities and anatomical locations – including chest x-rays, mammography, limb x-rays, cervical spine x-rays, and chest CT data – on a variety of patient populations and even on very low-quality images. But the researchers were unable to determine how the models learn to predict racial identity.

Although the study has yet to be peer reviewed, its findings raise troubling concerns. AI’s ability to predict self-reported race is not itself the biggest issue, the researchers say.

“However, our finding that AI can trivially predict self-reported race – even from corrupted, cropped, and noisy medical images – in a setting where clinical experts cannot, creates an enormous risk for all model deployments in medical imaging: if an AI model secretly used its knowledge of self-reported race to misclassify all Black patients, radiologists would not be able to tell using the same data the model has access to,” the authors wrote.

Because this ability is so easily learned, it is also likely present in many image analysis models, “providing a direct vector for the reproduction or exacerbation of the racial disparities that already exist in medical practice,” the researchers said.

Deep-learning training and validation

To assess the ability of deep learning algorithms to discern race from x-ray images, the researchers first developed convolutional neural networks (CNNs) using three large chest x-ray datasets (MIMIC-CXR, CheXpert, and Emory-CXR) with external validation. They also trained detection models on non-chest x-ray images from multiple body locations to determine whether the models’ performance was limited to chest x-rays. In addition, the authors investigated whether deep learning models learn to identify racial identity when trained to perform other tasks, such as detecting pathology and re-identifying patients.

Each of the datasets included Black/African American and white labels; some datasets also included Asian labels. Hispanic/Latino labels were available only in some datasets and were coded heterogeneously; patients with these labels were therefore excluded from the analysis.

Highly accurate predictions

In internal validation, deep learning algorithms for chest x-rays produced AUCs ranging from 0.95 to 0.99. They also achieved AUCs ranging from 0.96 to 0.98 in external validation.
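To make the reported numbers concrete: an AUC of 0.99 means the model ranks a randomly chosen positive case above a randomly chosen negative case 99% of the time. A minimal pairwise (Mann-Whitney) implementation, with made-up toy scores rather than data from the study:

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count pairs where the positive outranks the negative; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # -> 1.0 (perfect separation)
```

A model guessing at random scores 0.5 by this measure, which is why values of 0.95–0.99 indicate near-perfect discrimination.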

Algorithms developed to detect race on other modalities also performed well, yielding external-validation AUCs of 0.87 for chest CT studies, 0.91 for limb radiographs, 0.84 for mammography, and 0.92 for cervical spine x-rays.

“All the radiologists I have spoken to about these results are absolutely flabbergasted, because for all our expertise, none of us would have believed in a million years that x-rays and CT scans contain such strong information about racial identity,” co-author Dr. Luke Oakden-Rayner of the University of Adelaide in Australia wrote in a blog post.

In an effort to find the underlying mechanisms or image features the algorithms use to predict racial identity, the researchers investigated several hypotheses, including differences in physical characteristics, disease distribution, site- or tissue-specific differences, and phenotypic or anatomical differences arising from the cumulative effects of societal prejudice and environmental stress. However, each of these confounding factors on its own had very poor predictive performance.

Surprising results

The study’s results are both surprising and troubling for efforts to ensure patient safety, fairness, and the generalizability of AI algorithms in radiology, Oakden-Rayner said.

“The features the AI uses appear to occur across the entire spectrum of the image and are not localized to any region, which severely limits our ability to prevent AI systems from doing this,” he wrote.

Incredibly, the model was able to predict racial identity from an image whose low-frequency information had been suppressed to the point that a human could not even tell the image was still an x-ray, he said.
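The low-frequency suppression experiment described above can be sketched as a Fourier high-pass filter: transform the image, zero out frequencies below a cutoff (including the overall brightness), and transform back so only fine detail survives. The image below is random noise standing in for an x-ray, and the cutoff is an illustrative value, not the one used in the study.

```python
import numpy as np

def high_pass(image, cutoff):
    """Suppress spatial frequencies within `cutoff` pixels of DC."""
    f = np.fft.fftshift(np.fft.fft2(image))  # DC moved to the center
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    f[dist < cutoff] = 0  # zero out the low-frequency disk
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

rng = np.random.default_rng(0)
img = rng.random((64, 64))
filtered = high_pass(img, cutoff=10)
# The DC component (image mean) is removed along with other low frequencies,
# leaving an image of fine texture with no recognizable gross anatomy.
print(filtered.shape)
```

Raising the cutoff discards progressively more structure; the striking finding is that race prediction survived even aggressive versions of this filtering.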

In other results, the experiments revealed that although the distribution of diseases across the datasets was essentially non-predictive of racial identity, models trained for other tasks learned to identify patients’ race almost as well as models optimized for that purpose, according to Oakden-Rayner. AI appears to easily learn racial identity information from medical images, even when the task seems unrelated, he noted.

“We can’t isolate how it does it, and we humans can’t recognize when AI is doing it unless we collect demographic information (which is rarely readily available to clinical radiologists),” he wrote.

This is bad enough, but even worse, many algorithms already on the market for chest x-ray and chest CT images were trained on the same datasets used in this research, he noted.

“Our results indicate that future AI work in medical imaging should focus on explicit performance audits of models based on racial identity, gender, and age, and that imaging datasets should include the self-reported race of patients where possible to allow further investigation and research into the human-hidden but model-decipherable information related to racial identity that these images appear to contain,” the authors wrote.

The article has been submitted to a medical journal but has not yet been peer reviewed. In the meantime, the researchers said they hope to receive further comments on improving the study.

Copyright © 2021


