By Mark Watts
Breast Cancer Awareness Month (October) always brings a focus to prevention and detection for breast health. I wanted to discuss the latest findings from literature and conferences and how artificial intelligence (AI) in imaging is fitting into current care delivery.
The potential uses of AI in breast cancer imaging are rapidly expanding, moving beyond simple detection to improving diagnostic accuracy, reducing workloads, reducing unnecessary biopsies, predicting treatment response and guiding treatment decisions. Of course, some of these advances are closer to becoming part of standard care than others. And, while numerous studies have been conducted, they are still largely retrospective, based primarily on existing data sets.
A further challenge in the use of AI is how ready the general public is to accept having a “machine” in charge of their diagnosis. A 2021 survey of 922 Dutch women found that 77.8% opposed standalone AI readings for mammography and 17% opposed even joint radiologist and AI readings. Similarly, a 2020 U.S. Food and Drug Administration (FDA) public workshop on the use of AI in imaging did not support the use of standalone AI in mammography.
Still, perceptions can change rapidly, as can research. The following is a brief review of some of the available literature on AI in digital mammography, digital breast tomosynthesis (DBT), ultrasound, and magnetic resonance imaging (MRI), keeping in mind that the field is changing and what we talk about can quickly become dated.
Digital Mammography
Interest in AI tools in mammography was jump-started in 2016 by the DREAM Challenge, an open, crowdsourced challenge to decrease the rate of false positives in mammography. Ultimately, even the best teams working together did not beat radiologists’ accuracy, but the challenge built momentum for the use of AI in breast cancer screening. Ensuing studies have looked at AI systems both as standalone readers and in conjunction with radiologists.
In 2021, the UK National Screening Committee commissioned a review of AI systems to detect cancer in digital mammograms. In 34 of 36 (94%) studies, AI systems were less accurate than a single radiologist, and all were less accurate than a consensus of two or more radiologists. The systems also lagged on false negatives: although they could effectively screen out normal mammograms, they missed between 0% and 10% of cancers detected by radiologists. Other trials have proved more successful for standalone AI. One trial compared standalone AI with 101 radiologists who interpreted 2,652 exams. In this trial, the AI system proved statistically noninferior, with an average area under the receiver operating characteristic curve (AUROC) of 0.840 vs. 0.814 for the radiologists.
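For readers less familiar with AUROC, it can be read as the chance that a randomly chosen cancer case receives a higher suspicion score than a randomly chosen normal case. The following is a minimal sketch of that pairwise definition, not the method used in any of the trials above; the function name `auroc` and the sample scores are hypothetical and chosen only for illustration.

```python
def auroc(scores, labels):
    """Fraction of (cancer, normal) pairs in which the cancer case has the
    higher score. Tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one cancer case and one normal case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical suspicion scores, not taken from any study.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]  # 1 = cancer, 0 = normal
print(round(auroc(scores, labels), 3))
```

In this toy data set, the cancer case outranks the normal case in 8 of the 9 possible pairs, so the AUROC is 8/9. A value of 0.5 would mean the scores are no better than chance, and 1.0 would mean perfect separation.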
Trials looking at the use of AI in conjunction with radiologists may point the way toward the most promising working models, including a triage model or a decision-referral model. In each, the goal is for the AI to help streamline the process, perhaps marking those scans that were straightforwardly negative versus those that were indeterminate, enabling the radiologist to spend more time on the less clear scans.
A retrospective trial compared mammographic examination results from 240 women interpreted by radiologists and by radiologists with AI support. The area under the curve (AUC) was higher in the group using AI support (0.90 vs 0.87 unaided); furthermore, sensitivity increased with AI support (86% vs. 83% without; P=0.046), and the reading time for both groups was similar. A large German study sought to explore a decision-referral approach by examining more than 1 million digital mammography studies. Comparisons of unaided radiologist readings and AI standalone showed that AI standalone was less accurate than the average unaided radiologist. In contrast, a decision-referral approach, in which the radiologist read the scan after the AI system did, improved on radiologists’ sensitivity by 2.6% and specificity by 1%. The AUROC in those cases was 0.982.
In two studies of a triaged workflow, 47% to 60% of readings were triaged, with a 0% to 7% missed cancer rate. One of the trials tested two workstreams: a no-radiologist workstream and an enhanced assessment where the AI software was the final reader. The standalone stream successfully detected cancers in the lowest 60% threshold, while using the AI software as the final reader increased cancer detection for the challenging cases. The second trial set AI threshold scores to determine whether some exams could be read only by the AI software. When the investigators set the cut-off at 2 (scored between 1 and 9 for likelihood of tumor), the share of cancers missed fell to 1%.
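The trade-off in a threshold-based triage can be sketched in a few lines. This is an illustration under stated assumptions, not the software used in either trial: the function `triage_summary` is hypothetical, the exam list is invented, and treating "cut-off at 2" as "scores of 2 or lower are read by AI only" is my assumption about how the cut-off is applied.

```python
def triage_summary(exams, cutoff=2):
    """exams: list of (ai_score, has_cancer) pairs, with ai_score on a 1-9 scale.
    Assumption: exams scoring at or below the cutoff are read by AI only."""
    if not exams:
        raise ValueError("no exams supplied")
    ai_only = [(score, cancer) for score, cancer in exams if score <= cutoff]
    total_cancers = sum(1 for _, cancer in exams if cancer)
    missed = sum(1 for _, cancer in ai_only if cancer)
    return {
        # Share of the workload the radiologist no longer reads.
        "triaged_fraction": len(ai_only) / len(exams),
        # Share of all cancers that land in the AI-only stream.
        "missed_cancer_fraction": missed / total_cancers if total_cancers else 0.0,
    }


# Invented exams for illustration only.
exams = [(1, False), (1, False), (2, False), (2, True), (3, False),
         (5, False), (7, True), (8, True), (9, True), (9, True)]
summary = triage_summary(exams, cutoff=2)
print(summary)
```

Raising the cut-off removes more exams from the radiologist's worklist but sends more cancers to the AI-only stream, which is why the choice of threshold matters so much in these studies.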
A 2022 meta-analysis of 14 articles covering more than 180,000 cases demonstrated that standalone AI could match or exceed human reader detection performance. In the studies that included triage, up to 91% of normal mammograms could be identified, while 0% to 7% of cancers were missed.
Digital Breast Tomosynthesis
The possibilities for AI in digital breast tomosynthesis (DBT) are very promising, both for accuracy and for decreasing reading times. Here, AI is being investigated for diagnostic purposes as well as screening. In a review of more than 12,000 cases with 24 readers, the concurrent use of AI by radiologists improved accuracy (AUC of 0.85 with AI vs. 0.80 for unaided readers) while reducing reading time by 52.7%. The use of AI by the readers was associated with increased sensitivity (8.4% improvement; P < .01) and specificity (6.9% improvement; P < .01). Much of the promise of AI in DBT lies in its ability to both reduce reading time and improve lesion conspicuity, that is, how well lesions stand out from the adjacent normal breast tissue.
Ultrasound
FDA-approved systems designed to improve accuracy of ultrasound for breast cancer diagnosis are commercially available, and trials have demonstrated benefit from AI systems. A retrospective review of 900 breast lesions had readers read the cases twice, four weeks apart – once with ultrasound only and once with ultrasound plus an AI system. AI-based decision support improved accuracy of lesion assessment, with mean reader AUC for ultrasound alone of 0.83 vs. 0.87 with AI decision support.
A large study out of NYU used an AI system to identify malignant lesions on more than 288,000 exams to evaluate whether the use of AI could reduce false-positive findings. The AI system was able to automatically locate the malignant lesions, a development meant to increase confidence in the findings and to decrease interpretation times. In the study, the AI system achieved a high AUROC of 0.97, and radiologists who used it decreased their false-positive rates by 37.3% and their requested biopsies by 27.8% while maintaining the same level of sensitivity.
Based on results like these, the study authors explored a triage approach in which the AI system would sort cases into three groups: those with a very low probability of malignancy (not read by a radiologist), those with a low level of suspicion (read by a radiologist) and those with a moderate to high level of suspicion (enhanced assessment). This approach, which still needs to be validated, would allow radiologists to spend more time on suspicious findings.
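A three-way sort of this kind amounts to two probability cut-offs. The sketch below is only an illustration of the idea: the function `route_exam`, the stream names and the cut-off values of 2% and 20% are hypothetical placeholders, not figures from the NYU study, and any real thresholds would need the validation mentioned above.

```python
def route_exam(malignancy_probability, low=0.02, moderate=0.20):
    """Assign an exam to one of three workstreams based on the AI's estimated
    probability of malignancy. Both cut-offs are hypothetical placeholders."""
    if not 0.0 <= malignancy_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if malignancy_probability < low:
        return "ai_only"              # very low probability: no radiologist read
    if malignancy_probability < moderate:
        return "radiologist_read"     # low suspicion: standard read
    return "enhanced_assessment"      # moderate to high suspicion


# Invented probabilities for illustration only.
for p in (0.005, 0.08, 0.65):
    print(p, route_exam(p))
```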
Magnetic Resonance Imaging
Dynamic contrast-enhanced (DCE) MRI has a high sensitivity in detecting breast cancer, but it also leads to increased numbers of unnecessary biopsies. Therefore, one of the big promises for AI in MRI is to reduce the number of unnecessary biopsies. Prediction models from the DENSE trial based on clinical characteristics and MRI findings were found to prevent 45.5% of false-positive recalls and 21.3% of benign biopsies without missing any cancers. Another model used a deep learning system to predict the probability of breast cancer in DCE-MRI exams. This deep learning system had a high standalone performance (equivalent to radiologists) and led to a 20% reduction of unnecessary biopsies in Breast Imaging Reporting and Data System (BI-RADS) 4 lesions.
The search to decrease the number of unnecessary biopsies has also led to AI exploration in patients with BRCA mutations, who have a high rate of benign biopsies. In one retrospective study, a machine learning (ML) model improved diagnostic accuracy from 53.4% with consensus BI-RADS to 81.5%. Other avenues being explored include the use of AI-enhanced MRI to aid in molecular subtyping as well as in identifying phenotypes and tumor gene expressions, and ultimately in the prediction of treatment response and recurrence score.
Conclusion
As with all technology, there is a gap between the initial excitement about what the technology can do and the reality of putting it into practice, and it takes time to sort out its actual capabilities. We are in a space where the research is developing rapidly and programs are commercially available. The FDA maintains a list of AI/ML-enabled devices on its website and updates it periodically to try to keep up. We also need to see prospective studies and data validation, as well as work through legal, ethical and regulatory issues, before AI becomes a full partner in breast cancer imaging.
Mark Watts is an experienced imaging professional who founded an AI company called Zenlike.ai.

