
Stanford University has opened a Center for Human-Centered Artificial Intelligence. I think this is a testament that one of the most important interface issues to be overcome is not between information technology applications; it is the friction surrounding acceptance of the assistance AI can provide.
The deployment of artificial intelligence (AI) and machine learning (ML) in diagnostic radiology has been slowed by issues of effectiveness, trust, economics and regulation. I believe much of this can be mitigated by introducing a mechanism and infrastructure that support labeling of imaging studies at scale, automate cohort creation for algorithm training, provide real-world feedback on algorithms in development, and perform quality assurance and monitoring of deployed algorithms. This can be achieved by using computer-assisted interactive reporting technology while radiologists routinely interpret cases, without burdening them.
At RSNA 2021, hundreds of AI algorithms for diagnostic imaging applications were on display, created by academic researchers and commercial developers, although only a small fraction of these had been cleared by the FDA for marketing. Use of AI in clinical practice remains limited.
AI algorithms are fundamentally different from other medical devices and software in that they can learn. Algorithm development requires ready access to large volumes of accurately labeled training data, preferably with the image locations of pathology and significant findings annotated. For an algorithm to be generalizable and deployable in varied clinical settings, training data must reflect the diversity in the target patient population, equipment and protocols. Nonrepresentative training data will produce brittle, unreliable algorithms. A recent survey by the ACR documents that 9 of 10 radiologists find current AI algorithms to be inconsistent in accuracy.
Building large, expertly tagged, generalizable data sets and keeping them current is expensive. Consequently, most FDA-cleared AI algorithms for diagnostic radiology were trained with limited sets of retrospectively labeled data. They may not perform accurately in clinical practice, where the patient population and imaging protocols can differ from the data with which they were trained.
The performance of deployed algorithms may degrade over time as patient populations evolve, modalities and acquisition protocols are updated, and the prevalence and characteristics of disease change, as exemplified by the coronavirus disease 2019 pandemic. Presently, AI algorithms are “locked” at the time of regulatory clearance; improvements made by retraining with new data must be re-reviewed by regulatory authorities. This review process is manual, cumbersome and lengthy, and it squanders the opportunity provided by tools that can learn.
Considering these limitations, regulatory agencies have signaled that ongoing surveillance and monitoring of algorithms may soon be mandatory to protect patient safety. Algorithm performance and patient safety may be further enhanced through safe, ongoing retraining of algorithms after clinical deployment, sometimes referred to as “continuous learning.”
More frequent algorithm updates can be provided if infrastructure and tools that allow real-time monitoring and feedback on algorithm performance are used. An automated system that documents algorithm retraining and confirms performance on new real-world studies can dramatically accelerate development and enable more frequent regulatory conforming updates, promoting patient safety and engendering trust by radiologists.
The best method of obtaining reliable feedback is review by interpreting radiologists during routine reading in a manner that does not materially slow interpretation. Manual approaches for feedback and monitoring, such as dialog questionnaires, impose on radiologists’ time and cause distraction. The application of natural language processing (NLP) to traditional text-based radiology reports for automatically extracting findings and comparing them with algorithm results is inaccurate, does not correlate findings in reports with locations on images, and is only a partial solution. However, it is possible to enhance the reporting software with features that, in conjunction with NLP, could automate the feedback and monitoring process and provide data for labeling radiology studies at scale, in a manner that can speed up algorithm development and iterative improvements.
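To make the idea of automated feedback concrete, here is a minimal sketch of the comparison step such a monitoring system might perform: findings extracted from the radiologist's report (e.g., via NLP or interactive reporting elements) are checked against the algorithm's output, and discordant cases are flagged for review. All names here (`Finding`, `compare_findings`) are illustrative assumptions, not a real product API.

```python
# Hypothetical sketch: flag cases where the interpreting radiologist's
# report and an AI algorithm's output disagree, so they can feed
# monitoring and retraining. Names are illustrative only.

from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    label: str        # e.g., "pneumothorax"
    present: bool     # asserted present (True) or absent (False)

def compare_findings(report_findings, algorithm_findings):
    """Return the labels on which the report and the algorithm disagree."""
    report = {f.label: f.present for f in report_findings}
    algo = {f.label: f.present for f in algorithm_findings}
    discordant = []
    for label in report.keys() | algo.keys():
        # A label missing from one side is treated as "not asserted present".
        if report.get(label, False) != algo.get(label, False):
            discordant.append(label)
    return sorted(discordant)

# Example: the radiologist reports a pneumothorax the algorithm missed.
report = [Finding("pneumothorax", True), Finding("effusion", False)]
algo = [Finding("effusion", False)]
print(compare_findings(report, algo))  # ['pneumothorax']
```

In a deployed system, each discordant label would be logged with the study identifier, giving developers the real-world performance signal described above without adding work for the radiologist.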
In part two of this article, I will propose steps to enhance traditional text-based radiology reports with multimedia elements and interactive functionality.
The focus is on the radiologist and on adding ambient intelligence to their workflow. To do this, we must design a frictionless information technology platform that provides value to all stakeholders.
Human-centered Artificial Intelligence … I think those leaders at Stanford may be on to something.
Mark A. Watts is the enterprise imaging director at Fountain Hills Medical Center.
Editor’s Note: This is Part I of a two-part series.

