By Mark Watts
My wife called me in a panic: “The mechanic said our car could have blown up!” It was the five-year checkup for our Honda hybrid. It gets 45 miles to the gallon, gets me from point A to point B and starts every time I push the button on the dash. These useful criteria are how I judge a car. I had replaced the battery because Arizona’s extreme heat drains the life out of them. I must have used the “wrong battery,” one that was not a factory solution. Cars are a black box to me. I cannot service one or change the oil. I just want to use it.
Artificial Intelligence has a black box issue too, as does the human brain.
If there’s one thing that the human brain and AI certainly share, it’s opacity. Much of a neural network’s learning ability is poorly understood, and we don’t have a way to interrogate an AI system to figure out how it reached its output. Move 37 in the historic AlphaGo match against Lee Sedol is a case in point: the creators of the algorithm can’t explain how it happened. The same phenomenon comes up in medical AI. One example is the capacity for deep learning to match the diagnostic capacities of a team of 21 board-certified dermatologists in classifying skin lesions as cancerous or benign. The Stanford computer scientists who created that algorithm still don’t know exactly what features account for its success. A third example, also in medicine, comes from Joel Dudley at Mount Sinai’s Icahn Institute. Dudley led his team on a project called Deep Patient to see whether the data from electronic medical records could be used to predict the occurrence of 78 diseases. (I met Joel at a genetics conference in New York and asked for a picture with him. I said, “It is not every day you meet the Henry Ford of genetics.”) When the neural network was used on more than 700,000 Mount Sinai patients, it was able to predict disease onset using unsupervised learning from raw medical record data.
Dudley said something that sums up the AI black box problem: “We can build these models, but we don’t know how they work.” We already accept black boxes in medicine. For example, electroconvulsive therapy is highly effective for severe depression, but we have no idea how it works. Likewise, there are many drugs that seem to work even though no one can explain how. As patients we willingly accept this human type of black box, so long as we feel better or have good outcomes. Should we do the same for AI algorithms? Pedro Domingos would, telling me that he’d prefer an algorithm that’s 99 percent accurate but is a black box over one that gives an explanation but is only 80 percent accurate. But that is not the prevailing view. The AI Now Institute, launched in 2017 at New York University, is dedicated to understanding the social implications of AI. The number one recommendation of its AI Now report was that any “high stakes” matters, such as criminal justice, health care, welfare and education, should not rely on black-box AI.
The AI Now report is not alone. In 2018, the European Union General Data Protection Regulation went into effect, requiring companies to give users an explanation for decisions that automated systems make. That gets to the heart of the problem in medicine. Doctors, hospitals and health systems would be held accountable for decisions that machines might make, even if the algorithms used were rigorously tested and considered fully validated. The EU’s “right to explanation” would, in the case of patients, give them agency to understand critical issues about their health or disease management. Moreover, machines can get sick or be hacked. Just imagine a diabetes algorithm that ingests and processes multilayered data on glucose levels, physical activity, sleep, nutrition and stress levels. Suppose a glitch or a hack causes it to recommend the wrong dose of insulin. If a human made this mistake, it could lead to a hypoglycemic coma or death in one patient. If an AI system made the error, it could injure or kill hundreds or even thousands. Any time a machine produces a decision in medicine, that decision should ideally be clearly defined and explainable. Moreover, extensive simulations are required to probe the vulnerabilities of algorithms to hacking or dysfunction. Transparency about the extent of and results from simulation testing will be important, too, for acceptance by the medical community.
Yet there are many commercialized medical algorithms already being used in practice today, such as for scan interpretation, for which we lack an explanation of how they work. Each scan is supposed to be overread by a radiologist as a checkpoint, providing reassurance. What if a radiologist is rushed, distracted or complacent and skips that oversight, and an adverse patient outcome results? There’s even an initiative called explainable artificial intelligence that seeks to understand why an algorithm reaches the conclusions that it does. Perhaps unsurprisingly, computer scientists have turned to using neural networks to explain how neural networks work. For example, Deep Dream, a Google project, was essentially a reverse deep learning algorithm. Instead of recognizing images, it generated them to determine the key features. It’s a bit funny that AI experts systematically propose using AI to fix all of its liabilities, not unlike the surgeons who say, “When in doubt, cut it out.” There are some examples in medicine of unraveling the algorithmic black box. A 2015 study used machine learning to predict which hospitalized patients with pneumonia were at high risk of serious complications. The algorithm wrongly predicted that asthmatics do better with pneumonia, potentially instructing doctors to send the patients with asthma home. Subsequent efforts to understand the unintelligible aspects of the algorithm led to defining each input variable’s effect and led to a fix.
It’s fair to predict that there will be many more intense efforts to understand the inner workings of AI neural networks. Even though we are used to accepting trade-offs in medicine for net benefit, weighing the therapeutic efficacy and the risks, a machine black box is not one that most will accept. Yet as AI becomes an integral part of medicine, we will soon enough have randomized trials that validate strong benefit of an algorithm over standard of care without knowing why. Our tolerance for machines with black boxes will undoubtedly be put to the test.
I personally am OK with not knowing how the iPhone works inside or how an internal combustion engine works. The black box of AI will gain acceptance as it proves useful and once we know it is safe and tested.
By the way, the right battery cost $275 plus installation. My wife feels safe now.
Mark A. Watts is the director of informatics, technology, artificial intelligence and sales at Medical Technology Management Institute.