How Can We Be Sure Artificial Intelligence Is Safe For Medical Use?

Apr 14, 2019

When Merdis Wells visited the diabetes clinic at the University Medical Center in New Orleans about a year ago, a nurse practitioner checked her eyes to look for signs of diabetic retinopathy, the most common cause of blindness in the United States.

At her next visit, in February of this year, artificial intelligence software made the call.

The clinic had just installed a system that's designed to identify patients who need follow-up attention.

The Food and Drug Administration cleared the system — called IDx-DR — for use in 2018. The agency said it was the first time it had authorized the marketing of a device that makes a screening decision without a clinician having to get involved in the interpretation.

It's a harbinger of things to come. Companies are rapidly developing software to supplement or even replace doctors for certain tasks. And the FDA, accustomed to approving drugs and clearing medical devices, is now figuring out how to make sure computer algorithms are safe and effective.

Wells was one of the first patients at the clinic in early February to be tested with the new device, which can be run by someone without medical training. The system produces a simple report that identifies whether there are signs that a patient's vision is starting to erode.

Wells had no problem with the computer making the call. "I think that's lovely!" she says.

"Do I still get to see the pictures?" Wells asks nurse practitioner Debra Brown. Yes, Brown replies.

"I like seeing me because I want to take care of me, so I want to know as much as possible about me," Wells says.

The 60-year-old resident of nearby Algiers, La., leans into the camera, which has an eyepiece for each eye.

"It's just going to be like a regular picture," Brown explains. "But when we flash, the light will be a little bright."

Once Wells is in position, Brown adjusts the camera.

"Don't blink!" she says. "3-2-1-0!" The camera flashes and captures the image. Three more flashes and the exam is done.

She says she still plans to examine the images herself and backstop the computer's conclusion. That reassures Wells.

The test is quick and easy, which is by design. People with diabetes are supposed to get this screening test every year, but many don't. Brown says the new system could allow the clinic to screen a lot more patients for diabetic retinopathy.

That's the hope of the system's inventor, Michael Abramoff, an ophthalmologist at the University of Iowa who founded the company IDx to bring the technology to market.

"The problem is many people with diabetes only go to an eye-care provider like me when they have symptoms," he says. "And we need to find [retinopathy] before then. So that's why early detection is really important."

Abramoff spent years developing a computer algorithm that could scan retina images and automatically pick up early signs of diabetic retinopathy. And he wanted it to work in clinics, like the one in New Orleans, rather than in ophthalmologists' offices.

Developing the computer algorithm wasn't the hard part.

"It turns out the biggest hurdle, if you care about patient safety, is the FDA," he says.

That hurdle is essential for public safety, but not an easy one for a brand-new technology — especially one that makes a medical call without an expert on hand.

Medical software often gets an easier road to market than drugs do. Software is handled through the generally less rigorous pathway for medical devices. For most devices, the evaluation involves a comparison with something already on the market.

A retinal image shows severe nonproliferative diabetic retinopathy, a vision-threatening form of the disease, characterized by hemorrhages (the darker red spots in the image) across the retina.
Courtesy of IDx

But this technology for detecting diabetic retinopathy was unique, and a patient's vision is potentially on the line.

When Abramoff approached the FDA, "of course they were uncomfortable at first," he says, "and so we started working together on how can we prove that this can be safe."

Abramoff needed to show that the technology was not just safe and effective but that it would work on a very diverse population, since all sorts of people get diabetes. That ultimately meant testing the machine on 900 people at 10 different sites.

"We went into inner cities, we went into southern New Mexico to make sure we captured all those people that needed to be represented," he says.

All the sites were primary care clinics, because the company wanted to demonstrate that the technology would work well without having an ophthalmologist on hand.

That extensive trial satisfied the FDA that the system would be broadly usable and reasonably accurate. IDx-DR surpassed the FDA's requirements: test results that indicated eye disease needed to be correct at least 85 percent of the time, while those finding no significant eye damage needed to be correct at least 82.5 percent of the time.
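Those two thresholds correspond, roughly, to a screening test's sensitivity and specificity. Here is a minimal sketch of how such figures are calculated, using hypothetical counts for illustration rather than data from the IDx-DR trial:

```python
# Illustrative only: these counts are hypothetical, not results from the
# IDx-DR pivotal trial.
true_positives = 170    # patients with retinopathy the software flagged
false_negatives = 30    # patients with retinopathy the software missed
true_negatives = 620    # healthy patients the software correctly cleared
false_positives = 80    # healthy patients the software incorrectly flagged

# Sensitivity: of the patients who truly have eye disease, what share did
# the software flag? The FDA floor described above is 85 percent.
sensitivity = true_positives / (true_positives + false_negatives)

# Specificity: of the patients with no significant eye damage, what share
# did the software correctly clear? The floor described above is 82.5 percent.
specificity = true_negatives / (true_negatives + false_positives)

print(f"sensitivity = {sensitivity:.1%}")   # 85.0% with these made-up counts
print(f"specificity = {specificity:.1%}")   # 88.6% with these made-up counts
```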

"It's better than me, and I'm a very experienced retinal specialist," Abramoff says.

The FDA helped guide the company's software through its regulatory process, which is evolving to accommodate inventions flowing out of artificial intelligence labs.

Bakul Patel, associate director for digital health at the FDA, says that in general, the FDA expects more evidence and assurances for technologies that have a greater potential to cause harm if they fail.

Some software is completely exempt from the FDA process. A simple tweak in a routine piece of software may not require any FDA review at all. The rules get tighter for a change that could substantially alter the performance of an artificial intelligence algorithm.

The agency has years of experience approving software that is part of medical devices, but new algorithms are creating new challenges.

For one thing, the agency needs to be wary of approving an algorithm that's based on a particular set of patients, if it's not clear that it will be effective in different groups. An algorithm to identify skin cancer may be developed primarily on white patients and may not work on patients with darker skin.

And many algorithms, once on the market, will continue to gather data that can be used to improve their performance. Some programs outside of health care already update themselves continually to do just that. That raises questions about how and when updated software needs another round of review.

"We realize that we have to re-imagine how we look at these things, and allow for the changes that go on, especially in this space," Patel says.

To do that, the FDA is testing out a whole new approach to clearing algorithms. The agency is experimenting with a system called precertification that puts more emphasis on examining the process that companies use to develop their products, and less emphasis on examining each new tweak. Continued monitoring is another element of this strategy.

"We're going to take this concept and take it on a test run," Patel says.

A retinal scan is displayed at University Medical Center in New Orleans. The condition the software detects is called diabetic retinopathy.
Richard Harris / NPR

Because many algorithms will likely be in a state of continual evolution, "it's really important when a system is deployed in the real world that we monitor those systems to make sure that they're performing the way we expect," says Christina Silcox, a researcher at the Duke-Margolis Center for Health Policy.

She's enthusiastic about the prospects of AI in medicine, while alert to some of the challenges the FDA will face.

"Right now we might see an update to a medical device every 18 months," she says. "In software you might expect to see one every two weeks or every month."

Seemingly minor software glitches can occasionally have serious unintended consequences. One of the worst cases involved a radiation therapy machine that, in the 1980s, gave huge overdoses of radiation to some patients because of a software bug.

Researchers looking at more recent incidents identified 627 software recalls by the FDA from 2011 through 2015. Those included 12 "high risk" devices such as ventilators and a defibrillator.

Patel certainly doesn't want to see a high-profile failure, because that could set back a promising and rapidly growing industry.

One challenge that's beyond the FDA's scope is figuring out how to resolve conflicting conclusions from rival devices. Genetic tests that are used to guide cancer treatment, for example, already provide conflicting treatment recommendations, says Isaac Kohane, a pediatrician who heads the biomedical informatics department at Harvard Medical School. "Guess what," he says. "The same thing is going to happen with these AI programs."

"We're going to have built-in disagreements and no doctor and no patient will know what is right," he says.

Indeed, IDx isn't the only company interested in using an algorithm to identify early signs of diabetic retinopathy. Among its competitors is Verily, one of Google's sister companies, which is currently deploying its technology in India. (Google is among NPR's financial supporters.)

"Actually I'm quite bullish in the long term," Kohane says, as he looks out on the burgeoning field of AI. "In the short term, it's wild land grab."

He says we need the equivalent of Consumer Reports in this area to help resolve these disagreements and identify superior technologies. He would also like reviews to examine not simply whether a technology performs as expected, but if it's an improvement for patients. "What you really want is to get healthy," he says.

The cost of the camera and setup for the IDx-DR system is around $20,000, a company spokesperson said in an email. There are options to rent or lease-to-own the camera that can reduce the upfront costs.

The list price for each exam is $34, the spokesperson said. But it varies depending on factors including patient volume.

A technically accurate piece of software doesn't automatically lead to better health.

At the diabetes clinic in New Orleans, for example, the system replaced a service that also checked for another cause of blindness, glaucoma.

Nurse practitioner Brown visually scans Wells' images for signs of glaucoma, but that won't happen once the work is handed off to someone who lacks her expertise. Instead, the diabetes clinic staff will refer patients to a separate appointment for that test.

Wells also got something that future patients might not – a review of her retina images, so she could see for herself any suspected issues. That interaction with a health care professional was also an important moment to talk about her diet and what she can do to stay healthy.

Chevelle Parker, another nurse practitioner, points to some silvery lines inside the eye's blood vessels.

"That happens when your sugar levels are high," Parker explains. "It can also be an indication of diabetic retinopathy. So we're going to do a referral and send you on for complete testing."

The software did its intended job. While Wells seemed a bit upset by the news, at least she has found out about this concern early, while there's still time to protect her vision.

You can reach NPR Science Correspondent Richard Harris at rharris@npr.org.

Copyright 2019 NPR. To see more, visit https://www.npr.org.

SACHA PFEIFFER, HOST:

Making a medical diagnosis used to be a doctor's responsibility. But now you might be diagnosed in a different way with artificial intelligence. NPR's Richard Harris looks at how one potentially sight-saving technology got invented and approved and how it's already being used to detect disease.

(SOUNDBITE OF MACHINE BEEPING)

RICHARD HARRIS, BYLINE: The condition that the software detects is called diabetic retinopathy. It's the most common cause of blindness in the United States and an understandable worry to Merdis Wells, who was at the University Medical Center in New Orleans in early February to get her eyes checked.

DEBRA BROWN: You can come on in - straight ahead.

HARRIS: Last time she was here, the clinic used the expert diagnostic skills of nurse practitioners to do this screening.

So have you heard that this is a new machine?

MERDIS WELLS: No, I didn't know that.

HARRIS: A computer does the diagnosis.

WELLS: Oh, really? I didn't know that. Do I get to see the pictures?

BROWN: Yes, you do. You still get to see the pictures.

WELLS: Oh, OK. I love that.

HARRIS: You like seeing the picture.

WELLS: Yeah. I like seeing me (laughter) because I want to take care of me, so I want to know as much as possible about me.

HARRIS: And how do you feel about the computer helping make this diagnosis?

WELLS: I think that's lovely.

HARRIS: Nurse practitioner Debra Brown is also hopeful about the new technology. She says it could allow the clinic to screen a lot more patients for diabetic retinopathy. The device does not require her specialized knowledge of eye disease. She can pass the job off to anyone with a high school education.

BROWN: And we actually get a report that's printed out now immediately. And it tells us whether it's positive or negative. And we respond accordingly.

HARRIS: Merdis Wells is one of the very first patients to give it a go at this clinic. Information for the computer algorithm is gathered in an instrument that takes pictures of her retinas. It has two eye pieces like a fancy microscope. Wells puts her chin on the chin rest.

BROWN: So I need you to look straight ahead. I'm going to take two pictures of each eye. OK? It's just going to be like taking a regular picture. But when we flash, the light would be a little bright. And we want you to look at that green dot.

HARRIS: Brown turns off the room lights and, after a quick adjustment, is ready to proceed.

BROWN: All right. Did you see that dot straight ahead?

WELLS: Uh huh.

BROWN: Keep it still there. Don't blink - three, two, one, zero - good job.

HARRIS: Flash, flash, flash. And in minutes, the test is over. Before we get to the results, it's worth spending a few minutes to understand how this machine, which can replace a human expert, came into being. It's the brainchild of Michael Abramoff, an ophthalmologist at the University of Iowa.

MICHAEL ABRAMOFF: The problem is many people with diabetes only go to an eye care provider like me when they have symptoms. And we need to find it before then. So that's why early detection is really important.

HARRIS: Abramoff spent years developing a computer algorithm that could scan retina images and automatically pick up early signs of diabetic retinopathy.

ABRAMOFF: And I thought, well, if we do this with a computer, it can be faster, maybe better. And especially, it can be done where the patients are.

HARRIS: He founded a company, IDx, to actually get this machine on the market.

ABRAMOFF: It turns out that the biggest hurdle, if you care about patient safety, is the FDA.

HARRIS: That hurdle is essential for public safety but challenging for a brand-new technology, especially one that makes a diagnosis without an expert on hand. Often, medical software gets an easier road to market than, say, a new drug. But this technology was unique. And a patient's vision is potentially on the line.

ABRAMOFF: And, of course, they were uncomfortable at first. And so we started working on how - together - on how we prove that this can be safe.

HARRIS: Abramoff needed to show that this was not just safe and effective, but it would work on a diverse population since all sorts of people get diabetes. That ultimately meant testing the machine out on 900 people at 10 different sites.

ABRAMOFF: We went into inner cities. We went into southern New Mexico to make sure we captured all those people that needed to be represented. But it was always primary care clinics, never an ophthalmology clinic.

HARRIS: That would put it to the test among the nonexperts who could well be running this machine. That extensive trial satisfied the FDA that the test would be suitable anywhere. And it did a reasonable job of picking up early signs of the disease.

ABRAMOFF: It's better than me. And I'm a very experienced retinal specialist.

HARRIS: The FDA approved this system called IDx-DR last April. Bakul Patel heads the FDA office that's shaping the algorithm approval process. Officials are bracing for a flood of applications because AI technology is evolving rapidly. Patel says, in general, the FDA expects more evidence and assurances for technologies that have a greater potential to cause harm if they fail.

BAKUL PATEL: So that's how we think about it. And that's the burden - are sort of the expectation we set on people who are making these products.

HARRIS: A simple tweak in a routine piece of software may not require any FDA review at all. The rules are tighter for a change that could substantially alter the performance of an artificial intelligence algorithm.

PATEL: We expect people to come back to FDA and review that.

HARRIS: It's a brave, new world here. And I'm wondering, how do you feel that you're grappling with this and sort of finding systems that makes sense for the entire spectrum of software that's coming down the pipe?

PATEL: We realize that we have to reimagine how we look at these things and allow for the changes that go on, especially in the space.

HARRIS: In fact, the FDA is testing out a whole, new approach to clearing algorithms. The agency is experimenting with a system called precertification, which puts more emphasis on examining the process that companies use to develop their products, less emphasis on examining each new tweak and more time tracking real-world performance.

PATEL: We are going to take this concept and take it on a test run.

HARRIS: Patel certainly does not want to see a high-profile failure. That could set back a promising and rapidly growing industry. Of course, FDA clearance is focused on the performance of the product itself. It doesn't guarantee medical care will get better. Every new technology has ripple effects. At the diabetes clinic in New Orleans, for example, this system replaced a service that also checked for another cause of blindness: glaucoma. This machine can't test for glaucoma. At least on this visit, Merdis Wells did get an expert to review her retina images with her.

CHEVELLE PARKER: All right. So here's your right eye. OK?

HARRIS: Nurse practitioner Chevelle Parker points to some silvery lines inside the eyes' blood vessels.

PARKER: That happens when your sugar levels are high. And it can also happen from smoking. OK?

WELLS: OK.

PARKER: It could also be an indication of diabetic retinopathy.

WELLS: OK.

PARKER: And so we're going to do a referral and send you on for complete testing.

HARRIS: The software did its intended job. Wells is unsettled by the news and wants to leave as quickly as possible.

WELLS: Thank you, ma'am.

PARKER: No problem at all.

WELLS: Nice meeting y'all.

HARRIS: At least she's found out about this concern early while there's still time to treat it.

Richard Harris, NPR News. Transcript provided by NPR, Copyright NPR.