Ground Truths
John Halamka: How Mayo Clinic is Transforming Healthcare with A.I.

Transcript

Eric Topol (00:00):

This is a really great opportunity to speak to one of the most impressive medical informaticists and leaders in AI in the United States and worldwide: Dr. John Halamka. Just by way of background, John did his baccalaureate at Stanford, was at UCSF/Berkeley for a combined MD-PhD, and trained in emergency medicine at UCLA. He went on to Harvard, where for 20 years he was the Chief Information Officer at Beth Israel Deaconess. And then in 2020 he joined Mayo Clinic to head its platform to help transform Mayo Clinic to be the global leader in digital healthcare. So welcome, John. It's so great to have you. And by the way, I want to mention your recent book came out in April, one of many books you've written, Redefining the Boundaries of Medicine: The High-Tech, High-Touch Path into the Future.

John Halamka (01:00):

Well, I'm thrilled to be with you today, and you and I need to spend more time together, very clearly.

Eric Topol (01:06):

Yeah, I really think so. Because this is the first time we've had a one-on-one conversation. We've been on panels together, but that's not enough. We've got to really do some brainstorming, the two of us. But first I wanted to get into, because you have been on the leading edge of AI and Mayo is doing big things in this space, what are you excited about? Where do you think things are right now?

John Halamka (01:35):

So you and I have been in academic healthcare for decades, and we know there are some brilliant people, well-meaning people, but sometimes the agility to innovate isn't quite there, whether it's a fear of failure or it's the process of getting things approved. So the question of course is, can you build to scale the technology and the processes and change policies so that anyone can do what they want much more rapidly? And so what's been exciting over these last couple of years at Mayo is we started with the data, and we know that anything we do, whether it's predictive or generative, starts with high quality curated data. And so by de-identifying all the multimodal data of Mayo and then working with other partners around the world to create a distributed federated approach for anyone to train anything, suddenly you're empowering a very large number of innovators. And then you've seen what's happened in society. I mean, culturally, people are starting to say, wow, this AI, it could actually reduce burden, it could democratize access to knowledge. I actually think that yes, there need to be guidelines and guardrails, but on the whole, this could be very good. So here we have a perfect storm: the technology, the policy, the cultural change. And therefore these next couple of years are going to be really productive.

Implementing a Mayo Randomized AI Trial

Eric Topol (02:59):

Well, and especially at Mayo. The reason I say that is not only did they recruit you, having had a couple of decades of experience in a Harvard program, but Mayo's depth of patient care is extraordinary. And so that gets me to, for example, the randomized trial you did at Mayo Clinic, which there aren't that many of, by the way, in AI, where you gave the ECG reading power of AI to half the primary care doctors and not to the other half, for determining whether the patients had poor cardiac function, that is, low ejection fraction. And now, as I understand it, having done that randomized trial and published it, you've implemented that throughout the Mayo Clinic system as far as this AI ECG support. Is that true?

John Halamka (03:56):

Well, right, and let me just give you a personal example that shows you how it's used. So I have an SVT [supraventricular tachycardia], and that means at times my resting heart rate of 55 goes to 170. It's uncomfortable. It's not life-threatening. I was really concerned: oh, might I have underlying cardiomyopathy, valvular disease, coronary artery disease? So Paul Friedman and Peter Noseworthy said, hey, we're going to take a six-lead ECG wearable, send it to your home and just record a bunch of data in your activities of daily living. And then, via 5G cell phone, we'll be collecting those six leads and we'll run it through all of our various validated AI systems. And then we'll tell you, based on what the AI suggests, whether you're at high risk or not for various disease states. So it says your ejection fraction is 70%. Oh, good. Don't have to worry about that. Your likelihood of developing AFib 3%, cardiomyopathy 2%, valvular disease 1%. So the bottom line is, without even going to a bricks-and-mortar facility, I have these validated algorithms at least doing a screen to see where maybe I should get additional evaluation or not.

Eric Topol (05:12):

Yeah, well, see, what you're bringing up is a whole other dimension. So on the one hand, what we talked about was that you could take the primary care doctors, who don't read electrocardiograms very well, and supercharge them by having a deep learning interpretation set for them. But on the other, now you're bringing up this other patient-facing story where you're taking a cardiogram when somebody's perfectly fine. But from that, from having deep learning of cardiograms, millions of cardiograms, you're telling what their risks are, that they could develop things like atrial fibrillation. So this is starting to span the gamut of the phase that we went through, or are still going through, which is taking medical images, whether it's a cardiogram or a scan of some sort, and seeing things with machines that human eyes really can't detect or perceive. So yeah, we're just starting to get out of the blocks here, John. And you've already brought up a couple of major applications that were not even potentially used three, four or five years ago, where Mayo Clinic's leading the charge, right?

The Power of Machine Eyes

John Halamka (06:26):

Well, yeah, and let me just give you two other quick examples. These are in studies now, right? So they're not ready for active patient use. The animate GI product does an overread of endoscopy. And what we're finding is that the expert human, I mean anywhere in the world, expert humans miss about 15% of small polyps. They're just hard to see. Prep may not be perfect, et cetera. The machine misses about 3%. So that's to say a human augmented with overread is five times better than a human alone. Pancreatic cancer: my father-in-law died about 11 years ago of stage four pancreatic cancer. So this is something that I'm very sensitive about. It's very often diagnosed late, and you can't do much. What we've been able to see is, looking at early films that were taken in pancreatic cancer, abdominal CT scans and these sorts of things, algorithms can detect pancreatic cancer two years before it is manifested clinically. And so here's the ethical question I'll pose to you. I know you think about a lot of this. Scripps, Mayo, UCSF, Stanford: we probably have thousands and thousands of abdominal CTs that were read as normal. Is it an ethical imperative, as these things go through clinical trials and are validated and FDA approved, to rerun algorithms on previous patients to diagnose disease we didn't see?
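The "five times" comparison follows from the two miss rates quoted in this answer. Below is a minimal sketch of that arithmetic, assuming (as the answer implies) that a human working with the overread ends up at the machine's roughly 3% miss rate. The cohort size and the function name are illustrative and not from the conversation.

```python
def missed_polyps(total_polyps: int, miss_rate_percent: int) -> int:
    """Number of small polyps missed out of total_polyps at a given miss rate."""
    return total_polyps * miss_rate_percent // 100


TOTAL_POLYPS = 1000       # hypothetical cohort size, chosen for round numbers
HUMAN_MISS_PERCENT = 15   # "expert humans miss about 15% of small polyps"
MACHINE_MISS_PERCENT = 3  # "the machine misses about 3%"

human_missed = missed_polyps(TOTAL_POLYPS, HUMAN_MISS_PERCENT)
overread_missed = missed_polyps(TOTAL_POLYPS, MACHINE_MISS_PERCENT)

# The ratio of misses is what the "five times" figure refers to.
fold_fewer_misses = human_missed / overread_missed

print(f"Missed without overread: {human_missed} of {TOTAL_POLYPS}")
print(f"Missed with overread:    {overread_missed} of {TOTAL_POLYPS}")
print(f"Fold reduction in misses: {fold_fewer_misses:.1f}x")
```

Read this way, the fivefold figure describes missed polyps; the corresponding detection rates would be about 85% versus 97%.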

Eric Topol (08:03):

Well, that is a really big, important question, because basically we're leaving all this stuff on the table that doesn't get diagnosed, can't be predicted, because we're not even looking for it. And now, whether it's the retina, which is a gateway to so many systems of the body, or, as you're mentioning, various scans like an abdominal CT and many others, like mammography for heart disease risk, there are all sorts of things that weren't even contemplated that machine eyes can do. So it's really pretty striking, and upending cancer diagnosis, being able to understand the risk of any individual for particular types of cancer so that you can catch it at the earliest possible time, when it's microscopic, before it spreads. This, of course, is a cardinal objective. People don't die of cancer per se. They die of its metastasis, of course, for the most part. So that gets me now to the next phase of AI, because what we've been talking about mostly so far is what has been brewing and culminating for the last five years, which is medical images, and there are so many things we can glean from them that humans can't, including expert humans in whatever discipline of medicine.

Multimodal AI and Social Determinants of Health

(09:19):

But the next phase, which you are starting to get at, is the multimodal phase, where you're not just taking the images, you're taking the medical records, the EHRs, you're getting the genomics, the gut microbiome, the sensors. You mentioned one, an ECG, a cardiogram sensor, but there are other sensors, like on the wrist. You're getting the environmental things like air pollution, air quality and various things. You're getting the whole ball of wax for any given individual. Now, that's kind of where we're headed. Are you doing multimodal AI? Have you already embarked on that new path, now that we have these large language models?

John Halamka (10:02):

And we have, and so like anything we do in healthcare innovation, you need a Pareto diagram to say, what do you start with and where do you go? So in 2020, we started with all of the structured data: problems, meds, allergies, labs. Then we went to the unstructured data: billions of notes, op reports, H&Ps. And then we moved to telemetry, and then we moved to CT, MRI, PET. Then we moved to radiation oncology and looking at all the auto-contouring profiles used in linear accelerators, and then to omics, and now we're moving to inferred social determinants of health. And let me explain that for a minute.

(10:45):

Exposome, as you point out, is really critical. Now, do you know if you live in a Superfund site area? Do you know what risks you might have from the PM 2.5 particulates that are blowing through San Diego? Probably you don't. So you're not going to self-report this stuff. And so we have created something called the House Index, where we've taken every address in the United States, and based on the latitude and longitude of where you live, we have mapped air, water, and land pollution, access to primary care, crime, education, and grocery stores, and therefore we can infer about 40 different things about your exposome just from where you live. And that's a mode. And then, as you say, we're now starting to gather remote patient monitoring. We have this acute advanced care in the home program where we're taking serious and complex illness, caring for the patient in the home, starting to instrument homes and gather a lot more telemetry. All of that multimodal data is now available to any one of the 76,000 employees of Mayo and our partners for use in algorithm development.
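The idea of inferring exposome features from where someone lives can be pictured as a lookup from latitude and longitude into a table of neighborhood attributes. The sketch below is only an illustration of that idea: the grid scheme, the field names, and every value in the table are made-up placeholders, and nothing here reflects how Mayo's index is actually built.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NeighborhoodProfile:
    """Hypothetical subset of location-derived attributes (not Mayo's schema)."""
    pm25_ug_per_m3: float
    km_to_primary_care: float
    grocery_stores_within_2km: int


def grid_cell(lat: float, lon: float, cells_per_degree: int = 100) -> tuple[int, int]:
    """Map a coordinate to a coarse grid cell (about 1 km at this resolution)."""
    return (math.floor(lat * cells_per_degree), math.floor(lon * cells_per_degree))


# Placeholder table keyed by grid cell. The numbers are invented for illustration.
PROFILES = {
    (3271, -11717): NeighborhoodProfile(
        pm25_ug_per_m3=9.5,
        km_to_primary_care=1.2,
        grocery_stores_within_2km=4,
    ),
}


def infer_profile(lat: float, lon: float) -> Optional[NeighborhoodProfile]:
    """Return location-derived attributes for an address, if the cell is mapped."""
    return PROFILES.get(grid_cell(lat, lon))


# A coordinate in downtown San Diego falls in the one cell defined above.
profile = infer_profile(32.7157, -117.1611)
print(profile)
```

The point of the design is that the patient reports nothing: every attribute is derived from the address alone, which is why it can cover people who would never self-report these exposures.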

Eric Topol (11:58):

Yeah, no, that's extraordinary. And I also would say the social determinants of health, whose importance you've really gotten into. There are so many papers now over the last several years that have emphasized that your zip code is one of the most important things for your health. And it's not even just a zip code. It's your neighborhood within that zip code, for the reasons that you've mentioned. And inferring that and imputing that with other sources of data is vital. Now, with this multimodal data, you've again anticipated one of my questions: the possibility that we can gut hospitals as we know them today. Yes, preserving the ICUs, the emergency departments, the operating rooms, but what about those other people who occupy the vast majority of beds in the hospital and are not very sick or critically ill? Do you think we're going to move, as you're innovating at Mayo, to where we'll be able to keep those people at home for the most part in the years ahead? I mean, this isn't going to happen overnight, but do you think that's where we're headed?

The Hospital-at-Home

John Halamka (13:08):

So to date, Mayo and its partners have discharged about 23,000 patients from their homes. And as you can guess, we have done clinical trials and deep dive studies on every one of the patients' journeys. And what have we seen across 23,000 patients? Well, generally, about 30% of patients who present for acute care to an emergency department, who come in by ambulance, are appropriate for care in non-traditional settings. I mean, I think you would agree, somebody with episodic ventricular tachycardia, you're probably not going to put in a home setting, but somebody with congestive heart failure, COPD, pneumonia, I mean, these are things where, as you say, if they're going to get sicker, it will be over hours, not minutes, and therefore you can adjust. In these more than 20,000 patients, what we've seen is the outcomes are the same, the quality is the same, the safety is the same. Patient satisfaction: you get net promoter scores in the mid-nineties. You find me a hospital with a net promoter score in the mid-nineties. You're eating your own food, sleeping in your own bed. Oh, your granddaughter's coming at 2:00 AM on a Sunday? Whatever. And then ask yourself this other question: nosocomial infections.

Eric Topol (14:31):

Right?

John Halamka (14:31):

How many methicillin-resistant staph infections do you have in your office? You're like, none, right? So your infections and falls go down. So okay: better, stronger, cheaper, faster. And the safety and the quality are there, so for about 30% of the population this should be a standard of care.

Eric Topol (14:56):

That's really big. So you don't think we have to do randomized trials to prove it?

John Halamka (15:01):

I mean, we have done enough studies to date, and there are organizations, Kaiser Permanente, Cleveland Clinic, all these folks who are joining us in investigating these areas. And the data is very compelling.

Patients Asking Questions to LLMs

Eric Topol (15:17):

Yeah, that's really exciting. And we may be able to jump past having to go through the large trials to prove what you just reviewed. So that's one thing, of course, that we're looking for in store. Another is the patient doing advanced large language model searches. So as you and everyone know, we've done Google searches for years about symptoms, and inevitably people come up with hypochondria because they have some horrible disease that they looked up that is not a very good match for their specific condition and their background. But already today, we have people going into Bing creative mode, GPT-4 and other searches, and they're getting answers about their diagnosis and about what's the best literature and best treatments and expectations. That won't be FDA regulated. We don't have regulation of Google searches. So how do you see the democratization of large language models, with patients having conversations with these chatbots?

John Halamka (16:32):

And of course, you ask a question no one has answered yet, but here are a few threads. So we know the challenge with existing commercial models is that they're trained on the public internet. Some are trained on additional literature like PubMed or the MIMIC dataset, but none are trained on the rich clinical experience of millions and millions of patients. So therefore, they don't have the mastery of the care journey. So the question we are all asking, and again, no one knows: can you take a GPT, BARD, or a MedPaLM, add additional pre-training with rich de-identified clinical experience, and make it a better model for patients who are going to ask questions? We've got to try, and we've got to try within guardrails and guidelines, but we definitely want to explore that. Can you or should you train a foundational model from scratch so that it doesn't have the bias of Reddit and all of the various kinds of chaff you find on the public internet? Could be very expensive, could be very time consuming. Probably society should look at doing it.

Eric Topol (17:50):

So this is just a review for those who are not up to speed on this: this means setting up a base model, which could take 20,000 to 30,000 graphics processing units, a big expense. We're talking about tens of millions, but to do it right, so it isn't just a specialized fine-tuning of a base model for medical purposes, but something that's de novo intended, that no one's done yet. Yeah, that's I think a great idea if someone were to go down that path. Now, early on when we were talking, you mentioned partners, not just other health systems, but one of the important partners you've established that's been out there is Google, which I think set up shop right in Rochester, Minnesota, so it could work closely with you. And obviously they have MedPaLM2, they have BARD, they've published a lot in this space. They're obviously competing with Microsoft and others, but it seems like it's mainly an arms race between those two and a few others. But how is that relationship going? And you also were spot on about the concerns of privacy, federated AI, privacy computing. Can you tell us about Mayo and Google?

What is the Collaboration Between Mayo and Google?

John Halamka (19:06):

Well, absolutely. So Google provides storage, compute, and various kinds of tools like their FHIR engine for moving data between various sources. Google does not have independent access to any of Mayo's data. So this isn't a situation of, we have a challenging medical or engineering problem, bring 60 Google engineers to work on it. No, what that means is they help us create the tooling and the environment so that then those with permission, Mayo employees or Mayo's partners, can work through some of these things and build new models, validate models. So Google has been a great enabler on the tool set and building scale. You probably saw that Eric Horvitz gave a recent grand rounds at Stanford where he explained that scale makes a difference, and that you start to see these unexpected behaviors, this emerging goodness, when you start dealing with vast amounts of multimodal data and vast amounts of compute. And so working with a cloud provider is going to give you those vast amounts of compute. So again, privacy is absolutely essential: de-identify the data, protect it, control it. But you can't, as an institution, get enough computing power locally to develop some of these models.

Towards Keyboard Liberation and Machine Chart Review

Eric Topol (20:36):

Well, that goes back to the dilemma about building a base model, with just the capital costs, no less. You can't even get these GPUs at scale because the supply and demand mismatch is profound. Well, there are two other areas I want to get your impressions about. One, of course, is the change of interactions with patients. So today, as you well know, having all these years overseeing the informatics at Beth Israel and now Mayo, there's the issue of the keyboard and the interference that it provides, not just as a data clerk burden to clinicians, which is horrible for morale, with all the hours even after seeing patients that have to be put into charting through the EHRs and these clunky software systems that we are stuck with, but also the lack of even having face-to-face eye contact with patients in that limited time they have together. Now, there are many of these so-called ambient AI tools, natural language processing using large language models, that are of course turning that conversation not just into a remarkable note, but also, of course, for any part of the note, you could go back to the raw conversation. So it has trust embedded as to what was really said. And then you have all these downstream functions like prescriptions, follow-up appointments, nudges to the patients about whatever, like their blood pressure or things that were discussed in the visit. You have translation to the patient at their level of education so they can understand the note. You have things that we never had before. You have orders for the test or follow-up appointment, pre-authorization. What about these, John? Are these the real deal, or are we headed to this in the near term?

John Halamka (22:41):

So 10 years ago, I said, all of these meaningful use criteria, all the keyboarded data entry, structured data and vocabularies: what if the doctor and the patient had a conversation and the conversation was the record? That was the legal record. And then AI systems extracted the structured data from the conversation. And there you would have satisfaction by both patient and doctor and a very easy source of truth. Go back to what was said. And of course, 10 years ago everyone said, that'll never happen. That's too far.

(23:20):

And so I'll give you a case. My mom was diagnosed with a brain abscess about a year ago. She's cured of the brain abscess. I, with ambient listening, had a conversation with my mother and it went something like this. Yes, I started to develop a fever. I said, oh, and you live alone, right? Oh, yes. My husband died 13 years ago. The note comes out: the patient is an 81-year-old widow. So we're having a conversation about my father dying and that she lives alone. And I didn't use the word widow, she didn't use the word widow. And so what it shows you is these systems can take detailed conversation, turn it into abstract concepts and record them in a way that's summarized and meaningful. Last example I'll give you: recently, I did grand rounds at Mayo and I said, here's a challenge for all of us. It's Sunday at three in the morning. Mrs. Smith has just come in. She has a 3000 page chart, 75 hospitalizations and four OR visits. Her complaint tonight is, I feel weak,

Eric Topol (24:38):

Right? That's a classic.

John Halamka (24:43):

How are you going to approach that? So we have an instance of MedPaLM2 that is containerized, so I was able to put a prompt in it with some background data. It was all de-identified, and it was all very secure. So I put the 3000 pages into this MedPaLM2 container and said, audience, ask any question that you want. Oh, well, what medication should she be taking? What's her follow-up plan? Were there any complications in any of her surgeries? And within seconds, every answer to every question just appears. They say, oh my God, I can now treat the patient. And so this is real. It absolutely is. It's not perfect, but give us a couple of quarters.

Eric Topol (25:31):

Yeah, quarters, not even years. I think you're putting the finger on something that a lot of people are not aware of, which is when you have complex patients like the woman you just described, and you have so much information to review, no less the corpus of the medical literature, you have help with diagnoses and treatments that you might not otherwise have thought of. It also gets me back to a point I was going to make about the machine vision during colonoscopy, where it does pick up these polyps. It was shown that at the end of the day, in the afternoon, for gastroenterologists who are doing colonoscopies all day, their pickup rate drops down. They get tired, their eyes are just not working as well. And here you have machines; they don't get tired. So these things are augmenting the performance of physicians and clinicians across the board, potentially.

(26:28):

And yes, there's a concern, as you touched on, about confabulation or hallucinations, whatever, but this is a work in progress. There will be GPT-X, BARD-15 or whatever else. Right now, another area that is hot, which is still in the earliest nascent stage, is the virtual medical coach, whereby for any of us, all our data, every visit we've ever had, plus our data that's accruing in real time, or scans or slides or whatever it is, is all being fed in and processed with the medical literature, helping us to prevent a condition that we would have high risk to develop or manifest, or to better manage the various things we do have that we've already declared. What about that, John? Are we going to see virtual medical coaches like the kind we see for going to the airport, or you have an appointment such and such about your daily life, or is that something that is way out there in time?

John Halamka (27:37):

I know you're going to hate this answer. It depends.

Eric Topol (27:41):

Okay. I don't hate that. I like it actually. Yeah.

John Halamka (27:44):

So some years ago, one of my graduate students formed a virtual coaching company, and what he found was patients would often start with a virtual coach, but they wouldn't stick with it because the value add wasn't necessarily there. That is, it wasn't that every day there was something new or actionable. And so if it's few and far between, why do you want to go through the effort of engaging in this? So I think our answer there is we need to make sure that the person who uses it is getting something of value for using it. Reduced insurance rates, free club memberships to a gym, whatever, something of value. So it gets some stickiness.

Virtual AI Coaching

Eric Topol (28:33):

Yeah. Well, it's still early, and right now, as you well know, it's really confined to certain conditions like diabetes or depression or high blood pressure. But it certainly has the chance in the years ahead to become broad for any individual. And that gets back to the patient scenario that you presented, where you had all the data of that woman who presented with weakness as the inputs. And just think about that happening in real time, giving feedback to any given individual, always thinking that it's optional. And as you say, maybe it'd be more effective if there were incentives, and if people don't want it, they don't have to use it, but it's something that's out there dangling as a potential. Well, of the things we've discussed, there are many potential ways that AI can be transformative in the future, for clinicians, for health systems, and for patients. Have I missed anything that you're onto?

John Halamka (29:40):

Just that in predictive AI, we can judge performance against ground truth. Did you have the disease or not? Did you get a recommendation that was followed up on and it was positive? With generative AI, measuring quality and accuracy, doing follow-up and oversight, is much harder. So I think what you're going to see is the FDA and the Office of the National Coordinator and the White House work through generative AI oversight. It's going to start with, as we've seen, voluntary oversight from some of the companies themselves. And it will evolve into maybe some use cases that are considered reasonable practices and others that we defer. Reasonable practices: hey, you want an agent that will pre-draft your email and then you just edit it, that's fine. And Mayo is live with that in Epic inbox. How about help you write a letter or help you take, as you say, a very complex medical condition and explain it in eighth grade English or a foreign language? Very good at all of that. Differential diagnosis, not quite ready yet. And so I think we'll start with the administrative use cases, the things that reduce burden. We'll experiment with differential diagnosis. And I don't think we yet have line of sight to say, actually, we're going to have the generative AI do your diagnosis.

(31:09):

Not there yet.
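Judging a predictive model against ground truth, as described in this answer, usually comes down to comparing each prediction with what actually happened and summarizing the agreement. The sketch below shows one common way to do that with sensitivity and specificity. The eight example cases are invented for illustration and the function names are hypothetical.

```python
def confusion_counts(predicted: list[bool], actual: list[bool]) -> tuple[int, int, int, int]:
    """Return (true_pos, false_pos, false_neg, true_neg) for paired labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    return tp, fp, fn, tn


def sensitivity(tp: int, fn: int) -> float:
    """Share of patients with the disease that the model flagged."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of patients without the disease that the model cleared."""
    return tn / (tn + fp)


# Invented example: model output versus confirmed diagnosis for eight patients.
model_says_disease = [True, True, False, False, True, False, False, False]
patient_had_disease = [True, False, False, False, True, True, False, False]

tp, fp, fn, tn = confusion_counts(model_says_disease, patient_had_disease)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"Sensitivity: {sensitivity(tp, fn):.2f}")
print(f"Specificity: {specificity(tn, fp):.2f}")
```

Generative output has no such single yes-or-no answer to score against, which is the contrast being drawn here.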

Machines Promoting Empathy

Eric Topol (31:10):

Right? Perhaps we'll never be, particularly for important diagnoses, maybe for routine things that are not a serious matter. One thing that I didn't anticipate, and I want to get your view. When I wrote Deep Medicine, I was talking about restoring the patient-doctor relationship and the gift of time that could be garnered from having this machine support. But now we're seeing the evidence that the AI can promote empathy. So for example, reviewing a doctor's note and telling the doctor, you didn't show you're very sensitive, you weren't listening, and making suggestions for being a more empathic physician or nurse. Did you foresee that too? Because you've been ahead of the curve on all this stuff.

John Halamka (32:04):

So here's an interesting question. You and I are physician-scientist-writers. How many physician-scientist-writers are there? Not so many. So what you get is brilliant math or brilliant science, and it is communicated very badly. So I did not anticipate this, but I'm seeing the same thing you are, which is you can take a generative AI and take something that is not very digestible and turn it into something highly readable. And whether that's empathy or clarity or whatever, it actually works really well.

Eric Topol (32:43):

Yeah. Yeah. I mean, I'm kind of stunned by this because the machines don't know empathy. They can't feel empathy, but they can promote it. And that's really fascinating. So this has been an uplifting discussion. For a lot of the things that are happening now, I give credit to you that you saw them coming long before others, and it's a real joy. So we've got to keep up with each other. We've got to do some more brainstorming on the things that we haven't discussed today. But thanks so much, John, for joining me and for being such a bright light for the work you're doing with Mayo Clinic as the president of its platform that's, no question, transforming the future of healthcare.

John Halamka (33:25):

Well, hey, thanks for having me. And I would say both you and I have taken the digital Hippocratic Oath. We will do no digital harm.

Eric Topol (33:33):

Love it.

Facts, data, and analytics about biomedical matters.