Data & AI for Improving Patient Outcomes with Terry Myerson, CEO at Truveta
Terry Myerson is the CEO and Co-Founder of Truveta. Truveta enables scientifically rigorous research on more than 18% of the clinical care in the U.S. from a growing collective of more than 30 health systems. Previously, Terry enjoyed a 21-year career at Microsoft. As Executive Vice President, he led the development of Windows, Surface, Xbox, and the early days of Office 365, while serving on the Senior Leadership Team of the company. Prior to Microsoft, he co-founded Intersé, one of the earliest Internet companies, which Microsoft acquired in 1997.
Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.
Key Quotes
As a software engineer, we learn every day based upon the telemetry of how our systems are being used. And if you drive a Tesla, Tesla's learning every day about the bumps in the roads and what you do as you drive that car. And if you watch Netflix, Netflix is learning every day about what movies you're watching and whatnot.
There's this incredible rate of learning that goes on in all those systems. And then when I think about all of these professionals in the healthcare system or the life sciences, they don't have that real-time learning. And this is such an important part of our world, our lives, our economy. And I'm just excited to try and bring that kind of real-time learning and iteration into this whole industry. I just think they deserve it. I think they will innovate faster and take better care of us.
The way health care learns today is historically something called a clinical trial, which offers incredible scientific rigor. However, they are slow and expensive and they're not iterative. When I think about how we all learn, that iterative experience of just asking questions, being able to visualize it the right way, that leads to the next question, and that leads to the next question. That iterative, continuous learning is something that's really not available for healthcare today. It will have a profound impact as it becomes available to everyone.
Key Takeaways
When working with healthcare data, it’s essential to balance data utility and privacy by following regulatory guidelines like HIPAA, utilizing de-identification techniques to ensure patient privacy while maintaining data usability.
Building high-quality, normalized datasets from semi-structured and unstructured healthcare data requires substantial data engineering efforts, including mapping across medications, diagnoses, and lab results.
To achieve regulatory-grade AI, invest in developing high-quality training and evaluation datasets annotated by clinical experts, ensuring transparency and accountability in model performance metrics like precision and recall.
Transcript
Richie Cotton: Hi, Terry. Thank you for joining us on the show.
Terry Myerson: Thanks for having me.
Richie Cotton: Excellent. So, to begin with, I just want to know, what is the current state of health records and what's the problem with this?
Terry Myerson: Well, I mean, it was eye-opening to me. You know, we go back to 2020, the pandemic was raging, and no one was learning from the data we collected during care. And I think we broke down the problem to three core issues. The data is very private.
There's a variety of regulations. There's the infamous HIPAA regulation, a bunch of state regulations. And privacy of this data is paramount. And so the data is not accessible. The second thing is that we all see multiple providers that are part of multiple organizations, so the data is fragmented across the providers.
So if you want to understand the longitudinal view of someone's health journey, you need to somehow acquire data from multiple providers and link it together. And then the final challenge is that data is unstructured. You have X-rays, you have the notes the doctor takes when you're visiting the doctor and saying, I've got this intense pain on my left knee.
It's similar to the pain I had on my right knee, which is similar to the pain my Aunt Edna had in her knee. You know, all of that's unstructured data in these notes. And really the only structured data in medical records is your medical bill. Everything else is unstructured. The...
that are not billed for, the side effects of any intervention that are done, which do not drive billing. Ultimately, the outcome of any intervention. That's all unstructured data in these notes and images. But that's what we need to really learn what's working and what's not.
Richie Cotton: Okay, so I think all three of those things you mentioned, so a trade-off between privacy and accessibility, then problems with data silos and even being able to discover the right bit of data, and then also how you deal with that sort of data, are very dear to the hearts of our audience.
So I want to quiz you on all of these in more detail. Before we get to that, can you give us some examples of how these things cause problems for people?
Terry Myerson: Well, I mean, the challenge that inspired the creation of Truveta is we just didn't know which therapeutics were working for COVID and which were not. We had President Trump tweeting that hydroxychloroquine is a cure for COVID. And you've got the World Health Organization tweeting, no, it's not.
And you're like, how could this possibly be? And the reality was that data wasn't available to cross-check who was right and who was wrong. And there was practical experience by individual physicians that had tried things out, but there was no time to run a clinical trial. What we needed to do was run a query to say, you know, a patient with COVID who was given hydroxychloroquine, what was the outcome three months later?
And the fact that we cannot do that, not just for COVID. You know, I had knee surgery, and I was offered a choice of which type of graft to use in my knee. Did I want to use my hamstring, or my quadricep, or a cadaver? And there's no data to query and say, hey, take a 52-year-old, BMI 30 individual with a certain level of activity who uses a hamstring graft.
What is the probability of retear versus someone who uses their quadricep? And the fact that we cannot ask and answer these questions, I think, impacts all kinds of issues with health care, which gets to the accessibility of health care, which gets to the cost of health care. I think data is the path to optimizing outcomes in everything in life.
And the fact that we do not have this most critical data of what's working, what the health outcomes associated with different interventions are, I think, is right at the core of great healthcare.
Richie Cotton: So just being able to ask questions about data and do data-driven decision making does seem incredibly important for anyone, whether you're a doctor or whether you've got your own health concerns. Lots of different cases. So this does seem like a very important thing. All right, maybe let's dive into what some of these problems you mentioned are and how we can go about fixing them.
So, you talked about healthcare data being incredibly private. How do you get that level of accessibility while keeping those important personal details about people's health private?
Terry Myerson: HIPAA, which is our healthcare privacy law in the United States, defines the process of de-identification, and de-identification has two paths under HIPAA. One is called Safe Harbor. With Safe Harbor, you strip out all of the PII, all the personal information: your social security number, birthday, your name, address. All of that gets stripped out, but also all time and all locations, because time and location are two critical elements of re-identification. If you know that a baby was born at a certain location at a certain time, you have a higher probability of re-identifying that person. So you have this one path where you strip out all time and location.
The problem is that's really tough for any sort of public health analysis. That's really tough if you're trying to look at outcomes that occur or interventions that occur at one point in time and outcomes. So the other path defined is that there's a statistical threshold of re-identification.
Now this gets much more complicated, and it's really an AI problem to say, okay, we're obviously going to strip out all the PII, but then we're going to start redacting elements of uniqueness. We're going to reduce the granularity of time from, like, down to the second, maybe down to the year, maybe down to the quarter, maybe down to the month.
We're going to redact location from zip five to zip four, to zip three, to county or state. But then you also need to consider, if it's a rare disease situation, then there's fewer people similar to this. You need less granularity; you may need to redact other details. You know, instead of saying a specific pancreatic cancer, maybe you say just pancreatic cancer, maybe you say just cancer. And so you're going through a statistical process to create this.
You want to provide data that balances utility and anonymity, but gets to this right statistical level of anonymity. And so you've got to take into consideration all kinds of factors. If you're doing a study which has no need for geography, take out all the geography, and then you can include more granularity in other places.
If you need a study that's going to have down-to-the-date granularity, I need to know if it's this absolute date of July 1, July 2, July 3, well then you need to have less granularity in location or maybe some of the other quasi-identifiers in the data.
And so, we've designed this system that takes these medical records and the things which are strict identifiers, like your name, and the quasi-identifiers, you know, your gender, your marital status, things like this. And then you have things which are really not identifying about you. And it statistically crafts a dataset that maximizes utility at the threshold of re-identification risk that HIPAA requires.
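The trade-off Terry describes, coarsening time and location until no patient stands out, can be sketched with a toy group-size check. This is only an illustration under stated assumptions: it uses k-anonymity (the size of the smallest group of records sharing the same quasi-identifiers) as a stand-in for the statistical threshold, which is not necessarily how Truveta or HIPAA measure risk. The record fields and the `generalize` and `smallest_group` helpers are hypothetical.

```python
from collections import Counter

def generalize(record, zip_digits, date_level):
    """Coarsen the quasi-identifiers of one record (hypothetical fields)."""
    date = record["date"]  # ISO string, "YYYY-MM-DD"
    date_out = {"day": date, "month": date[:7], "year": date[:4]}[date_level]
    return (record["zip"][:zip_digits], date_out, record["sex"])

def smallest_group(records, zip_digits, date_level):
    """Size of the smallest group of records that look identical after coarsening."""
    groups = Counter(generalize(r, zip_digits, date_level) for r in records)
    return min(groups.values())

records = [
    {"zip": "98101", "date": "2023-07-01", "sex": "F"},
    {"zip": "98102", "date": "2023-07-02", "sex": "F"},
    {"zip": "98103", "date": "2023-07-15", "sex": "F"},
    {"zip": "98104", "date": "2023-08-03", "sex": "F"},
]

print(smallest_group(records, 5, "day"))    # 1: every record is unique
print(smallest_group(records, 3, "month"))  # 1: the August record still stands alone
print(smallest_group(records, 3, "year"))   # 4: all four records are indistinguishable
```

A study that needs exact dates would have to give up detail elsewhere (for example, dropping location entirely) to reach the same group size, which is the balancing act described above.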
Richie Cotton: That's absolutely fascinating. Yeah, just the ways people can reconstruct who a person is from individual bits of information, then maybe matching with some other data. So I'm curious, does the amount of detail that you keep in the data set really matter? Do you need to know what the use case is for that data before you craft this partially anonymized
Terry Myerson: Well, we do restrict the use. The data is used for healthcare research. It's used to improve health outcomes. You know, we actually strictly forbid the data to be used for any sort of targeted advertising to physicians or patients. So the use case is very clearly about improving health outcomes.
And that's contractually something someone agrees to prior to getting any access to the data. But then, you know, we have studies being done that link moms and their children. And it's very important for studying maternal health and for studying vaccinations and what happens.
Now, if you want a linkage between moms and their children, then you're going to take a lot less of the other details in geography or time or diagnoses, because, you know, knowing this mother has two children instead of one child, again, that's a re-identification vector. And so, knowing, is it a public health study where you need geographic granularity?
Knowing it's a maternal health study that needs moms and their children linked? Knowing you're studying a specific procedure where you need down-to-the-second granularity of what's happening in the operating room? These are things which, you know, we've got this incredible de-identification team and risk analytics team focused on, creating the highest utility data sets to meet the needs of the study to be done.
Richie Cotton: That's really fascinating that you're going to require different elements from the data depending on what the analysis is related to.
Terry Myerson: To deliver the highest utility data set but also the minimal amount of information, because we want to protect the privacy of the underlying patients.
Richie Cotton: Yeah, it seems like it's quite a technical art balancing both these things, providing that
Terry Myerson: It's one hard problem.
Richie Cotton: Absolutely. So the second thing you mentioned was around the problem with data silos and data being fragmented and just in different formats and owned by different people.
I think this is something that basically every business struggles with, whether they're in health care or not. Can you talk a bit more about what the setup is and how you've dealt with these different silos?
Terry Myerson: So the approach we've taken is to build a consortium. It's a consortium of 30 of the largest health systems across the United States. They provide about 18 percent of the daily clinical care. And so you think about the doctor's visits you go to, your family goes to, my family goes to. Across all the United States, the members of our Truveta consortium collectively provide about 18% of the care.
It's over 800 hospitals, I think 20,000 sites of care. So it's a large corpus of data. It's not a hundred percent, but it's 18%. And so, you know, it's the broadest, deepest view from the provider's point of view. We then need to complement that with data from the payer's point of view, where you see the longitudinal view, because the payer sees all the events along the continuum.
You know, here in Seattle, Swedish is a member of Providence Healthcare, which is a member of Truveta, and Virginia Mason is a member of Common Spirit Healthcare, which is a member of Truveta. So we see this longitudinal view of all of my Swedish visits and then all of my Virginia Mason visits.
But if I go to University of Washington Health, I don't see those visits except when we compose in data from the payer. And so we built this consortium where we have full-depth data, and then we're supplementing it with third-party data feeds to provide this longitudinal picture. And as you and your listeners know, it's a lot of data engineering.
Whereas the prior problem is very statistical, you know, it involves challenges of re-identification of the de-identified data set, and it's an AI problem, this is very much a hard data engineering problem. First of all, I guess it starts with the business model problem.
How do you provide the incentives for people to provide their data? And then how do you do all the data engineering on a daily basis, in real time, for these heterogeneous data feeds? Again, there's so many different ways different doctors and different systems are writing out medications and writing out diagnoses, and you're normalizing all that data to one coherent, homogeneous, high-quality, complete data set.
The scale is the challenge there. It's just these fire hoses of information that, you know, we need to incent, secure, and normalize to be useful for research.
Richie Cotton: I think complaining about different data formats and data being in the wrong place is a universal thing that all data practitioners do. So in this case, you've got some interesting business problems, like how do you get data from lots of different organizations?
And then there's data engineering that comes afterwards. So maybe we'll talk about that first bit then. So, how did you incentivize everyone to agree to sharing their data in this case?
Terry Myerson: Two different things come to mind answering that question. The first one: there was this moment in time. You've got to roll back the clock to 2020, 2021. We had a global pandemic, and it was just, we don't have the data we need to take care of the world. And it was front and center in all of our lives.
We were locked in our houses, our children were going to school. And that created a call to action. That created a call to action to try something new, to do something new. And Truveta was part of that. And there is a business model here, which is that when a health system's de-identified data is used by someone else, they're receiving compensation.
And so, a health system makes their data available for this ethical research. Again, there's restrictions: none of it's to be used for sales and marketing. It's to be used to discover health outcomes for patients. And people are paying to access this de-identified data, and the providers of the data are receiving compensation for usage of their de-identified data.
Richie Cotton: Slightly sad that it took like a global pandemic to get people to realize that data is useful, but I guess in some ways it's a, it's a silver lining.
Terry Myerson: I think people might have known data was useful, but still, the idea to say, well, let's do something about it, let's put our data together with other health systems around the country so that we can really start asking and answering questions faster, quicker. I think the pandemic really, that's what brought me to the problem.
I wasn't in healthcare, I wasn't focused on healthcare data, but I was a citizen of the world watching television, looking at these tweets and saying, this is a real problem that we can't ask and answer these questions. And so, there was this moment in time where people were open-minded to this because it was so necessary.
And the business model was perfect. It's pretty simple. When, you know, Pfizer or Moderna, two customers we're public about, access Truveta's de-identified data, the providers of that data are receiving compensation, frankly, for their effort to provide us the data.
Richie Cotton: Post pandemic, is this attitude that data sharing is a good idea? Is that still happening?
Terry Myerson: We're still growing. I mean, we're still going. It's one thing to get started and it's another thing to build momentum as you go. And so we feel the pressure now to build momentum, to build these virtuous use cases, to show the scientific research that's resulted from this, to show the improvements in safety and effectiveness that come from studying this data.
And I think now if we can show that momentum in terms of patient outcomes, we're super optimistic. But, you know, all this data engineering has got to happen. You've got to find the researchers that are able to really use this data for good. And you've got to engage them and you've got to produce the results.
You got to produce the insights. I think with every data project, it's like, where are the insights? And you know, we're starting to see those, and we feel the pressure to keep building momentum.
Richie Cotton: I think it's important to note that it's all about being able to get to some kind of insightful answer or have an impact at the end of it. You also mentioned a lot of data engineering challenges with all this data from different sources. Can you talk me through whether you've been able to standardize the data at all, or what efforts make the data engineering easier?
Terry Myerson: I mean, we have a couple of white papers on our website, and I think our CTO is publishing some blogs on this. But we have a couple hundred engineers now entirely focused on the systems to, we call it, at least, term normalization, to normalize this data. It's taking all these semi-structured and completely unstructured terms and mapping them to ontologies. There's ontologies for diagnoses, there's ontologies for medications, there's ontologies for lab results. Yeah, there's ontologies for cancer staging. And you need to take the semi-structured or unstructured data we receive and translate it to a code.
You know, a numerical code that maps to a well defined ontological term. And that normalization process is something that we're running across billions of data points daily.
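As a rough illustration of what mapping free-typed strings to codes can look like, here is a minimal sketch for lab names. Everything in it is an assumption for illustration: the `LAB-0001` style codes are made up rather than drawn from any real ontology, and a lookup table this small stands in for what the transcript describes as a much larger system.

```python
import re

# Hypothetical codes for illustration only; a real pipeline would map to a
# published ontology rather than these made-up identifiers.
SYNONYMS = {
    "a1c": "LAB-0001",
    "hba1c": "LAB-0001",
    "hemoglobin a1c": "LAB-0001",
    "glycated hemoglobin": "LAB-0001",
    "covid": "LAB-0002",
    "covid 19 pcr": "LAB-0002",
    "sars cov 2 pcr": "LAB-0002",
}

def clean(text):
    """Lowercase, replace punctuation runs with a space, trim the ends."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def normalize_lab(text):
    """Return the code for a free-typed lab string, or None if it is unmapped."""
    return SYNONYMS.get(clean(text))

for typed in ["HbA1c ", "Hemoglobin A1C", "SARS-CoV-2 PCR", "ferritin"]:
    print(repr(typed), "->", normalize_lab(typed))
```

Returning `None` for unmapped strings keeps the gaps visible, so they can be reviewed instead of being silently guessed.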
Richie Cotton: Okay, so just to make sure I've understood this correctly by this unstructured data you mean like the doctors scribble down some notes about like what a patient's condition is and then mapping to the ontology means like turn those notes into like standard disease names or treatment names, things like that.
Terry Myerson: Yeah, so, you know, there are specific fields for medications and diagnoses and labs where doctors type in strings. They don't go through and select a specific lab. They go through, I'm looking for an A1C test, or I'm looking for COVID, you know. So when it goes into a field for labs, medications, or diagnoses, we consider that semi-structured, because we have a clue as to what that text is.
And then we take that medication string, and we break it down. You know, we normalize the dosage size. We normalize how it is administered: is it administered intravenously, or is it a pill, or is it an injection? We normalize, is it Tylenol? Is it a brand name or a generic? So we go through that process to break down the semi-structured string into units in a database.
Now, you then have the doctor's notes, which is where there's incredible insights to be found. That's where you get the side effects. That's where you get the outcomes. That's where you get the symptoms which led to the diagnosis. And we consider that completely unstructured, because there we need to use natural language processing to break apart that note into, you know, when is it a negation, or when is it saying, yes, I did have pain, or I didn't have pain.
There's a lot of discussion of your family when you visit and talk to your doctor. You know, you describe what your mom or dad had, or did your brother have something similar? So again, you need to use natural language processing to build a graph of what the doctor is writing about in these notes.
So the normalization process is sort of critical for the semi-structured and the unstructured data. But for the unstructured data, first, we need to break apart the concept map for the documents themselves. And you know, we just did some work on seizure frequency, to study outcomes for various medications for people having symptoms of seizures. And it's just fascinating, this NLP, because there's no structured or semi-structured data.
The doctor is writing, so you've got these notes that say Terry's seizures, you know, they were nightly, now they're weekly. And with all the various ways they can write nightly, once a day, each night, at the end of the day it comes down to a frequency of once per day.
But then you've got to interpret that that's the current state. And of course, in the prose, it used to be weekly, and there's a thousand different ways to say weekly. But you need to have this ontology for frequency of daily, weekly, hourly, and provide a link back to say this intervention has a certain frequency at a certain timestamp.
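To make the frequency example concrete, here is a small sketch that turns a few of the phrasings mentioned above into a single events-per-day number. It is a rule-based toy under stated assumptions, not the NLP system described in the interview: the `events_per_day` function, the phrase list, and the 30-day month are all choices made for illustration. It normalizes one phrase at a time and does not attempt the harder step of telling an earlier frequency from the current one.

```python
import re

# Events per day for each unit; a month is approximated as 30 days.
PER_DAY = {"hour": 24.0, "day": 1.0, "night": 1.0, "week": 1 / 7, "month": 1 / 30}
ADVERBS = {"hourly": "hour", "daily": "day", "nightly": "night",
           "weekly": "week", "monthly": "month"}
COUNTS = {"once": 1, "twice": 2}
PHRASE = re.compile(
    r"\b(?:(once|twice|\d+\s*(?:x|times))\s+)?"
    r"(?:a|per|each|every)\s+(hour|day|night|week|month)\b"
)

def events_per_day(text):
    """Map a single frequency phrase to events per day, or None if not recognized."""
    text = text.lower()
    for adverb, unit in ADVERBS.items():
        if re.search(rf"\b{adverb}\b", text):
            return PER_DAY[unit]
    match = PHRASE.search(text)
    if match is None:
        return None
    count_text, unit = match.groups()
    if count_text is None:
        count = 1  # "each night", "every week"
    elif count_text in COUNTS:
        count = COUNTS[count_text]
    else:
        count = int(re.match(r"\d+", count_text).group())  # "3 times", "3x"
    return count * PER_DAY[unit]

for phrase in ["nightly", "once a day", "each night", "twice a week", "3 times per week"]:
    print(phrase, "->", events_per_day(phrase))
```

The first three phrases all come out as one event per day, which is the point made above: many surface forms, one normalized frequency.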
Richie Cotton: Certainly with human language, particularly English, there's many different ways of saying things. And I think a few years ago, that would have been a terrifying problem to solve. I'm now wondering whether, with the advances of generative AI, this has become an easier task for you.
Terry Myerson: We're open-minded as to which models we use in which circumstances, and we're just trying to learn with each of them.
Those are essential ingredients that are making the job more tractable, but they're really just an ingredient. I mean, so much of this is about building the right training datasets that are very targeted and focused, because our bar, you know, the FDA has published guidance on what they need to accept evidence, and there's that regulatory-grade bar and providing that quantitative evidence on our training results.
You know, our training data and the outcome of our model versus the training data, and maintaining both training sets and evaluation sets that are similarly annotated so we can compare the results of the model to them. That's the system building for this. And these large language models absolutely help.
And the fact that they're all competing to get better and better is great. They're an ingredient in the solution.
Richie Cotton: One of the questions I've had recently is, when everyone's using the same large language models, how do you get that competitive advantage? And it sounds like a lot of the value of your work comes from not just using the generative AI tools. It's actually about cleaning the data, having high quality data sets, and then just doing the data engineering right.
And that sort of combination of data engineering and AI seems to be adding the value there. Is that about right?
Terry Myerson: Well, there's a tremendous amount of data engineering to create this data to be structured and normalized. But at least in healthcare research, we use regulatory grade as the bar: building those training data sets that are created by clinical experts, using that training data, and dividing that training data to have evaluation data.
So we can have measurable results. We can, you know, tell you the precision and recall of the model that produced this data. That transparency on the model outcomes and accountability to the modeling, that's the bar we're seeking and the bar the regulators hold us to. And so that's regulatory grade, which uses other AI as an ingredient. I think every industry is going to have its own challenges. But with these tools, which are kind of all black boxes in their own way, you've got to create a framework around them to say, what is the quality bar you are seeking?
What is the either fine-tuning or training data you're going to use? How are you going to measure the results? If you can just accept that it took a bunch of text, wrote some new text, and you're using it as marketing content or persuasive content, I think that's different than saying, let's go measure the outcome of this health intervention.
For that, we need more scientific rigor. And I think these models are amazing and can be an ingredient in the system, but you need to create a framework of accountability around these models.
Richie Cotton: You mentioned the phrase regulatory grade, so how can you measure the performance of these things or how do you measure the performance to make sure that it is suitable for regulators to audit?
Terry Myerson: Let's just create a simple example. We create a training set of data with, you know, human annotators. We divide that set of annotated data between a training set and an evaluation set. And so you feed the training set in as fine-tuning data. You then compare the outcome against the evaluation set and you provide transparency on what the results were.
Was the model accurate? Did it get it? How many false positives? How many false negatives were there in the evaluation set? And then what we do is we invest in creating more training data or more fine-tuning data until we reach an acceptable level of false positives and false negatives for the study to be done.
But this is an ongoing process to create more high quality data to fine-tune or train the model, and then more measurement apparatus against, well, how many false negatives or false positives is it producing on a particular question?
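The loop described above (annotate, split, compare against the held-out set, count false positives and false negatives) can be sketched in a few lines. This is a generic illustration, not Truveta's tooling: the `split` and `evaluate` helpers and the example labels are hypothetical, and the predictions are hard-coded where a real model's output would go.

```python
import random

def split(examples, eval_fraction=0.2, seed=0):
    """Shuffle annotated examples and hold out a fraction for evaluation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]  # (training set, evaluation set)

def evaluate(predictions, labels):
    """Count errors on the evaluation set and derive precision and recall."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p and y)
    fp = sum(1 for p, y in pairs if p and not y)
    fn = sum(1 for p, y in pairs if not p and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"false_positives": fp, "false_negatives": fn,
            "precision": precision, "recall": recall}

# Hypothetical evaluation set: expert labels versus model output.
labels = [True, True, True, False, False, False]
predictions = [True, True, False, True, False, False]
print(evaluate(predictions, labels))

train, held_out = split(range(10))
print(len(train), len(held_out))  # 8 2
```

Here the model makes one false positive and one false negative, so precision and recall are both two thirds. If those numbers miss the bar set for a study, the step described above is to annotate more data and measure again.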
Richie Cotton: So it sounds like a lot of the standard tools for machine learning, for just evaluating whether a model is good or not, they sort of work in your context as well when
Terry Myerson: Well, we're providing transparency to those measurements and we're investing in this creation of high quality training and evaluation data. And that's what we do. We build systems, you know, and then there's all this data engineering to continuously rev the versions of the models, to continuously glue together the extraction models with normalization models, and all that data engineering stuff.
But I do think the investment in high quality training and evaluation data sets, and then high quality measurement systems that provide transparency to all the constituencies, is going to be key to certainly any regulated industry using AI. And I certainly think healthcare is part of it, yeah.
Richie Cotton: Certainly the idea of having models that are reproducible, with some documentation on how they work and things like that, just seems incredibly important. All right. So I'd like to talk a little bit about impact. Do you have any success stories about new questions that you can answer more quickly, or some sort of healthcare outcome successes?
Terry Myerson: So the one that made it to the Today show, which was kind of fun. It's fun, but it's also meaningful. There's these weight loss drugs. Ozempic is certainly very well known, Mounjaro from Eli Lilly is also very well known, and there's a clinical trial that is being done right now with, I think, 800 patients.
And that clinical trial is supposed to read out the middle of next year. It's a comparative effectiveness study. And a clinical trial is obviously real scientific rigor, and it will publish in 2025. But there's this question, like, okay, there's, I actually don't know the number, tens of thousands, hundreds of thousands of patients taking these medications now to address obesity.
So why can't we use the data now to look at the relative comparative effectiveness? And so we did. We looked at the data and we saw that there was a very distinct outcome with regard to weight loss, Mounjaro versus Ozempic. And we showed that, like, you don't need to wait a year and a half to run a very finite clinical trial.
We're talking about 10 times as many patients available now in the data. So that was one that has gotten a lot of attention. There's also been more targeted studies, and I think all these studies come down to the human that faces the situation and the choice. But there was a study of two different stents, and it found that one of the stents was creating these major bleed events more than the other one.
And it's just interesting, because without this data, the health system is just going to use the cheaper stent, one stent, another stent. But you need data to say, there's horrible outcomes coming from this stent. Another study is related to a heart valve. You've got a heart valve placed in you, and it's supposed to last 10 years.
There was this one heart valve, this artificial valve, mechanical valve, that was showing this pattern of only lasting one year. And, like, without this clear data to look at, it got through the system, it got through the clinical trials, it got approved, but how do you monitor this?
And so, to actually see that this heart valve was failing or to see that the stent was causing a major bleed, or to see that, one of these weight loss medications has a statistically significant, much better outcome for weight loss. You know, these are just examples of but, you know, we have,, Moderna just gave a talk with a member of our team at a trade show in Atlanta, talking about studying rare disease patients in the sense that, , to study rare diseases, you need a lot of You need incredible scale because some rare diseases have, you know, it's a needle in a haystack problem, some rare diseases, there's only, 50, 75 people across the whole country that have some of these conditions.
And so only with a data set like this can you start to look at patterns. You may be the only person with that condition your doctor has ever seen. So wouldn't it help that person, help that physician, to look across patterns that have worked or not in the past? Those are just some of the examples. I find them all very exciting, actually.
Richie Cotton: That's absolutely fascinating. It just sounds like there are so many different possible healthcare use cases here. There are a lot of unanswered questions in healthcare at the moment, so this does seem like an incredibly important thing.
Terry Myerson: It is. I mean, the way healthcare learns today is historically something called a clinical trial, which offers incredible scientific rigor. But they are slow and expensive and they're not iterative. When I think about how we all learn, that iterative experience of just asking questions, being able to visualize it in the right way, that leads to the next question, and that leads to the next question. That iterative, continuous learning is something that's really not available for healthcare today, and I think it will have a profound impact as it becomes available.
Richie Cotton: You mentioned the idea that clinical trials are scientifically rigorous, but they are so expensive. Do you see having just this public data as a replacement for clinical trials? Or is this something that would be used as well as clinical trials?
Terry Myerson: I think this kind of real-world data will definitely replace some clinical trials. It will not replace all clinical trials. I mean, there's no replacement, I think, for trying an intervention out on a real human being and saying, is it safe or effective? But do we need as many placebo people in the clinical trial?
Can we compare an intervention to a standard of care that already exists? Once a drug or device has been approved in the market, you then move into what's sometimes called phase four clinical trials, which is monitoring the safety and effectiveness in the real world. Well, we can do that with data.
And so for the phase one clinical trials, I don't think there's any impact. I think we will be able to find patients faster. When we're looking for patients with specific conditions and can't find them, we'll be able to find patients for phase one clinical trials faster. As we move into the large-scale phase threes, I think there's going to be a shift over time in how we use synthetic control arms versus placebo arms. And once a new drug or therapy or device is released into the market, I think we're going to see that entirely shift to data-driven monitoring.
But I'm obviously a believer. I just believe this can have a huge impact. But certainly the phase one interventional arm is going to stay. We'll just find those patients faster. Phase three, placebo versus synthetic data arm: I think there's going to be an evolution there. And phase four can be all data.
So it's quite a big change. It just doesn't replace all clinical trials. It just requires us as a society to lean into how to embrace data, how to embrace big data, in this incredibly important process that is the core of how healthcare learns.
Richie Cotton: All right. So I suppose that makes sense: the later you are in the process, the more data you have already, so the more effective these data sources are going to be. And as well as health outcomes, I suppose the other aspect is monetary outcomes. Certainly, having moved to the US, it seems like people complain about the cost of healthcare.
It's very much a national sport. So, are there going to be any implications for the price of healthcare due to this better use of data?
Terry Myerson: Well, I think knowing which interventions will have the best outcomes will have a profound impact on which interventions are done, which will have a profound impact on the cost of providing healthcare. But I feel like that's an implication of us providing all this data for great health outcomes research. We're focused on which interventions have the best outcomes. Providing access to that care, and how that care is reimbursed... my brain's full with our scope, and the economics of the system is not something I'm spending a lot of time thinking about. It's just a practical reality for me.
I'd like to pontificate about it with you, but I'm just not.
Richie Cotton: Okay. All right. So, I'll get a health economist on the show, and we'll do another episode.
Terry Myerson: I mean, I can broadly pontificate. I can help you ask questions of somebody, but I won't be the big answerer for that whole topic of accessibility to care and cost of care.
Richie Cotton: Okay. In terms of access to this data, who should care about it? Is it just health researchers or doctors, or can individuals find out or answer useful questions about themselves by having access to this sort of data?
Terry Myerson: Well, today the company's really geared for what I would call a B2B engagement. We have schools, government agencies (the CDC is a great customer), pharmaceutical companies, and health care systems. We're not yet geared for the citizen scientist or the individual, but it's something I would like to do.
And that's, it's kind of an excuse, Richie, but it's more like it just takes time to build a company and build capabilities to engage and support an individual. And I'd love to help every clinician worldwide be an expert.
I'd love to help every family make more informed decisions about their care. It's just that, on our company-building journey, that's not a 2024 thing, maybe 2025. It's just the pragmatic realities of providing the interface and the tools and making them available to an individual versus a school, the CDC, or Pfizer. It's a different motion for our company.
And we're a small company and we're growing fast and we're trying hard to delight the customers we do have.
Richie Cotton: So, it's for people in the healthcare industry for now, and then maybe I'll get you back on the show in a couple of years and we'll talk about the opportunities for individuals.
Terry Myerson: Absolutely. Yeah. For the citizen scientist, we need probably at least six months. I mean, I know what our plans are for the next six months, and I don't think it's going to make it.
So now you know. I hope we're there.
Richie Cotton: So, when we talk about the different areas where all this data can be used, you came up with lots of different examples. Do you think there are any areas that are higher priority for making use of all this data?
Terry Myerson: I want to say no. You know, I've had cancer in my family. I've had knee surgeries in my family. One of my work colleagues has a daughter with a rare disease. We are very much thinking about this as building a platform that can think about any drug, disease, or device. And we're just not thinking about this like, oh, the biggest opportunity is cardiovascular, or the biggest opportunity is cancer, or the most important area is rare diseases.
We're thinking about this horizontally because I think health is so personal and every family is different. And I actually think it's all connected. I don't know this, but you could use this fancy term, comorbidities. I mean, it's a giant graph of the connections between all these symptoms and diagnoses and lifestyles and genetics and all these things.
I think we're going to learn more by thinking about it horizontally and connecting all this data together so that all the questions can get asked and hopefully answered in a coherent way. So no, we're not prioritizing one area over another. And we're thinking very horizontally.
Richie Cotton: Are we at the point now where you can ask almost any healthcare question you want, or is there still more data to be collected or more work to be done to get to that point where you can do data-driven decision making on any question?
Terry Myerson: You know, I want to say you can ask any question, but will you get the complete answer? I mean, there is data that we would like to onboard to the system that we don't have integrated. There's an amazing amount of questions you can ask or answer, but no. I mean, there's just data, and you're going to say which data and I'm not going to answer, but there is data that we would like to onboard that we don't have on board today.
We're working hard at it and we're following the signal. What are the types of questions? How do we facilitate drug discovery? How do we facilitate health economics with health outcomes research, to our prior conversation?
I mean, these are some of the questions we think about in just providing a solution to the customer. And we're working hard at it.
Richie Cotton: All right, so just to wrap up, what are you most excited about in the world of healthcare research and data?
Terry Myerson: Oh gosh, I think I go back to the founding of the company, Richie, and I think about how, as a software engineer, we learn every day based upon the telemetry of how our systems are being used. And if you drive a Tesla, Tesla is learning every day about the bumps in the roads and what you do as you drive that car.
And if you watch Netflix, Netflix is learning every day about what movies you're watching and whatnot. And there's this incredible rate of learning that goes on in all those systems. And then when I think about all these professionals in the health care system or the life science, they don't have that real-time learning.
And this is such an important part of our world, our lives, our economy. And I'm just excited to try and bring that kind of real-time learning and iteration to this whole industry. I just think they deserve it. So they can just do it. I think they will innovate faster and take better care of us. And like all of your listeners, I'm just a believer in data having this profound impact on how we do our jobs.
And I'm personally just so passionate about bringing that data to this whole industry.
Richie Cotton: It's a very nice story, just the idea that a ton of data engineering and data cleaning can, you know, improve health care outcomes. So yeah, that's fantastic stuff. All right. Thank you for your time, Terry.
Terry Myerson: Absolutely. Thanks for having me, Richie.