AHLA's Speaking of Health Law

AI in Health Care: Managing Algorithmic Bias and Fairness

AHLA Podcasts

Brad M. Thompson, Partner, Epstein Becker & Green PC, Chris Provan, Managing Director & Senior Principal Data Scientist, Mosaic Data Science, and Sam Tyner-Monroe, Ph.D., Managing Director of Responsible AI, DLA Piper LLP (US), discuss how to analyze and mitigate the risk of bias in artificial intelligence through the lens of data science. They cover HHS’ Section 1557 Final Rule as it pertains to algorithmic bias, examples of biased algorithms, the role of proxies, stratification of algorithms by risk, how to test for biased algorithms, how compliance programs can be adapted to meet the unique needs of algorithmic bias, the NIST Risk Management Framework, whether it’s possible to ever get rid of bias, and how explainability and transparency can mitigate bias. Brad, Chris, and Sam spoke about this topic at AHLA’s 2024 Complexities of AI in Health Care in Chicago, IL.

To learn more about AHLA and the educational resources available to the health law community, visit americanhealthlaw.org.

Speaker 2:

This episode of AHLA's Speaking of Health Law is brought to you by AHLA members and donors like you. For more information, visit americanhealthlaw.org.

Speaker 3:

Hello, this is Brad Thompson. I'm a partner at the law firm of Epstein Becker & Green, and I'm delighted today to be joined by two colleagues. Together, the three of us did a presentation in Chicago at the AHLA conference on the use of AI in healthcare, and apparently enough people were interested that they asked us to explore the topic even further in this podcast. So we're delighted to do that. We're going to cover basically how you analyze the risk of bias and how you mitigate it, but looking through the lens of a data scientist. My two colleagues today are both data scientists, and I'd like them to introduce themselves. We've got Chris Provan and Dr. Sam Tyner-Monroe. I'd love to hear not just a little bit about your backgrounds, but maybe help people understand what a data scientist is, what that even means. So, Chris, maybe could you go first?

Speaker 4:

Yeah, absolutely. Thanks, Brad. My name is Chris Provan. I'm the Managing Director for Mosaic Data Science. We are a small boutique consultancy that builds AI-driven tools to help optimize and improve operations across a wide variety of industries, ranging from healthcare to retail to industrial manufacturing to energy and utilities. As for my own background, I was trained in operations research, so I was really focused on how to optimize complex systems. In that world, I spent a lot of my time building models of how these systems operate based on first principles, looking at, for example, how inventory flows through a supply chain and what the different impacts are of delays and capacity issues throughout that system. I was really developing models of the system itself. I have since moved my career into the world of data science and have been a data scientist for about the last 10 years. The real difference there is that I focus more on building models of the relationships in the data that can then help inform what's happening in the real world and how we can predict or optimize what's happening. So as a data scientist, the way I would define it is that we use data, plus tools like statistical models and machine learning models, to extract real, actionable insights from that data. Then we work with the operational users of our tools, the clinicians and other folks, to figure out how to actually use that information in a meaningful way to make decisions, to automate decisions, and to positively impact the outcomes for their patients, their customers, and their businesses.

Speaker 3:

Terrific. Thank you. Sam, do you want to introduce yourself?

Speaker 5:

Thanks, Brad. I am Sam Tyner-Monroe. I am the Managing Director of Responsible AI at DLA Piper. I joined in the last 18 months or so as part of the AI and Data Analytics group. In that group, we help our clients with their AI governance, we help test their AI models, and we also help red team their generative AI models. As a little bit of background, I have a PhD in statistics from Iowa State University. I have worked at the intersection of law and data science, or law and statistics, for the last several years. I started that in graduate school at the Center for Statistics and Applications in Forensic Evidence, where I was looking into the forensic sciences. I subsequently worked for the Bureau of Labor Statistics, and then I transitioned to becoming a data scientist in the last several years as I have worked, again at law firms, helping clients test their models for AI bias.

Speaker 3:

Thanks, Sam. So before we get into how you analyze bias and what you do about it, I thought I'd give a couple-minute update on some new regulations that have come out at HHS. First, it's not a regulation, but FDA has become more and more interested in looking at bias in connection with new medical devices that are AI-driven. They've come out with several guidance documents of relevance, but basically their position is that an AI-driven medical device that has lower accuracy for a given subpopulation, like Black women, for example, is not safe and effective for Black women. Elsewhere around HHS, we've got the Office of the National Coordinator, which earlier this year, in January, came out with a new rule that requires developers to provide a lot more information to users about potential bias so that users can make more informed decisions. This includes providing what are called source attributes, many of which relate to, or are useful for, analyzing whether an algorithm has a bias or not. Developers also have to do risk management, which they have to make publicly available through a hyperlink. But the big rule came out just days before our May meeting, when we were together in Chicago at the AHLA AI conference. That's a rule from the Office for Civil Rights at HHS that goes into effect 300 days after it was published, which puts it at just about the beginning of March 2025. The rule is actually very short, basically three sentences. The first sentence is a general prohibition that tracks the statute pretty closely. It says a covered entity, and most of you probably know what a covered entity is in healthcare regulation, typically a provider or a payer of some sort, must not discriminate on the basis of race, color, national origin, sex, age, disability, and so forth in health programs and activities through the use of a patient care decision support tool. There's another regulation that defines what that is. I won't read it all to you; it's not that interesting. But basically it's any algorithm that's used in delivering patient care on the clinical side. So administrative tools like scheduling would not be swept within this rule, but anything that affects the care patients get would be. That general prohibition is not new in a sense; it's based on Section 1557 of the Affordable Care Act and is just an interpretation of it. What is new are the next two sentences. The first places a burden on users, so again, payers or providers: a duty to make reasonable efforts to identify tools that may employ these sensitive factors in making decisions. The vigilance required is not vigilance for bias per se, but for the risk of bias through the use of patient-sensitive information. So all covered entities need to be vigilant now in looking for algorithms they use that employ, are based on, or otherwise make use of these sensitive attributes. And finally, if you find such an algorithm, you're supposed to mitigate the risk, and we'll talk about that in just a second. On the duty to identify the risk, the preamble speaks to it.
It basically says, look, if you knew or should have known that a tool could result in discrimination, even if it doesn't actually do so, then you need to do risk mitigation. So what you're looking for, again, is not bias necessarily; you're looking for the risk of bias. In the Federal Register notice, they give a whole slew of publicly available information sources: the popular media, trade media, rulemakings and bulletins from HHS, peer-reviewed medical journal articles, and so forth. They really expect you to survey the landscape, identify those uses of algorithms that could result in bias, and then employ risk mitigation. When it comes to enforcement, they're pretty plain in saying, look, we're going to hold larger organizations to a higher standard; they have more resources, so more will be expected of them. We're going to scrutinize organizations that use these tools off-label, because if you're using a tool off-label, you really can't rely on the developer to have evaluated it for that use; they didn't have that use in mind. They're going to look for actual knowledge, where someone in your organization, for example, knew that there was a problem. And they're going to look for compliance programs. That takes us to mitigation. In mitigation, they expect you to use a classic compliance program: written policies and procedures, established governance measures for making decisions, monitoring of potential impacts, training of staff, all the normal things that go into a compliance program. They also reference the Artificial Intelligence Risk Management Framework that NIST put out, which is a framework specific to AI and to evaluating risk in AI. So that's what's going on, and that's what's driving the interest in this topic. With that, I'd like to turn to my panelists and start the conversation. It would be helpful, Sam, if you'd be willing to start us off by giving an example or two of a biased algorithm, just so that we can have something in our mind's eye as we go through this discussion. What would a biased algorithm look like? Can you think of any examples, hypothetical or real?

Speaker 5:

Yeah, absolutely. There are several real examples of biased AI systems or algorithms in the healthcare space. One example that comes to mind is from 2019, when Obermeyer et al. discussed racial biases present in a care management algorithm, which led to Black patients with greater need receiving less care than similarly situated white patients under the same system. The bias, they found, was caused by the way the model was designed: healthcare cost was used as a proxy for healthcare need, which led to patients with high need, but lower costs because they didn't necessarily have the ability or means to obtain care, being given less care in the end. So that's one instance of how algorithmic bias has arisen in the healthcare space. Another example: there are several recent suits against health insurers alleging that they are using algorithms designed to predict one thing in order to do something else. In one instance, there's an algorithm designed to predict length of stay, which is supposed to help doctors and healthcare providers determine how long someone might need to be monitored. Instead, it's allegedly being used by insurers to deny coverage past a certain date. So if a doctor says you need to be in a nursing home rehab facility for 21 days, but the model has predicted you only need 16 days, then the insurer will deny coverage for those additional five days your doctor ordered. Again, it's alleged that this has resulted in denial of care. The bias in this case comes because the model was used in an unintended context, which then resulted in bias against patients with higher needs.

Speaker 3:

Very helpful. Thank you, Sam. So one of the things that the Federal Register notice talks a lot about is proxies. Proxies come into play when deciding what the risk is and looking for risk. Chris, can you explain this notion of proxies? What does that mean, and what's its relevance here?

Speaker 4:

Sure. A proxy is effectively a variable, or a set of multiple variables, that is closely enough correlated with a particular group identity, say a racial group or a socioeconomic group, that by using those variables you can approximately predict an individual's membership in that group. By using those variables, you have a pretty good sense, with some high level of confidence, of whether this is somebody who's in an underrepresented racial minority or a low-income group, things like that. As an example, one that comes up quite a bit is zip code. Zip codes are historically segregated by race and by income, so zip code as a variable becomes a proxy for those different groups. If you are building one of these AI-driven tools and you try to take out the group identifications, the variables that would directly say, hey, this is somebody who belongs to this racial minority or has these characteristics, but zip code is still in the set of variables, it becomes an approximate indication of somebody being in one of the groups you're worried about being disadvantaged by the model. But it's not as clear and clean as that. Brad, in our discussion at the AHLA conference, you mentioned that there have been studies showing that with just six clinical indicators you can, with fairly high confidence, predict the race of a patient, which is kind of an amazing thing when you think about it. What makes that really difficult is that those clinical indicators are probably very important for determining different diagnoses or risks of adverse outcomes, things like that. So it's not as easy as saying, let's just wipe out all our proxies and not use them. They are often very important, critical information for these tools to be able to use.
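
To make the proxy check concrete, here is a minimal Python sketch, on synthetic data, of one common test: try to predict the protected attribute from the features that will feed the clinical model. The feature names and simulated relationships are invented for illustration; in practice the check would run on an institution's own training data, and a cross-validated AUC well above 0.5 suggests the features jointly act as a proxy.

```python
# Sketch of a proxy check: can the protected attribute be predicted from the
# candidate model features? All data and feature names here are synthetic.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n = 2000
group = rng.integers(0, 2, n)  # stand-in for a protected attribute (0/1)

# Candidate features, shifted slightly by group membership, mimicking how zip
# code or clinical indicators can encode group identity without naming it.
features = pd.DataFrame({
    "neighborhood_income_index": rng.normal(50 + 8 * group, 10, n),
    "heart_rate": rng.normal(78 + 2 * group, 11, n),
    "creatinine": rng.normal(1.0 + 0.1 * group, 0.3, n),
})

# If this AUC is well above 0.5, the features jointly act as a proxy for the
# protected attribute even though the attribute itself is not a feature.
clf = LogisticRegression(max_iter=1000)
auc = cross_val_score(clf, features, group, cv=5, scoring="roc_auc").mean()
print(f"cross-validated proxy AUC: {auc:.2f}")
```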

Speaker 3:

Yeah, what you're referencing is actually just vital signs, things like heart rate: six vital signs, and you can estimate someone's race. So it sounds to me like it's kind of tough to discern where the risk is, because an awful lot of algorithms might use not just demographic information but vital signs. So Sam, when you're faced with that, and you're a healthcare provider, a hospital, for example, and you put into place a compliance program, and the compliance program says you need to stratify by risk so that you can focus on the higher risk and spend less time on the lower risk, how do you do that in this context? How do you stratify the algorithms that you might be using?

Speaker 5:

That's a great question, and it's not an easy one either. There are several ways to classify risk, specifically risk of bias, or the overall risk of an AI system more broadly. There's the EU AI Act, for example, which defines minimal, limited, high, and prohibited risk levels for AI systems; that's one scheme a developer or user of these systems could adopt. We have a methodology we've developed, which we call our bias risk matrix, and we use it with our clients to determine the risk of bias for a particular algorithm or AI system. We've used it for a variety of client systems in a variety of industries, not just healthcare. As part of this bias risk matrix, we consider several elements of the inputs, the modeling choices, and the outputs of the system when determining the level of bias risk it poses. For example, we consider the appropriateness of the training data. We consider this because the likelihood of bias increases when the training data is fairly homogeneous, so you're not getting much diversity of representation, or when it doesn't represent the population of use; say it was trained on elderly patients and you're using it on people in their forties. There's also a question about data appropriateness when there's high potential for the presence of historical biases. That could include proxy variables such as the zip code Chris was talking about, but it could also include historical biases in medicine, such as race or gender bias in pain management. We also consider the level of complexity of the algorithm itself: the more "black box" a model is, the more bias risk there is, because it can pick up on patterns that are unseen by humans. And having a human in the loop is a really important consideration in determining the bias risk posed by a system. When humans have less or no control over the final decision, there's a higher risk of bias, because a human expert can't step in and the scale of any bias increases greatly. These are just a few of the several factors we consider when we're looking at how to assess the unintended bias risks of an algorithm.
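
As a rough illustration of the kind of stratification Sam describes, here is a toy scoring function over a few of the factors she lists: training-data appropriateness, potential for historical bias, model opacity, and human oversight. The factor names, weights, and thresholds are invented for this sketch and are not the actual bias risk matrix referenced above.

```python
# Illustrative only: a toy bias-risk tiering function. Weights and cutoffs are
# assumptions for demonstration, not a validated or published methodology.
from dataclasses import dataclass

@dataclass
class AlgorithmProfile:
    training_data_representative: bool  # does training data match the population of use?
    historical_bias_potential: bool     # e.g., proxies like zip code, biased labels
    black_box_model: bool               # low interpretability (e.g., deep nets)
    human_in_the_loop: bool             # can a clinician override the output?

def bias_risk_level(profile: AlgorithmProfile) -> str:
    score = 0
    score += 0 if profile.training_data_representative else 2
    score += 2 if profile.historical_bias_potential else 0
    score += 1 if profile.black_box_model else 0
    score += 0 if profile.human_in_the_loop else 2
    if score >= 5:
        return "high"
    if score >= 3:
        return "moderate"
    return "low"

# Hypothetical example: a sepsis model trained at a single academic center
# that uses prior utilization as a feature.
sepsis_model = AlgorithmProfile(
    training_data_representative=False,
    historical_bias_potential=True,
    black_box_model=True,
    human_in_the_loop=True,
)
print(bias_risk_level(sepsis_model))  # -> "high"
```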

Speaker 3:

Thanks, Sam. So I mentioned earlier that this new rule requires you to be vigilant, and the preamble says you could read the popular media and the scientific media and follow federal regulatory developments. These are all things, honestly, that an English major could do; it's a literature search, right? There's lots of different literature on this topic. But Chris, as a data scientist, are there things that you would do inside an institution? Can you test algorithms? Can you test them for bias? What tools are in your tool bag to look for biased algorithms, or potentially biased algorithms?

Speaker 4:

I would actually step back first, before talking about the technical tools, and talk about some of the governance issues that go along with this. In addition to going back and doing the literature review, I think it's really important to have different voices weighing in on these assessments of the risk of bias. That includes clinicians as well as data scientists, patient advocates, and, of course, members of the historically disadvantaged groups that may, in the end, be the ones facing the negative impacts of potential bias in these tools. Having them involved, along with the literature review, helps you start to detail where bias might be coming in, in places that might not immediately come to mind for me as a data scientist. I think one of the big risks in all of this is letting the data scientists alone determine how and where we're going to look for bias. With that information, we can start to point to the different places in the model lifecycle where we can look for bias. The ability to do that will differ from case to case, but we can go back to the source data, the training data being used for the models, and evaluate whether we have sufficient representation of different groups within that data. We can also measure the relative performance of these models: if we're talking about a predictive model, how accurate is it in predicting diagnoses, or whatever the target variables are, for different groups? And then look at the ratios between those: how much more or less accurate are we for one group versus another? I think it's also really important to put all of this in the overall care context, because ultimately the measure of equity is going to be based on healthcare outcomes. So while we often focus on the measures of bias for a given model, we also need to be looking at measures of equity in the outcomes. That includes asking whether, within our patient population, we can establish baseline levels of equity: how well are we doing right now in treating these different groups equitably and fairly, in terms of the actual treatments they're receiving as well as the health outcomes? And then, as you're rolling out these tools, being really vigilant about measuring and monitoring over time the relative performance of the clinicians when they're using these tools, how that's impacting them, and ultimately the patient outcomes.
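
A minimal sketch of the relative-performance check Chris describes: compute a metric per demographic group and compare the ratio between the worst- and best-performing groups. The data here are synthetic placeholders, and the choice of metrics and of any acceptable ratio would depend on the clinical context.

```python
# Synthetic example of comparing model performance across groups. In practice
# y_true/y_pred would come from a validation set linked to a demographic
# attribute; groups and labels here are placeholders.
import pandas as pd
from sklearn.metrics import accuracy_score, recall_score

results = pd.DataFrame({
    "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 0, 1, 0, 1, 0, 1, 0],
    "y_pred": [1, 0, 0, 0, 1, 0, 0, 1],
})

rows = []
for name, g in results.groupby("group"):
    rows.append({
        "group": name,
        "n": len(g),
        "accuracy": accuracy_score(g["y_true"], g["y_pred"]),
        "sensitivity": recall_score(g["y_true"], g["y_pred"], zero_division=0),
    })
metrics = pd.DataFrame(rows).set_index("group")
print(metrics)

# Ratio of worst- to best-performing group; values far below 1.0 flag a
# disparity worth investigating in its clinical context.
print("accuracy ratio:", round(metrics["accuracy"].min() / metrics["accuracy"].max(), 2))
```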

Speaker 3:

Well, Chris got into my next topic just a little bit, which is that most of our audience, I think, are probably attorneys, and if they're healthcare attorneys, they almost certainly are familiar with compliance programs. Compliance programs are very important in the healthcare industry because you have so many complex regulations and so forth. But let's talk about what's maybe unique about a compliance program that is focused on algorithms. Sam, are there aspects of a compliance program that really need to be tweaked, or adapted, or tailored to the unique needs of overseeing this risk of bias?

Speaker 5:

Yeah, absolutely. The risk of bias, as you mentioned with the NIST AI Risk Management Framework, is part of overall risk management and part of an overall AI governance and compliance system. But one thing that is unique is the amount of data that is needed to test for bias. So one thing that needs to happen as part of compliance is having conversations around what biases the organization needs to test for legally, and also wants to test for from a business perspective or for its customer base. In the case of Section 1557, it's very explicit about what types of discrimination are prohibited: race, color, national origin, sex, age, and disability are all specifically called out. Therefore, to test for bias and discrimination in those categories, you have to have data in those categories as well. And because this requires collecting very sensitive data, there should also be governance policies around access to that data and around linkages between datasets. For example, it might be appropriate to keep the protected class information siloed from model developers, so they can't use it in model and algorithm development, while making sure it's available to model testers when they're testing for bias. There may also be regulations around privacy and protected health information, so you may not be able to store protected class information alongside PII or PHI. That should also be part of compliance. There are a lot of pieces that go into compliance, but one thing I think is a little different for bias specifically is the periodic testing and monitoring. Ongoing monitoring is not novel in compliance, but with AI it's kind of a novel concept for most people. It's important to note that one bias test at one point in time is not enough. There need to be processes in place to monitor model performance over time, as well as triggers which, when tripped, require additional testing. There also need to be performance thresholds determined, and conversations around what actions to take if those thresholds are exceeded, so that if continuous testing shows an uptick in bias, for example, there are discrete actions the organization will take to mitigate it, or in extreme cases they may even decommission the use of the system entirely. So there's a lot to consider here with the bias of AI systems.
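
The periodic testing and thresholds Sam mentions might look something like the sketch below: a recurring job recomputes a group-to-group metric ratio on recent production data and escalates when it crosses a threshold. The metric, the 0.80 threshold, and the logging-based alert are placeholder assumptions, not prescribed values.

```python
# Sketch of ongoing bias monitoring with a threshold trigger. The metric and
# threshold are illustrative; a real program would define them in governance
# policies and route alerts to the responsible committee.
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("bias_monitor")

DISPARITY_THRESHOLD = 0.80  # minimum acceptable worst-to-best group metric ratio

def periodic_bias_check(metric_by_group: dict[str, float]) -> None:
    worst, best = min(metric_by_group.values()), max(metric_by_group.values())
    ratio = worst / best if best > 0 else 0.0
    logger.info("group metric ratio this period: %.2f", ratio)
    if ratio < DISPARITY_THRESHOLD:
        # In practice this would open a ticket, notify the governance committee,
        # and potentially pause the model pending additional testing.
        logger.warning("Disparity threshold exceeded; escalate for additional testing.")

# Hypothetical monthly run using sensitivity computed per group in production.
periodic_bias_check({"group_A": 0.91, "group_B": 0.69})
```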

Speaker 3:

Thank you, Sam. Chris, Sam mentioned this NIST risk management framework. Can you go a little bit deeper into that? What is it, and is it useful? I mean, is it something that organizations can actually employ?

Speaker 4:

Sure, absolutely. The NIST AI RMF is a document, plus a set of related guidelines for implementation, that is focused on the risk of using and deploying AI more broadly. That means, number one, it's cross-industry, and it's also focused on a broader conception of risk that goes beyond just bias to look at the general safety of AI systems, reliability, privacy concerns, things like that. So it's a much broader framework. It's intended to provide a common language around AI risk and AI risk mitigation, and to help with setting up the governance and processes around that. It's entirely voluntary; it's not a prescriptive framework. But it does provide essentially a menu of different risk mitigation practices that you can choose from and form your own processes around. It's useful in the sense that it provides that common language and a starting point, but actually implementing it requires a good amount of work to go through the whole set of potential processes and select what's relevant for your particular organization and application. So I think it's better thought of as a governance framework rather than a true process and implementation framework. There are other frameworks out there. One that I've referenced a few times, which I actually first learned about at the AHLA conference, having been published shortly before it, is the Health Equity Across the AI Lifecycle, or HEAL, framework. That was put out by the Health AI Partnership, and it is much more targeted toward healthcare and life science applications. In particular, it really zeros in on bias and equity concerns. It's a much more prescriptive framework: it provides direct guidance on which stakeholders need to be involved in different steps and decisions along the way, and it gives you a fairly direct set of questions that need to be answered, analyses that need to be performed, and discussions that need to be had. Those are really relevant both to the builders of AI healthcare systems and to the practitioners, the clinicians, and the organizations using these tools. It's aligned with the NIST RMF, but it takes that down a level and lays out a roadmap of exactly how you do this in a way that is compliant with the RMF, while giving you a much more direct path for implementing it.

Speaker 3:

Very interesting and useful. Thank you, Chris. Sam, we talk a lot about mitigating risk; that's what the rule talks about, mitigating risk. Why doesn't the rule just say, you know, get rid of risk? Is it possible to get rid of risk?

Speaker 5:

No, it's not possible to get rid of risk. And specifically in this context, we're talking about risk of bias, right? I don't believe it's ever going to be possible to completely eliminate all bias from a system, whether that's a human system or an AI system. But we do know a lot about the risks involved around biases in AI systems, and we're learning more every day. So by mitigating, what we're essentially saying is: we think we know what the problem is, and we're trying to fix it. It's a harm reduction exercise. We focus on reduction of harm, or on mitigating bias, because ultimately these models, these AI systems, are built on data, and data is ultimately a representation and a simplification of reality. I don't believe there's ever going to be a way to reduce reality down to data in a way that's completely bias-free, but again, we can mitigate and we can reduce harm.

Speaker 4:

Yeah, if I can speak to that a little bit as well, because I think this is a really important point and I a hundred percent agree with Sam: it's just not possible practically, and in a lot of cases not even theoretically possible, to eliminate bias. So it's important to remember that the ultimate goal is to reduce inequity in healthcare outcomes. Looking at ways to mitigate bias may mean you're not able to significantly eliminate bias in the models and tools you're using, but are there ways you can reduce the negative impact on the eventual healthcare outcomes and move toward more equitable outcomes? It's also important to put this in historical context. You can spend a lot of time trying to eliminate bias, and at some point you reach diminishing returns; these are tools that are built and intended to benefit patients. If we can set up risk mitigation and bias mitigation processes that allow us to reduce inequity as we actually improve healthcare outcomes across groups, and if we can reduce bias and inequity with these tools compared to the baseline inequities that are already in the healthcare system because of all the historical biases we're talking about, then these mitigation strategies move us in the right direction and improve on the inequities that are already there. That's also part of what you see when you look at, for example, the risk mitigation frameworks and the Health Equity Across the AI Lifecycle framework: they're really focused on whether we can improve and reduce inequity, not on forcing ourselves to eliminate it completely.

Speaker 3:

My last question builds on this idea of mitigating risk. People talk a lot about explainability and transparency as a way to reduce general risk, other forms of risk. Are those techniques useful in this context? Can explainability and transparency be used to reduce the risk of bias? Chris, do you want to start?

Speaker 4:

Sure, yeah, absolutely, as one way of reducing the risk of bias. When we talk about explainability and interpretability, what that really means in the domain of AI is that you have models where, in general, you have some ability to see how the model is making decisions and what information is being used to make them, and then, for a specific patient or a specific instance, what factors drove a particular prediction or decision. As we've moved forward and started to implement much more complex models, and a lot of people are familiar with large language models and neural networks, those are very much black box models in the sense that they are not transparent and not interpretable by default. What that means is that we can't really see how and why the model is making a certain decision, whatever decision it is. If it is possible, with a particular AI tool, to provide not just a recommendation, a prediction, or a decision, but also the factors that were used and why the model is making that decision, then that information, provided to a clinician, allows them to be more aware of where biases may be coming into play and what they need to do to mitigate the impact on the patient. Transparency is a related concept, where we're talking about transparency in terms of what data was used to train the model, what analyses were done to assess bias, and where a risk of bias was identified. Transparency in the whole process, so that what we know about the model is communicated from the developers of these tools down to the people responsible for assessing their fitness for use in the clinical setting, and then ultimately to the clinicians and other end users, is really critical to making sure they understand what the limitations of these tools are from a bias perspective. So yes, they help, but as Sam was alluding to earlier, they're not an ultimate solution that eliminates bias so much as one of the ways we can help to mitigate it.
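
One hedged illustration of the per-patient explainability Chris describes, using an inherently interpretable model: for a logistic regression, each feature's contribution to a single patient's log-odds is just the coefficient times the standardized feature value, so the drivers of a prediction can be listed for a clinician. Data and feature names are synthetic; for true black-box models, analogous per-prediction attributions are typically obtained with tools such as SHAP rather than computed this directly.

```python
# Sketch: per-patient attribution for an interpretable model. Synthetic data;
# real features and labels would come from the clinical dataset.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "age": rng.integers(20, 90, 500),
    "heart_rate": rng.normal(80, 12, 500),
    "creatinine": rng.normal(1.0, 0.3, 500),
})
y = (X["creatinine"] + 0.01 * X["age"] + rng.normal(0, 0.3, 500) > 1.8).astype(int)

scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

# Per-feature log-odds contributions for one patient, relative to an average
# patient, sorted by magnitude so the biggest drivers appear first.
patient = scaler.transform(X.iloc[[0]])[0]
contributions = model.coef_[0] * patient
for feature, value in sorted(zip(X.columns, contributions), key=lambda t: -abs(t[1])):
    print(f"{feature}: {value:+.3f}")
```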

Speaker 3:

Sam, do you have any particular examples, maybe just to help the audience?

Speaker 5:

Yeah, absolutely. I totally agree with everything Chris just said. Explainability is an element of transparency, and transparency and explainability are very important goals for AI systems. Working toward transparency and explainability can help mitigate biases, and in doing so it really forces developers to consider different perspectives. One example that comes to mind: a developer who is striving for transparency may want to create a very detailed datasheet that describes the model's training data in great detail and summarizes the racial makeup of the training data, the gender makeup, intersectionalities, and so forth. In creating this transparency tool, the developer may discover that women are underrepresented in the training data, or that a particular subset of women is underrepresented. That could lead them to collect additional data, or to downsample the overrepresented groups' observations, which ultimately leads to a final model that is more likely to perform equally well for men and for women than it would have been if they hadn't gone through the exercise of creating the transparency tool.
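
A small sketch of the kind of datasheet summary Sam describes: tabulating the demographic composition of the training data, including intersections, so underrepresentation is visible before a model ships. The demographic columns, simulated proportions, and the five percent floor are illustrative assumptions.

```python
# Sketch of a training-data composition summary for a transparency datasheet.
# The data are simulated; a real datasheet would use the actual training set.
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
train = pd.DataFrame({
    "sex": rng.choice(["female", "male"], size=1000, p=[0.35, 0.65]),
    "race": rng.choice(["Asian", "Black", "White"], size=1000, p=[0.10, 0.15, 0.75]),
})

# Marginal composition by sex and by race.
print(train["sex"].value_counts(normalize=True).round(3))
print(train["race"].value_counts(normalize=True).round(3))

# Intersectional composition (e.g., Black women), where gaps often hide.
composition = pd.crosstab(train["race"], train["sex"], normalize="all").round(3)
print(composition)

# Flag intersectional cells that fall below a chosen representation floor.
floor = 0.05  # illustrative threshold, not a regulatory number
cells = composition.stack()
print("Below the floor:\n", cells[cells < floor])
```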

Speaker 3:

I want to thank you both, Sam and Chris, for your comments today. I enjoyed our program two months ago in May in Chicago, and I enjoyed today as well, getting to hear your thoughts on how to manage algorithmic bias. I'm confident that the audience got some valuable insights. So my thanks to both of you, and I look forward to doing this again sometime. Take care.

Speaker 5:

Thank you. Thanks, Brad.

Speaker 2:

Thank you for listening. If you enjoyed this episode, be sure to subscribe to AHLA's Speaking of Health Law wherever you get your podcasts. To learn more about AHLA and the educational resources available to the health law community, visit americanhealthlaw.org.