Building trust in ML models

ML models are getting harder and harder to explain to end users. Instead of explaining the model itself, you can provide a simpler mental model and a UI that helps the end user verify that your complex model matches that mental model.


Overview

Trust in ML models, particularly in healthcare, is a major issue right now. Many clinicians do not trust "black boxes" that tell them what to do (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334754/).

A common approach today is to walk the audience through the detailed workings of the model via complicated explanations. Unfortunately, the sheer complexity of those explanations often leaves the audience more confused than before.

In addition, ML models are getting harder and harder to explain: newer models use deep learning, hundreds of features, and ensemble techniques that cannot be explained at a global population level as simply as a linear model can.

Instead, we can offer a simpler mental model (https://en.wikipedia.org/wiki/Mental_model) that is easier to understand and then provide tools that let users try various inputs and confirm that the outputs of the real model match the outputs their mental model expects.

Let’s start with an example of how we build trust in other complex systems…


Driving an electric car for the first time

A couple of years ago, I was in New Orleans and decided to rent a car on Turo so my family and I could tour the plantations. I saw a Tesla listed and decided to rent it, even though I had never driven an electric car before.


Photo by Bram Van Oost on Unsplash


Then I “tried” out the car against my mental model of how a car should work…

Pressing the accelerator makes the car go, and pressing harder makes it go faster… check!

Pressing the brake pedal stops the car… check!

Turning the steering wheel turns the car so I can make turns onto other roads… check!

A number on the dash tells me how many miles I have left before the car stops running… check!

Based on these checks, I trusted the car to drive my family around successfully for a few days.

Of course, I had no understanding of the actual workings of the Tesla's motor and drivetrain, and I'm pretty sure I wouldn't understand them even if someone tried to explain them to me.


(Source: https://techau.com.au/sandy-munro-will-continue-youtube-teardowns-after-tesla-model-y-proves-successful-with-4-6m-views/)


In effect, what I did was create a much simpler mental model and confirm that my inputs produced the expected outputs.




Let’s see an example of how the same process applies to a doctor looking at an ML model…


Convincing a doctor to trust our ML model


A few years ago, I sat down with a doctor to convince him to use our model for identifying care gaps. He immediately left the room without saying a word. I feared that all hope was lost. A few minutes later he re-appeared with printouts of a few patient charts. He then recited a patient’s MRN and asked me to show the care gaps we had identified for that person. This went on for about five patients. Then he cautiously said, “I guess we can give it a try”.

A few months later, once we had established a relationship, I asked him what he had been doing that first day I met him. He said he had chosen one patient with serious health problems, one patient who was fairly healthy, a woman who had breast cancer, and one patient he had seen recently whose care gaps he had mostly resolved. In fact, he was checking whether our ML model fit his mental model of care gaps.

This is, of course, no different from what I did with the Tesla in New Orleans. He was trying out some known inputs and checking whether he got the outputs he expected based on his mental model.



ML models are getting harder to explain

As ML models become more and more complex, explaining them to a non-data scientist gets harder and harder. Explaining a simple linear model is possible, but explaining a generalized linear model, a gradient-boosted model, an ensemble model, or a deep learning model is extremely hard.

Here’s an example of a LIME explanation of a model:


(Source: https://www.kdnuggets.com/2019/12/interpretability-part-3-lime-shap.html)
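For readers curious how such an explanation is produced, here is a minimal sketch using the lime package against a hypothetical scikit-learn-style classifier; the model, training data, and feature names below are assumptions for illustration, not the model from the image above.

# Minimal sketch: generating a LIME explanation for a single prediction.
# `model`, `X_train`, and `feature_names` are hypothetical placeholders.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    training_data=np.asarray(X_train),   # background data LIME perturbs around
    feature_names=feature_names,         # e.g. ["age", "sex", "months_since_mammogram", ...]
    class_names=["no_care_gap", "care_gap"],
    mode="classification",
)

# Explain one patient's prediction in terms of its top five features.
explanation = explainer.explain_instance(
    data_row=np.asarray(X_train)[0],
    predict_fn=model.predict_proba,
    num_features=5,
)
print(explanation.as_list())  # [(feature condition, local weight), ...]

Note that the weights LIME returns are local to that one record, which is exactly why such explanations rarely generalize to the whole population.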

In fact, attempts to explain models using SHAP, LIME, or similar techniques can backfire because those explanations tend to hold only for subsets of the population rather than the whole population. (In a previous company, Iman Haji, Alvin Henrick, and I did some work to generate explanations that apply to the whole population: https://github.com/clarifyhealth/transparency.)

Don’t get me wrong. Explanations in ML are great for data scientists to understand what their models are doing, to identify and remove bias, and to improve the accuracy of their models. Explanations can also be great to identify patterns that apply to subsets of the population. However, trying to use them to convince non-data scientists can be problematic.


A UI to explore the mental model

So, what kind of UI can you provide to enable end users to check that your real (incredibly complex) model matches the mental model?

First, the UI should allow end users to pick a sample record (e.g., a patient) of their choosing and see how that record's features and prediction differ from other records (both from the population averages and from another specific record). By letting them choose the two records, you enable them to pick records they are already familiar with and, hence, know what the prediction should be. For example, two patients they saw recently.
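As a rough sketch of what this first piece might look like under the hood, here is a minimal, hypothetical Python helper. It assumes a pandas DataFrame of features indexed by record ID and a fitted scikit-learn-style pipeline that can score those raw features; the function and column names are illustrative, not any specific product's API.

# Minimal sketch (hypothetical): compare one chosen record against another
# chosen record and against the population averages, plus the model's
# predicted risk for both records.
import pandas as pd

def compare_records(model, X: pd.DataFrame, record_id, other_id):
    # Side-by-side feature view: selected record, other record, population mean.
    features = pd.DataFrame({
        "selected_record": X.loc[record_id],
        "other_record": X.loc[other_id],
        "population_mean": X.mean(numeric_only=True),
    })

    # Predicted risk for the two chosen records, from the real model.
    risks = pd.Series(
        model.predict_proba(X.loc[[record_id, other_id]])[:, 1],
        index=["selected_record", "other_record"],
        name="predicted_risk",
    )
    return features, risks

# Example: let a clinician compare two patients they saw recently.
# features_table, risks = compare_records(model, X, "MRN-1234", "MRN-5678")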

Second, provide a “what-if” UI where they can manipulate features of an existing record and see the effect on the prediction. For example, if they change the sex of a known patient from female to male or increase the patient's age, they should see that the predicted risk of breast cancer drops in the former case and the predicted total costs go up in the latter.
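And here is a minimal sketch of the “what-if” piece under the same assumptions (the helper and the feature names "sex" and "age" are illustrative):

# Minimal sketch (hypothetical): re-score a known record after changing
# one or more of its features, and compare against the original prediction.
import pandas as pd

def what_if(model, X: pd.DataFrame, record_id, **changes):
    baseline = X.loc[[record_id]]           # one-row DataFrame for the known record
    modified = baseline.copy()
    for feature, value in changes.items():  # apply the user's hypothetical changes
        modified[feature] = value
    before = model.predict_proba(baseline)[0, 1]
    after = model.predict_proba(modified)[0, 1]
    return before, after

# Example: does the predicted breast cancer risk drop if the patient were male?
# before, after = what_if(model, X, "MRN-1234", sex="male")
# For a cost (regression) model, swap predict_proba for predict to check whether
# predicted total costs rise as age increases.

If the before/after numbers move the way the user's mental model says they should, trust grows; if they do not, you either have a modeling problem worth investigating or a mental model worth discussing.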

Just like in my Tesla and doctor examples above, this allows the end user to confirm that the real model behaves the same as their mental model.


Summary

The next time you need to convince someone to trust your ML model, instead of explaining its detailed inner workings, give them a simple mental model and a UI where they can easily check that the results of your real model match the results the mental model expects. Let them try a few inputs they already know, and give them the ability to try some “what-if” inputs. If they see the outputs they expect, they will tend to trust your model (more).

Marcus Turner

Technical Leader, Father, & Athlete: Over 2 decades of scaling teams and solving complex technical issues worldwide. In other words: A major People/Process/Technology Nerd.

2y

Thanks Imran Qureshi for sharing. Your simple diagram on the lack of mammograms (exacerbated by COVID) is a great use of a simple population health model on risk stratification! Continue doing great things with b.well Connected Health's data set (and hopefully federated ones in the future). Also, if you were serious ... I would be happy to explain the "inner workings" of an AC induction motor. 😉

Iman Haji

GenAI Tech Lead at Google

2y

Nice read, thanks for the mention Imran Qureshi!

Vasanth Thirugnanam

Associate Director Data Science at Johnson & Johnson Innovative Medicine

2y

Very simple and easy read! Thanks Imran.
