Sunday, December 22, 2024

Machine learning models cannot be left on auto-pilot

I recently hosted a panel at an insurtech investment conference on the deployment of machine learning (ML) and artificial intelligence (AI) in the underwriting process. The panelists were data scientists and thought leaders from both insurtech startups and a large, established insurer. I was expecting a lively discussion about the transformative potential and technical challenges of deploying AI/ML in underwriting models. What transpired was very different: most questions from the audience concerned the ethical pitfalls of relying on ML to decide whether to offer insurance to an individual, and at what price.

There is enormous potential in applying ML to insurance underwriting. Large insurers hold decades of data on customers and claims, and there are undoubtedly new correlations buried in this data that only the analytical power of ML can uncover. In addition, enormous amounts of new data, such as that generated by telematics devices in cars, can be analyzed using ML to find correlations with significant predictive power when answering the question, “what is the likelihood this person will have a claim?” Carriers that are early to exploit the potential of ML in underwriting should be able to price more accurately, driving both higher top-line growth and reduced claims. But there is a risk that ML models, if not continuously monitored, gradually develop unintended biases that produce unethical outcomes.
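To make the idea concrete, here is a minimal sketch of a claim-likelihood model of the kind described above. Everything in it is an assumption for illustration: the data is synthetic, and the telematics feature names (hard-braking rate, night-driving share, average speed over the limit) are invented, not any carrier’s actual schema.

```python
# Illustrative sketch: scoring claim likelihood from telematics-style
# features. All data is synthetic; feature names are invented.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical telematics features: hard-braking events per 100 miles,
# share of miles driven at night, average mph over the speed limit.
X = np.column_stack([
    rng.poisson(2.0, n),
    rng.beta(2, 8, n),
    rng.gamma(2.0, 2.0, n),
])

# Synthetic target: claim probability rises with riskier driving.
logits = -3.0 + 0.3 * X[:, 0] + 2.0 * X[:, 1] + 0.15 * X[:, 2]
y = rng.random(n) < 1 / (1 + np.exp(-logits))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# AUC measures how well the model ranks policyholders by claim risk.
print("test AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```

The same pattern extends to real feature tables; the point is simply that the model produces a ranked claim probability that can feed into pricing.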

Black Box Models

In 2019, Apple and Goldman Sachs partnered to launch an Apple-branded credit card. The two companies relied on “black box” machine learning models to make credit underwriting decisions, and, unbeknownst to the teams at Apple and Goldman, the gender of the applicant seemed to have a significant impact on the credit limits offered to customers. This only came to light when several men complained on social media that they had been offered one credit limit while their wives (with whom they shared all income and household assets) were offered limits up to 95% lower in some cases. The incident was made more embarrassing when Steve Wozniak, a co-founder of Apple, shared via Twitter that his wife had been offered worse commercial terms than him.

Similar accusations have previously been made against UnitedHealth Group. The health insurer used its “Impact Pro” algorithm to determine which patients should be offered more complex and expensive care to prevent further medical claims down the line. The program seemed to favor white patients over black patients when allocating this expensive treatment, and New York regulators insisted that the insurer stop using Impact Pro until it could demonstrate the absence of bias.

In each case, it is unclear, and indeed unlikely, that the model was explicitly using gender or ethnicity to inform its decisions. There was no simple rule that said, “if male, then higher credit limit.” Rather, the challenge arises when a seemingly innocuous factor used by the algorithm happens to be highly correlated with a protected characteristic like gender or race. For example, an algorithm trained on historical data could look at customer names and offer better underwriting terms to those with names traditionally belonging to white men, while offering higher rates to black or Latina women. Similarly, certain postal codes are populated predominantly by white residents or by ethnic minorities, and an unmonitored model could use this factor to offer different terms even though it has no access to the customer’s ethnicity when making a decision. This failure mode is often called proxy discrimination: the model never sees the protected attribute, but a correlated feature lets the bias through.
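Here is a minimal sketch of that proxy effect using entirely synthetic data. The “postal code” is a made-up binary feature that correlates 90% with a protected group; the model is trained only on income and postal code, never on the group itself.

```python
# Illustrative sketch of proxy discrimination. All data is synthetic;
# "postal_code" is an invented binary proxy, not real geographic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 20_000

# Protected attribute: never shown to the model.
group = rng.integers(0, 2, n)

# A "neutral" feature that matches group membership 90% of the time.
postal_code = np.where(rng.random(n) < 0.9, group, 1 - group)

# Historical data encodes inequity: group 1 has higher incomes and was
# also favored directly in past approval decisions.
income = rng.normal(50 + 5 * group, 10, n)
approved = income + 10 * group + rng.normal(0, 5, n) > 60

# Train only on the "innocuous" features: income and postal code.
X = np.column_stack([income, postal_code])
model = LogisticRegression(max_iter=1000).fit(X, approved)
preds = model.predict(X)

# The model reproduces the historical bias through the proxy.
rate0 = preds[group == 0].mean()
rate1 = preds[group == 1].mean()
print(f"approval rate, group 0: {rate0:.2f}")
print(f"approval rate, group 1: {rate1:.2f}")
print(f"disparate impact ratio: {rate0 / rate1:.2f}")
```

Even with the protected attribute excluded from the feature set, the approval rates diverge sharply. A disparate impact ratio below roughly 0.8, the “four-fifths” rule of thumb borrowed from US employment law, is a common red flag.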

Regulators on High Alert

Regulators and consumer advocacy groups have begun to take notice. In the UK, Citizens Advice has recently called upon financial regulators to investigate discriminatory pricing for ethnic minorities. In the US, the National Association of Insurance Commissioners holds regular working groups on Big Data and Artificial Intelligence, with a particular focus on model governance.

For insurers looking to deploy ML, it will be imperative to adopt systems that provide control and transparency over decision-making processes. A new wave of insurtech and fintech startups, such as Monitaur.ai, offers automated auditing of ML models to insurers. Other entrepreneurs are building underwriting and pricing models that use ML but are transparent by design, like actuarial software company Akur8.
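In its simplest form, ongoing auditing of this kind means recomputing a fairness metric on each batch of decisions a model makes and alerting when it drifts out of bounds. The sketch below is illustrative only: the function names, thresholds, and data are invented, and it does not represent any vendor’s actual product or API.

```python
# Illustrative sketch of a recurring fairness audit on batches of
# model decisions. Names, thresholds, and data are invented.
import numpy as np

def disparate_impact(decisions: np.ndarray, group: np.ndarray) -> float:
    """Approval rate of group 0 divided by approval rate of group 1."""
    return decisions[group == 0].mean() / decisions[group == 1].mean()

def audit_batch(decisions, group, lower=0.8):
    """Flag a batch whose disparate impact ratio falls outside bounds."""
    ratio = disparate_impact(np.asarray(decisions), np.asarray(group))
    if not lower <= ratio <= 1 / lower:
        print(f"ALERT: disparate impact ratio {ratio:.2f} "
              f"outside [{lower:.2f}, {1 / lower:.2f}]")
    return ratio

# Example: one weekly batch of decisions from a model that has drifted,
# approving group 1 noticeably more often than group 0.
rng = np.random.default_rng(2)
group = rng.integers(0, 2, 500)
decisions = rng.random(500) < np.where(group == 0, 0.55, 0.75)
audit_batch(decisions, group)
```

A production system would track these metrics over time, across customer segments, and against the model’s inputs as well as its outputs, but the principle is the same: the check runs on every batch, not once at launch.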

The application of machine learning and AI in insurance underwriting offers enormous potential. Elsewhere in the industry, this transformative technology can deliver greater automation and an improved customer experience.

Insurers face a commercial imperative to adopt ML but must be prepared to account for the decisions they make.
