Solving counterfactual explanations for decision trees and forests

December 14, 2022

Miguel Ángel Carreira Perpiñán



Time:   11:00am
Location:   Zoom3 - https://zoom.us/j/3911012202 (pass: @s3)

A counterfactual explanation addresses the problem of finding a minimum change to a given input instance that results in a desired prediction under a given, trained machine learning model. This is useful for extracting actionable knowledge such as “reducing your weight by 10 kg will reduce your risk of stroke by 80%” (regression) or “you will be eligible for the loan if you increase your annual salary by $10k” (classification). Counterfactual explanations are closely related to adversarial examples, where one seeks to trick a model into misclassifying a given instance, and to model verification, where one seeks to prove (or find a counterexample to) a given property of a model.

A counterfactual explanation can be formulated as an optimization problem: minimize the change (in some distance) to the input instance subject to the model producing the desired prediction. Here we focus on classification and regression models consisting of a decision tree or forest, which are widely used in practice when high interpretability (a single tree) or high accuracy (a forest) is desired. For a single tree (axis-aligned or oblique), we show that the problem can be solved exactly and efficiently. For an ensemble of trees (a forest), such as random forests or gradient boosted trees, the problem is much harder and must be solved approximately. We give two algorithms that scale to large forests: one is very fast and also encourages the solution to be somewhat realistic; the other is slower but more accurate.
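To illustrate why the single-tree case is tractable, here is a minimal sketch (the function name and the leaf data layout are my assumptions, not the talk's actual method or code): each leaf of an axis-aligned tree covers a box in input space, so an exact counterfactual can be found by projecting the instance onto every leaf box whose prediction matches the target and keeping the projection with the smallest change.

```python
import numpy as np

def closest_counterfactual(x, leaves, target):
    """Exact counterfactual for an axis-aligned tree by leaf enumeration.

    leaves: list of ((lo, hi), prediction) pairs, one per leaf, where
    lo/hi are per-feature bounds of the box that the leaf covers.
    """
    best, best_dist = None, np.inf
    for (lo, hi), pred in leaves:
        if pred != target:
            continue                      # leaf does not give the desired prediction
        z = np.clip(x, lo, hi)            # L2 projection of x onto the leaf's box
        d = np.linalg.norm(z - x)         # size of the required change
        if d < best_dist:
            best, best_dist = z, d
    return best, best_dist

# Toy tree over two features in [0, 1]^2, with three leaves:
leaves = [
    (([0.0, 0.0], [0.5, 1.0]), 0),
    (([0.5, 0.0], [1.0, 0.5]), 0),
    (([0.5, 0.5], [1.0, 1.0]), 1),
]
x = np.array([0.2, 0.3])
cf, dist = closest_counterfactual(x, leaves, 1)  # nearest point predicted as class 1
```

The number of leaves is linear in tree size, so this enumeration is efficient; for a forest, the target class is determined by the joint vote of all trees over exponentially many leaf combinations, which is why the ensemble problem is much harder.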

This is joint work with my student Suryabhan Singh Hada.