Invented by Piyush Gupta, Nikaash Puri, and Balaji Krishnamurthy, Adobe Inc.

The market for machine learning model interpretation is growing rapidly as businesses and organizations recognize the importance of understanding and explaining the decisions made by these complex algorithms. Machine learning models have become increasingly prevalent in industries including finance, healthcare, retail, and marketing, but their black-box nature has raised concerns about transparency, accountability, and potential bias.

Machine learning models learn patterns and make predictions from large amounts of data. While they can achieve remarkable accuracy, they often lack interpretability, making it difficult for humans to understand how and why particular decisions are made. This lack of transparency can be problematic, especially in high-stakes applications such as credit scoring, medical diagnosis, or autonomous vehicles. The market for machine learning model interpretation aims to address these challenges by developing tools and techniques that provide insight into the decision-making process of these models. Interpretability methods can help answer questions such as which features most influence the model's predictions, how different inputs affect the output, and whether the model is biased toward certain groups or variables.

One approach to model interpretation is feature importance analysis. This technique identifies the most influential features in the model's decision-making process. By understanding which variables have the greatest impact, businesses can gain insight into the factors driving their predictions and make informed decisions accordingly. Feature importance analysis can also help identify potential biases in the model, allowing organizations to address and mitigate them.

Another method for model interpretation is the use of rule extraction algorithms. These algorithms aim to extract human-readable rules from the black-box model, providing a more intuitive understanding of its decision-making process. This can be particularly useful in regulated industries where explainability is required, such as healthcare or finance. Rule extraction algorithms can also help identify cases where the model's predictions deviate from human intuition, enabling organizations to refine and improve their models.

The market for machine learning model interpretation is also driven by regulatory requirements. In some industries, such as finance and healthcare, regulations mandate that models be explainable and transparent. For example, the General Data Protection Regulation (GDPR) in the European Union requires organizations to provide individuals with explanations of automated decisions that significantly affect them. As a result, businesses are increasingly investing in interpretability tools and techniques to comply with these regulations and avoid potential legal and reputational risks.

The market is further fueled by the need for model validation and risk assessment. Organizations are increasingly aware of the risks of using black-box models without proper understanding and validation. Model interpretation tools can help assess the reliability and robustness of these models, ensuring that they are not making biased or unfair decisions. Several companies have emerged in this market, offering a range of interpretability tools and services.
These companies provide solutions that can be integrated into existing machine learning workflows, making it easier for businesses to interpret and explain their models. Some tools offer visualizations and dashboards that allow users to explore and understand the inner workings of their models, while others provide automated reports and explanations. In conclusion, the market for machine learning model interpretation is growing rapidly as businesses and organizations recognize the importance of understanding and explaining the decisions made by these complex algorithms. The need for transparency, accountability, and risk assessment is driving the demand for interpretability tools and techniques. As regulations become more stringent and the risks associated with black-box models become apparent, the market for machine learning model interpretation is expected to continue expanding in the coming years.
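
As a concrete illustration of the feature importance analysis described above, the sketch below uses scikit-learn's permutation importance on a synthetic data set. The library, data set, and model choice are assumptions made for the example; they are not tied to any particular vendor's interpretability product.

```python
# A minimal sketch of feature importance analysis, one of the interpretation
# approaches described above. scikit-learn and its permutation_importance
# utility are assumed tooling here, not part of the patented technique.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in accuracy:
# a large drop means the model relies heavily on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for idx in result.importances_mean.argsort()[::-1]:
    print(f"feature_{idx}: {result.importances_mean[idx]:.3f}")
```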

The Adobe Inc. invention works as follows

Techniques are disclosed for generating class-level rules that explain the global behavior of a machine learning model, such as a model that has been used to solve a classification problem. Each class-level rule is a logical statement that, when it holds true, predicts that an instance is a member of a particular class. These rules collectively represent the patterns that the machine learning model follows. The techniques are model-agnostic and explain model behavior by generating a set of logical rules that is easily parsed. They can be used for any application, but in certain embodiments they are best suited to interpreting models that perform classification, although other machine learning models can also benefit.

Background for Machine Learning Model Interpretation

Machine learning refers to techniques that automatically generate computer-executable instructions (i.e., learn them) without explicit programming. The process of machine learning involves creating a model that describes the relationship between known inputs and outputs; the model is then used to predict outputs for given inputs. In many cases the machine learning model is a "black box," meaning that it cannot be definitively interpreted into a set of rules describing all of its transfer characteristics. This is particularly true for more complex models, such as neural networks, random forests, and gradient-boosted trees. Interpretability is better with simpler models, such as logistic regression, linear regression, and decision trees, so there is a tradeoff between model accuracy and interpretability. A computer scientist can, in some cases, evaluate the accuracy of a relatively simple machine learning model by applying test data to the model and comparing its output with the expected results. It is difficult, however, to validate more complex models with high confidence using only test data, as it may not be feasible or even possible to analyze all possible input sequences. Model interpretation is therefore an important component of the validation process. Interpreting the behavior of a black-box machine learning model can be useful in understanding the model's global behavior. This understanding can provide insight into the data used to train the model and into the generalization ability of the rules it has learned. There is thus a need for improved techniques for interpreting machine learning models.

As mentioned above, interpreting the behavior of a black-box machine learning model can be useful for understanding how well it performs (for instance, for confirming that a classifier is correct). Interpretation is also useful for understanding how a model behaves at a global level. This understanding gives granular insight into the data that was used to train the model and into the generalization ability of the rules it has learned. To achieve this, the techniques described herein generate a set of rules from a collection of instance-level conditions. The rules describe the behavior of the machine learning model on a global scale, for example, of a model that has been applied to solve a classification problem. The rules collectively represent the patterns that the machine learning model follows and can be used to gain insight into its behavior. These techniques are model-agnostic and provide a simple way to explain model behavior by generating an easily parsed set of rules. The techniques can be used for many applications, but they are particularly useful for interpreting classification models. As will be appreciated, other machine learning models can also benefit.

For example, according to one specific embodiment, the system is configured or programmed to learn rules that explain the behavior of a classifier model. Each rule is distinct and can be expressed as, for instance, "IF C1 AND C2 AND . . . THEN predict class K," where each Ci is a condition on a feature value, such as a threshold comparison (e.g., "≥ 18").

Examples of use cases provide useful context for how the techniques described herein can be employed. As one example, suppose a machine learning model is used to predict the likelihood of a borrower defaulting on a loan. A rule explaining the behavior of the machine learning model could be something like: "If the borrower's annual income is below $20,000, and the borrower is married, then default is most likely." This rule is useful for several reasons. First, it allows the analyst or developer (generalized as "developer" going forward) to identify and then reject any spurious or incidental patterns in the training data that the model may have picked up on. Second, the rule allows the developer to gain insight into the problem domain. In the case of loan refusal, for example, by using the techniques to extract rules from a model trained on the last five years' loan-rejection data, the developer can gain deep insights into the patterns in the original data. By analyzing rules relating to geo-location, the developer can learn about implicit bias and its role in loan rejection. A machine learning application in medicine is another example: the techniques described herein can be used to extract rules that describe the behavior of a diagnostic model, and these rules could help a physician assess the generalizability of the model and the patterns it has learned from previous diagnoses. A further example involves data scientists and machine learning developers, for whom the techniques described herein can serve as a debugging method. The output rules explain the model's behavior and allow developers to assess the quality of what the model has learned and to take appropriate action. For example, appropriate actions include retraining the model if the output rules show that it was trained with the wrong data, or deploying the system if the output rules show that the model meets the developer's goals.
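
To make the loan-default rule above concrete, the following sketch shows one plausible way to represent and evaluate such a class-level rule in code. The Condition and Rule structures, field names, and threshold logic are illustrative assumptions, not the data structures used by the patented system.

```python
# Illustrative only: one way to represent a class-level rule such as
# "IF annual_income < $20,000 AND married THEN predict 'default'".
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class Condition:
    feature: str
    test: Callable[[Any], bool]       # e.g. lambda v: v < 20_000
    description: str

@dataclass
class Rule:
    conditions: List[Condition]
    predicted_class: str

    def covers(self, instance: Dict[str, Any]) -> bool:
        """True when every condition holds, i.e. the rule 'fires' for the instance."""
        return all(c.test(instance[c.feature]) for c in self.conditions)

loan_rule = Rule(
    conditions=[
        Condition("annual_income", lambda v: v < 20_000, "annual income below $20,000"),
        Condition("married", lambda v: v is True, "borrower is married"),
    ],
    predicted_class="default",
)

borrower = {"annual_income": 18_500, "married": True}
if loan_rule.covers(borrower):
    print("Rule predicts:", loan_rule.predicted_class)   # -> Rule predicts: default
```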

According to an embodiment of this disclosure, methods are provided for translating a machine learning model into a set of rules that describe its behavior. If the model is a classifier, for example, the rules could collectively describe how the model classifies various inputs. The computer-implemented methods include receiving data representing a machine learning model (which has already been trained), a collection of training data, and a selection of output classes used to classify a plurality of instances of the training data. Each instance represents one or more features of the training data, where a feature is a variable or a value for a particular instance of data. The method also includes generating instance-level conditions by applying each instance, and at least one perturbation of that instance, to the machine learning model, and calculating a marginal contribution of each feature in the respective instance based upon an output of the machine learning model. Each instance-level condition represents the range of values over which a feature has the maximum marginal contribution to an output class. The method further includes generating class-level rules by applying a genetic algorithm to the instance-level conditions of the corresponding instances. Each class-level rule is a logical statement that, when it holds true, predicts the membership of an instance or instances in a class. At least a portion of the set of class-level rules can be displayed, or stored for later retrieval, to interpret the model. Numerous configurations and variations will be apparent in light of this disclosure.
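
The instance-level step described above can be sketched roughly as follows: perturb one feature of an instance at a time, measure how much the model's predicted probability for a target class changes (a crude proxy for the marginal contribution), and record the range of values over which the most influential feature keeps the prediction in that class. The function below is a simplified approximation under those assumptions; the patent's actual perturbation scheme, marginal-contribution computation, and condition format may differ.

```python
import numpy as np

def instance_level_condition(predict_proba, instance, target_class,
                             n_perturbations=50, noise_scale=0.25, rng=None):
    """Simplified sketch: find the feature with the largest marginal contribution
    for this instance and the value range over which it supports the target class."""
    rng = rng or np.random.default_rng(0)
    instance = np.asarray(instance, dtype=float)
    base_prob = predict_proba(instance.reshape(1, -1))[0, target_class]
    contributions, value_ranges = [], []

    for j in range(instance.shape[0]):
        # Perturb feature j while holding the other features fixed.
        perturbed = np.repeat(instance.reshape(1, -1), n_perturbations, axis=0)
        perturbed[:, j] += rng.normal(0.0, noise_scale, size=n_perturbations)
        probs = predict_proba(perturbed)[:, target_class]

        # Average change in the target-class probability approximates the
        # marginal contribution of feature j for this instance.
        contributions.append(np.abs(probs - base_prob).mean())

        # Range of perturbed values that still keep the instance in the target class.
        supporting = perturbed[probs >= 0.5, j]
        value_ranges.append((supporting.min(), supporting.max()) if supporting.size else None)

    best = int(np.argmax(contributions))
    return best, value_ranges[best]

# Example use (assuming a trained scikit-learn classifier `model` and data `X`):
# feature_idx, value_range = instance_level_condition(model.predict_proba, X[0], target_class=1)
```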

Example Workflow

FIG. 1 illustrates an example workflow 100 for developing a machine learning model, according to an embodiment of this disclosure. A machine learning model is trained 104 using a set of training data 102 that represents inputs and the outputs expected from the model. The inputs can be values or sets of values, and the outputs can be the classes to which those inputs belong. After the model has been trained, a data scientist can verify 110 its accuracy by applying test data to the model and comparing its outputs with the results the data scientist expects. The accuracy of the model can be improved by repeatedly adjusting the hyper-parameters 114, retraining, and revalidating until the accuracy is sufficient.
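
A minimal sketch of the FIG. 1 loop is shown below: train, validate against held-out test data, and adjust hyper-parameters until accuracy is sufficient. scikit-learn, the gradient-boosted model, the candidate parameter grid, and the 0.9 accuracy target are all assumptions made for illustration.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

def develop_model(X_train, y_train, X_test, y_test, target_accuracy=0.9):
    """Train (104), verify with test data (110/112), and adjust hyper-parameters (114)."""
    candidate_params = [{"n_estimators": n, "max_depth": d}
                        for n in (50, 100, 200) for d in (2, 3, 4)]
    for params in candidate_params:
        model = GradientBoostingClassifier(random_state=0, **params).fit(X_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(X_test))
        if accuracy >= target_accuracy:          # accuracy is sufficient: stop tuning
            return model, params, accuracy
    return model, params, accuracy               # best effort after exhausting the grid
```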

The machine learning model development workflow 100 described above can be enhanced by validating the model not only with test data 112, but also by interpreting its transfer characteristics into a set of rules that explain the model's behavior. The transfer characteristics define the relationship between inputs and outputs, and they can be represented as logical statements or rules. For example: if conditions A and B are both true, then the model will predict that the input satisfying conditions A and B belongs to class C. Model interpretation can help answer questions like "Do I understand my model?", "Does my model perform as intended?", and "Do I trust my model?" The data scientist can analyze the rules to better understand the data that was used for the model's training (in the form of patterns discovered by the model) as well as the model itself. The data scientist can also identify patterns that were not intended to be in the model. The interpretation can further be used to enhance the training data, or to adjust the hyper-parameters of the model before training its next version. If the model is a neural network and it has learned only very specific patterns, for instance, the data scientist may try to reduce the number of hidden layers, or the number of nodes per hidden layer, to force the model to learn more general patterns from the training data.

However, the transfer characteristics of a machine learning model are intrinsically opaque and not easily interpretable. This is particularly true for more complex models, such as neural networks, random forests, and gradient-boosted trees. The very complexity that makes machine learning algorithms effective also makes it difficult to understand their inner workings. To increase interpretability, it is common to use simpler models such as logistic regression, linear regression, or decision trees. These models are easier to understand but less accurate. By looking at the weights that a linear regression model has learned, for example, one can determine the relative importance of different features, which makes it easier to justify the decisions the model takes. However, it is not always feasible to use lower-accuracy models in production. There is therefore a trade-off to be made between model accuracy and model interpretability.
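
The point about reading a linear model's weights directly can be illustrated with a short sketch. Here a logistic regression classifier (the classification analogue of the linear regression example above) is fit on a standardized public data set and its largest coefficients are printed as relative feature importances; the data set and pipeline are assumptions for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(data.data, data.target)

# With standardized inputs, the magnitude of each learned weight indicates
# the relative importance of the corresponding feature.
weights = model.named_steps["logisticregression"].coef_[0]
top_features = sorted(zip(data.feature_names, weights), key=lambda t: -abs(t[1]))[:5]
for name, w in top_features:
    print(f"{name}: {w:+.2f}")
```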

To increase accuracy, and to represent non-linear functions, it is necessary to use more complex models, which are less interpretable. As mentioned above, the most popular of these are neural networks, random forests, and gradient-boosted trees. These approaches are known for their ability to model complex patterns and therefore achieve greater accuracy than simpler alternatives. The cost of this accuracy gain is the model's interpretability.

Some existing model interpretation techniques try to explain model behavior locally, at the instance level. As used in this disclosure, "instance" refers to a subset of a set of training data, for example, a single record within a table of data, where each record is associated with one or more training features. Local techniques can be used to explain why a certain instance has been classified into a specific class. A local interpretation, however, is not sufficient for understanding what the model has learned across all of the training data; such a technique must be applied to many instances before it can provide a more complete interpretation of the model.

Surrogate models are another way to interpret models. This approach involves learning a decision tree using the model's training data; instead of being taught to predict the actual classification of the data, the tree is taught to predict the classification made by the model. The rules the model uses to make its predictions are then output as the paths of the tree. Surrogate models such as decision trees have only one root node, so all rules extracted from such a tree include the root node's attribute in their description. Moreover, even on relatively simple data sets, decision trees can become complex, with paths that span multiple features. This can result in rules with a high number of feature-value pairs that are not easily comprehensible.
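
For comparison, the surrogate-model approach described above can be sketched as follows: a shallow decision tree is trained to mimic the black-box model's predictions (rather than the true labels), and its paths are printed as rules. scikit-learn, the random forest stand-in for the black box, and the synthetic data are assumptions for the sketch.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# The surrogate learns to predict the black box's output, not the ground truth.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Each root-to-leaf path is one extracted rule; note that every path starts
# from the same root-node attribute, as discussed above.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
print(export_text(surrogate, feature_names=feature_names))
```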

According to an embodiment of the disclosure, techniques are provided for interpreting machine learning models, especially classification models, into a set of rules that explain the behavior of the model. The interpretation techniques produce, from local explanations, a global set of rules that explains the behavior of the model. Analyzing global behavior has the advantage of helping to uncover the patterns in the data that the model uses for classification. One such pattern could be that "males in California with incomes above $100,000 are always granted loans." This is different from existing approaches, which are limited to local-level explanations and are only useful for interpreting why a model places a particular instance in a class.

System Architecture

FIG. 2 illustrates an example machine learning model interpretation system 200, in accordance with a particular embodiment. The system 200 comprises a computing device having a processor, a GUI, and a machine learning model interpretation application. The computing device 292 is configured to execute an application 208 that includes an instance-level condition generation module 210, a class-level rule generation module 212, a post-processing module 214, and a rule selection module 216. The computing device is configured to receive the data corresponding to the machine learning model from a storage 220, which can include a database or any other suitable data storage device. In some embodiments the storage 220 can be implemented on a remote server that is in communication with the computing device via a network such as an intranet or the Internet. The storage 220 may include any digital storage device configured to store digitally encoded data. The data on the storage 220 may also include training data, test data, production data, and classification data.

As will be explained in more detail below, the system 200 is configured to interpret the transfer characteristics of the machine learning model into a set of rules that explain the behavior of the model and that may be used to validate the model. In a nutshell, the instance-level condition generation module 210 is configured to create a set of instance-level conditions based on the model 106 and the output classification data. The class-level rule generation module 212 is configured to create, from the instance-level conditions, a set of class-level rules that describe the model's behavior. Class-level rules take the form of logical statements (e.g., if-then statements) that explain how the model classifies input data. The post-processing module 214 is configured to remove redundant class-level rules from the rules generated by module 212. The rule selection module 216 is configured to select a subset of the rules generated by the class-level rule generation module 212 in order to provide the most concise and accurate set of rules that adequately interprets the model.
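
The flow through the four modules can be summarized with the schematic sketch below. Only the overall sequence (conditions, then class-level rules, then de-duplication, then selection) is taken from the description above; the function signature and the stage callables are placeholders supplied for illustration, and the genetic-algorithm details are omitted.

```python
def interpret_model(model, training_data, output_classes,
                    generate_conditions, generate_rules,
                    remove_redundant, select_subset):
    """Schematic pipeline mirroring modules 210, 212, 214, and 216."""
    # Module 210: instance-level condition generation (perturbation-based).
    conditions = generate_conditions(model, training_data, output_classes)

    # Module 212: class-level rule generation, e.g. via a genetic algorithm
    # searching for rules that cover many instance-level conditions.
    rules = generate_rules(conditions, output_classes)

    # Module 214: post-processing removes redundant class-level rules.
    rules = remove_redundant(rules)

    # Module 216: rule selection keeps a concise, accurate subset of rules.
    return select_subset(rules)
```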

Use Case Example

The machine learning model interpretation methods described in this disclosure may be used in full or in part with the system 200 of FIG. 2. These interpretation techniques are not model-specific and can be applied to any machine learning model and its training data during the model development workflow 100 of FIG. 1. Model-agnostic approaches do not use the underlying details of the machine learning model to construct explanations; the model is instead treated as a "black box," and the interpretation depends only on its predictive function (e.g., classification). The behavior of even complex model ensembles can therefore be interpreted, and a data scientist can focus on creating ensembles that maximize accuracy without compromising interpretability. This is different from existing approaches that rely on the internal structure of the algorithm used to build the model and are thus restricted to specific, non-generic types of models.
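
Because the approach is model-agnostic, the interpretation code only ever needs the model's predictive function. The tiny wrapper below illustrates that idea; the helper name and the use of a .predict method are assumptions for the sketch.

```python
from typing import Callable
import numpy as np

def as_black_box(model) -> Callable[[np.ndarray], np.ndarray]:
    """Expose only the predictive function; the model's internals are never inspected."""
    return lambda X: model.predict(X)

# Any classifier or ensemble with a .predict method can be plugged in, e.g.:
# predict_fn = as_black_box(ensemble)
# labels = predict_fn(X_sample)
```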

The machine learning model interpretation methods described in this disclosure are also a powerful tool for analyzing and visualizing high-dimensional data sets. Patterns can be extracted from the data by training a model on that data and then interpreting the model using these techniques. The data scientist can thus leverage the high accuracy of the model to understand both the patterns in the data and the behavior of the model.

As explained previously, each class-level rule is independent and can be expressed in the form "IF C1 AND . . . THEN predict class K," where each Ci is a condition on a feature value, such as a threshold comparison (e.g., "≥ 15").

A machine learning model developer can read the rules generated using the disclosed model interpretation methods and can then reject or ignore any spurious patterns that may have been picked up by the model. Developers can also gain insight into the problem domain. Continuing the loan-granting example above, by using the interpretation techniques to extract rules from a model trained on the five most recent years of loan rejections, the developer could gain insights into patterns that were not apparent in the original data. Examining geo-location rules, for example, can reveal the role implicit bias plays in the rejection of loans. The developer can then examine the quality of the rules learned and decide what action to take.
