
Explainable AI: Opening the Machine-Learning Black Boxes

In recent years, the explanation of machine-learning models has gained strength in the AI field. The main idea is to provide meaningful information about the algorithms in such a manner that humans without a technical background can understand how they work and potentially identify unfairness and bias.

27 May 2019. Artificial Intelligence (AI) has enabled an unprecedented level of automation in our society, allowing computers to perform, with reasonable confidence, tasks previously restricted to humans, such as translating a text or recognising objects in a photograph. AI has also been crucial for identifying patterns in very large data sets, in tasks beyond human capabilities such as fraud detection and recommender systems.

However, to no small extent, this high performance comes at the cost of transparency. Complex statistical models and neural networks are produced by an intricate set of computational operations, and there is no easy way to trace how a given input affects the trained model. For this reason, AI has frequently been called a “black-box” technology.

Rules to prevent bias and unfair behaviour

Concerned about the impact of these opaque models, the European Parliament defined legal rules to prevent bias and unfair behaviour. This concern is not without reason. Many studies have shown that, as AI models learn patterns from human-generated data sets, they also learn the biases present in society. As a simple example, word embeddings, the de facto method for representing words as vectors, create a stronger association between the words “man” and “success” than between “woman” and “success”. Another study has shown that a name frequently used by a minority group, such as Ebony for African Americans, has a negative impact on AI-based sentiment analysis compared to a name associated with European Americans, like Amanda. Other types of prejudice have also been found in credit-score systems when assessing loan risk.
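
One common way to quantify such an association is to compare cosine similarities between word vectors. The sketch below assumes three hand-made toy vectors, invented purely for illustration and not taken from any trained embedding or from the studies mentioned above; the function names are likewise hypothetical.

```python
import math

# Hand-made toy vectors for illustration only; NOT real word embeddings.
toy_embeddings = {
    "man": [0.9, 0.1, 0.4],
    "woman": [0.1, 0.9, 0.4],
    "success": [0.8, 0.3, 0.5],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association_gap(word_a, word_b, attribute, embeddings):
    """Positive result: word_a lies closer to the attribute than word_b does."""
    return (cosine_similarity(embeddings[word_a], embeddings[attribute])
            - cosine_similarity(embeddings[word_b], embeddings[attribute]))

gap = association_gap("man", "woman", "success", toy_embeddings)
print(f"association gap (man vs. woman, 'success'): {gap:.3f}")
```

With real embeddings, the same comparison would be run over many word pairs rather than a single one, since an individual gap says little on its own.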

Explaining machine-learning models
Given these challenges, the research area devoted to explaining machine-learning models has gained strength in the AI field in recent years. The main idea is to provide meaningful information about the algorithms in such a manner that humans without a technical background can understand how they work and potentially identify unfairness and bias. The area of Explainable AI, moreover, goes even further.

Zachary C. Lipton defined a comprehensive taxonomy of explanations in the context of AI, highlighting various criteria of classification such as motivation (trust, causality, transferability, informativeness and fairness & ethics) and property (transparency and post-hoc interpretability).

Trust is by far the most common motivation presented in the literature. Or Biran and Courtenay Cotton, for instance, show that users demonstrate higher confidence when using a system whose workings they understand. Fairness & Ethics is also a strong driver, as shown by the well-known European General Data Protection Regulation (GDPR), which guarantees both rights “for meaningful information about the logic involved” and “to non-discrimination” to prevent bias and unfair behaviour, mainly targeting decision-making algorithms. Although less common, explanations are also used to support user feedback to intelligent systems.

As for the property criterion, transparency allows understanding the algorithm’s decision mechanism by contemplating “the entire model at once” and understanding each of its parts and its learning mechanism. Typical methods complying with these requirements are the so-called “explainable by design” methods, such as linear regression, decision trees and rule-based approaches, provided the models are small.
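
To make the idea of an “explainable by design” model concrete, here is a minimal sketch of a small linear scoring model in which every feature's contribution to the decision can be read off directly. The feature names, weights, bias and threshold are all hypothetical values invented for this illustration.

```python
# Hypothetical weights for a toy credit-scoring model; invented for illustration.
WEIGHTS = {"income_k": 0.5, "years_employed": 2.0, "missed_payments": -10.0}
BIAS = 5.0
THRESHOLD = 40.0

def explain_score(applicant):
    """Return the score and each feature's additive contribution to it."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    return score, contributions

applicant = {"income_k": 60, "years_employed": 4, "missed_payments": 1}
score, contributions = explain_score(applicant)
decision = "approve" if score >= THRESHOLD else "reject"

print(f"score = {score}, decision = {decision}")
for name, value in contributions.items():
    print(f"  {name}: {value:+}")
```

Because the score is a plain sum, the whole model fits in a few lines and each term can be inspected, which is the sense in which such models are transparent. This stops being practical once the number of features or rules grows large.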

Post-hoc explanations, on the other hand, make use of interpretations to deliver meaningful information about the AI model. Instead of showing how the model works, they present evidence of its rationale by means of (i) textual descriptions, (ii) visualisations that highlight the parts of an image on which the decision was based, (iii) 2D representations of high-dimensional spaces or (iv) explanation by similarity. Although this type of explanation does not tell precisely how the output was generated, it still provides useful information about the model’s internal mechanism.
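
As an illustration of option (iv), explanation by similarity, the following sketch justifies a prediction by pointing to the most similar known example with the same label. The data points, the labels and the stand-in `black_box_predict` function are assumptions made up for this example, not part of any real system.

```python
import math

# Hypothetical labelled examples (feature vector, label); invented for illustration.
training_examples = [
    ([1.0, 1.0], "low risk"),
    ([1.2, 0.8], "low risk"),
    ([4.0, 4.5], "high risk"),
    ([4.2, 3.9], "high risk"),
]

def black_box_predict(x):
    """Stand-in for an opaque model; only its output is observed."""
    return "high risk" if x[0] + x[1] > 5.0 else "low risk"

def explain_by_similarity(x, examples):
    """Return the prediction and the closest known example sharing that label.

    Assumes at least one example carries the predicted label.
    """
    prediction = black_box_predict(x)
    same_label = [ex for ex in examples if ex[1] == prediction]
    nearest = min(same_label, key=lambda ex: math.dist(x, ex[0]))
    return prediction, nearest

prediction, nearest = explain_by_similarity([3.9, 4.1], training_examples)
print(f"Predicted '{prediction}' because the input resembles {nearest[0]}")
```

The explanation never opens the model itself. It only offers evidence that the output is consistent with a familiar case, which is why it counts as post-hoc.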

Post-hoc explanations of AI models
One of the works developed in our research team offers an example of a post-hoc explanation for the task of text entailment, in which, given a fact, the system evaluates whether a second statement (the hypothesis) is true or false. The approach uses a graph representation of word meanings to navigate between definitions, guided by a semantic similarity measure.

For instance, assuming the fact “IBM cleared $18.2 billion in the first quarter”, and the hypothesis “IBM’s revenue in the first quarter was $18.2 billion”, it provides not only a yes/no answer but also an explanation: Yes, it entails because “To clear is to yield as a net profit” and “Net profit is synonym of revenue”.
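
The navigation between definitions can be pictured with a very small sketch. It assumes a hand-built graph containing only the two links from the example above, and it uses plain breadth-first search in place of the semantic similarity measure described in the text, so it is a simplification and not the research team's actual method. All names in it are hypothetical.

```python
from collections import deque

# Tiny hand-built graph: term -> list of (related term, justification).
# The edges copy the example in the text; a real system would derive them
# from dictionary definitions.
definition_graph = {
    "clear": [("net profit", "To clear is to yield as a net profit")],
    "net profit": [("revenue", "Net profit is synonym of revenue")],
    "revenue": [],
}

def explain_link(source, target, graph):
    """Breadth-first search; returns the justifications along a shortest path,
    or None if the target cannot be reached from the source."""
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        term, reasons = queue.popleft()
        if term == target:
            return reasons
        for neighbour, reason in graph.get(term, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, reasons + [reason]))
    return None

explanation = explain_link("clear", "revenue", definition_graph)
if explanation is not None:
    print("Yes, it entails because " + " and ".join(f'"{r}"' for r in explanation))
```

The list of justifications collected along the path is what turns a bare yes/no answer into a readable explanation.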

Improving text entailment systems benefits many natural language processing tasks, such as question answering, text summarization and information extraction. Although this task seems trivial for a human, it in fact represents the essence of the challenge that modern artificial intelligence models face: understanding meaning.

Siegfried Handschuh is Full Professor of Data Science at the University of St.Gallen.

Image: Adobe Stock/Oleksii

Author: Siegfried Handschuh

Date: 27. May 2019