Human-in-the-loop Discovery of Interpretable Concepts in Deep Learning Models

Title	Human-in-the-loop Discovery of Interpretable Concepts in Deep Learning Models
Summary	Interactive discovery of disentangled and interpretable concepts in Deep Learning Models
Keywords	disentangled learning, representation learning, human-in-the-loop, explainable AI
TimeFrame	Fall 2022
References
Prerequisites
Author
Supervisor	Ece Calikus
Level	Master
Status	Open

Learning interpretable representations of data that expose semantic meaning has numerous benefits for artificial intelligence, including de-noising, imputing missing values, reducing bias, and producing interpretable latent spaces that give better insight into the application domain. However, most deep learning methods cannot guarantee that their lower-dimensional latent representations correspond to concepts that are semantically meaningful to humans. Disentangled representation learning is an unsupervised learning technique that breaks down, or disentangles, each feature into narrowly defined variables and encodes them as separate dimensions. The goal is to mimic the quick intuition process of a human, using both “high” and “low” dimension reasoning. A representation is considered disentangled if a change in one dimension corresponds to a change in one factor of variation while being relatively invariant to changes in other factors. For example, a disentangled representation of a face image would encode gender, hair color, age, and similar features as separate dimensions of the latent embedding. However, not every disentangled feature is useful for a given downstream task. Features such as the border shape, the color, or the size of a skin lesion are useful for detecting skin cancer, while the same features are not equally useful for other tasks.
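The invariance criterion for disentanglement stated above can be illustrated numerically on a toy example. The following is a minimal sketch, not part of the proposed framework: the two encoders, the function names `sensitivity_matrix` and `disentanglement_score`, and the score itself are assumptions made for illustration only. It perturbs one ground-truth factor at a time and measures how strongly each latent dimension responds.

```python
import numpy as np


def sensitivity_matrix(encode, factors, eps=1e-3):
    """Finite-difference estimate of |d latent_i / d factor_j|, averaged over samples.

    Returns an array of shape (n_latent_dims, n_factors).
    """
    base = encode(factors)
    sens = np.zeros((base.shape[1], factors.shape[1]))
    for j in range(factors.shape[1]):
        shifted = factors.copy()
        shifted[:, j] += eps  # change exactly one factor of variation
        sens[:, j] = np.abs((encode(shifted) - base) / eps).mean(axis=0)
    return sens


def disentanglement_score(sens):
    """Illustrative score (an assumption, not a standard metric).

    For each latent dimension, take the share of its total sensitivity that
    comes from its dominant factor, then average over dimensions. A value of
    1.0 means every dimension responds to exactly one factor. Assumes every
    dimension responds to at least one factor (no all-zero rows).
    """
    return float((sens.max(axis=1) / sens.sum(axis=1)).mean())


rng = np.random.default_rng(0)
# Two hypothetical ground-truth factors, e.g. lesion size and lesion color.
factors = rng.uniform(-1.0, 1.0, size=(200, 2))

theta = np.pi / 4
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])


def disentangled_encoder(f):
    # Each latent dimension is a rescaled copy of a single factor.
    return f * np.array([2.0, 0.5])


def entangled_encoder(f):
    # A 45-degree rotation mixes both factors into every latent dimension.
    return f @ rotation.T


score_disentangled = disentanglement_score(
    sensitivity_matrix(disentangled_encoder, factors))
score_entangled = disentanglement_score(
    sensitivity_matrix(entangled_encoder, factors))

print(f"disentangled encoder: {score_disentangled:.3f}")  # close to 1.0
print(f"entangled encoder:    {score_entangled:.3f}")     # close to 0.5
```

Both toy encoders preserve all information about the factors; they differ only in whether each latent dimension tracks a single factor. A real model has no access to ground-truth factors, which is one reason human feedback on the learned dimensions is of interest.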

In this project, we propose an interactive framework that integrates human knowledge into the visual concept extraction process and uses the identified concepts to improve prediction performance on the downstream task. We will also investigate whether some features are meaningful and transferable across different tasks and domains.
