Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)

Jipeng Liu, Binita Dahal, Md Shah Mominul Islam Momin

Controlled Experiment with Ground Truth

Goal: Validate TCAV using controlled settings with known “Ground Truth.”
Dataset: Three classes (zebra, cab, cucumber) with added text captions.
- Caption correctness is controlled by a noise parameter, so each model's reliance on image vs. text is known in advance.
Process: Four models trained with varying noise levels in captions.
- TCAV scores computed for the image and caption concepts to test whether each model relies on image or text for each class.
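
The noise-controlled captions described above can be sketched in a few lines. This is only an illustration: the function names, the `noise` parameter, and the choice to replace a caption with a randomly chosen wrong class name are assumptions, not details taken from the slides.

```python
import random

CLASSES = ["zebra", "cab", "cucumber"]

def noisy_caption(true_label, noise, rng):
    """Return the true label as caption, or a wrong class name with probability `noise`."""
    if rng.random() < noise:
        # Assumption: a noisy caption is a uniformly chosen *incorrect* class name.
        return rng.choice([c for c in CLASSES if c != true_label])
    return true_label

def caption_dataset(labels, noise, seed=0):
    """Pair every image label with a caption drawn at the given noise level."""
    rng = random.Random(seed)
    return [(label, noisy_caption(label, noise, rng)) for label in labels]

labels = CLASSES * 10
clean = caption_dataset(labels, noise=0.0)   # captions always match the image class
noisy = caption_dataset(labels, noise=1.0)   # captions never match the image class
```

With `noise=0.0` a model can solve the task from the caption alone, while with `noise=1.0` the caption carries no correct label, which is what makes the expected reliance on image or text known in advance.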

Quantitative Evaluation of TCAV

Objective: Measure TCAV’s accuracy against known ground truth.
Findings: TCAV scores reflected the true concept used by models.
Cab Class: Models prioritized image over text, with high TCAV scores for image.
Cucumber Class: TCAV identified model reliance on text or image depending on noise level.
Conclusion: TCAV scores align with ground truth, indicating reliable concept identification.
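
The scores discussed above can be made concrete with a small sketch. In TCAV, a concept activation vector (CAV) is a direction in a layer's activation space that separates concept examples from random examples, and the TCAV score for a class is the fraction of that class's inputs whose class logit increases when the activations move along the CAV. The code below is an illustration under stated assumptions, not the presenters' implementation: the toy head (`W1`, `w2`), the function names, and the use of a difference of means in place of a trained linear classifier are all assumptions.

```python
import numpy as np

def learn_cav(concept_acts, random_acts):
    """Unit vector pointing from random-example activations toward concept activations.

    Assumption: a difference of means stands in for the normal of a trained
    linear classifier.
    """
    v = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    return v / np.linalg.norm(v)

def logit_gradients(acts, W1, w2):
    """Gradient of logit = w2 . tanh(W1 @ h) with respect to each activation h."""
    hidden = np.tanh(acts @ W1.T)               # shape (n, k)
    return ((1.0 - hidden ** 2) * w2) @ W1      # shape (n, d)

def tcav_score(acts, cav, W1, w2):
    """Fraction of inputs whose logit increases along the CAV direction."""
    directional = logit_gradients(acts, W1, w2) @ cav
    return float(np.mean(directional > 0))

# Toy data: the concept shifts the first activation unit.
rng = np.random.default_rng(0)
dim = 8
concept_acts = rng.normal(size=(50, dim))
concept_acts[:, 0] += 3.0
random_acts = rng.normal(size=(50, dim))
class_acts = rng.normal(size=(100, dim))

cav = learn_cav(concept_acts, random_acts)
W1 = np.eye(dim)
w2 = np.zeros(dim)
w2[0] = 1.0                                     # toy head that reads only the concept unit
score = tcav_score(class_acts, cav, W1, w2)
print(f"TCAV score: {score:.2f}")
```

In this toy setup the head reads only the unit that the concept shifts, so every directional derivative is positive; flipping the sign of `w2` makes them all negative. In a real network the gradients would come from automatic differentiation at the chosen layer.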

Evaluation of Saliency Maps

Saliency Maps: Traditional method highlighting input features relevant to predictions.
Experiment:
- Human participants rated concept importance using saliency maps.
- Evaluated across noise levels and saliency map methods.
Results:
- Participants using saliency maps judged concept importance correctly only 52% of the time.
- TCAV provided clearer, more interpretable insights.
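
For contrast with TCAV, the simplest saliency method, vanilla gradients, can be sketched as follows. This is a minimal illustration and not the procedure used in the experiment: the toy model (`W1`, `w2`), the function names, and the `region_share` helper that sums saliency inside a region (for example, the caption area) are all hypothetical.

```python
import numpy as np

def gradient_saliency(x, W1, w2):
    """Absolute gradient of logit = w2 . tanh(W1 @ x) with respect to each input feature."""
    hidden = np.tanh(W1 @ x)
    grad = W1.T @ ((1.0 - hidden ** 2) * w2)
    return np.abs(grad)

def region_share(saliency, mask):
    """Share of the total saliency that falls inside a boolean region mask."""
    return float(saliency[mask].sum() / saliency.sum())

# Toy input with four features; the model ignores the last two.
x = np.full(4, 0.1)
W1 = np.array([[1.0, 0.0, 0.0, 0.0],
               [0.0, 2.0, 0.0, 0.0]])
w2 = np.array([1.0, 1.0])

saliency = gradient_saliency(x, W1, w2)
used_region = np.array([True, True, False, False])
share = region_share(saliency, used_region)
```

A saliency map assigns a score to each input feature of a single example, so deciding which concept matters still requires a person to interpret the highlighted regions, whereas a TCAV score is a single number per concept and class.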

Medical Application

Predicting Diabetic Retinopathy (DR) Levels

Model: Predicts DR level (0 to 4) from retinal images.
TCAV for Concept Analysis: Identified diagnostic concepts relevant for each DR level.
Example:
- High TCAV scores for “microaneurysms” at DR level 4
- Detected inconsistencies between model and expert knowledge at lower DR levels

Summary

Benefits of TCAV

Human-friendly, interpretable model insights
Works post hoc on already-trained models without retraining; adaptable to various applications
Effective in identifying biases and understanding model focus

Future Directions

Apply TCAV to other data types (audio, text, etc.)
Use TCAV for adversarial detection and robustness testing
Potential for automatic concept identification