Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)

Jipeng Liu, Binita Dahal, Md Shah Mominul Islam Momin

Controlled Experiment with Ground Truth

  • Goal: Validate TCAV using controlled settings with known “Ground Truth.”

  • Dataset: Three classes (zebra, cab, cucumber) with added text captions.

    • Captions are noise-controlled to test model reliance on images vs. text.
  • Process: Four models trained with varying noise levels in captions.

    • TCAVs generated to test if models focus on image or text for each class.

Controlled Experiment with Ground Truth

Quantitative Evaluation of TCAV

  • Objective: Measure TCAV’s accuracy against known ground truth.
  • Findings: TCAV scores reflected the true concept used by models.
  • Cab Class: Models prioritized image over text, with high TCAV scores for image.
  • Cucumber Class: TCAV identified model reliance on text or image depending on noise level.
  • Conclusion: TCAV scores align with ground truth, indicating reliable concept identification. Cab Image

Evaluation of Saliency Maps

  • Saliency Maps: Traditional method highlighting input features relevant to predictions.
  • Experiment: - Human participants rated concept importance using saliency maps. - Evaluated across noise levels and map methods.
  • Results: - Saliency maps showed limited accuracy (52%) in concept importance. - TCAV provided clearer, more interpretable insights. Image 3 Image 4

Medical Application

Predicting Diabetic Retinopathy (DR) Levels

  • Model: Predicts DR level (0 to 4) from retinal images.

  • TCAV for Concept Analysis: Identified diagnostic concepts relevant for each DR level.

  • Example:

    • High TCAV scores for “microaneurysms” at DR level 4

    - Detected inconsistencies between model and expert knowledge at lower DR levels

Summary

Benefits of TCAV

  • Human-friendly, interpretable model insights
  • Works post-hoc on any model, adaptable for various applications
  • Effective in identifying biases and understanding model focus

Future Directions

  • Apply TCAV to other data types (audio, text, etc.)
  • Use TCAV for adversarial detection and robustness testing
  • Potential for automatic concept identification