Awesome Open Source
Awesome Open Source

Hierarchical neural-net interpretations (ACD)

Produces hierarchical interpretations for a single prediction made by a pytorch neural network. Official code for Hierarchical interpretations for neural network predictions (ICLR 2019 pdf).

Documentation Demo notebooks

Note: this repo is actively maintained. For any questions please file an issue.

examples/documentation

  • installation: pip install acd (or clone and run python setup.py install)
  • examples: the reproduce_figs folder has notebooks with many demos
  • src: the acd folder contains the source for the method implementation
  • allows for different types of interpretations by changing hyperparameters (explained in examples)
  • all required data/models/code for reproducing are included in the dsets folder
Inspecting NLP sentiment models Detecting adversarial examples Analyzing imagenet models

notes on using ACD on your own data

  • the current CD implementation often works out-of-the box, especially for networks built on common layers, such as alexnet/vgg/resnet. However, if you have custom layers or layers not accessible in net.modules(), you may need to write a custom function to iterate through some layers of your network (for examples see cd.py).
  • to use baselines such build-up and occlusion, replace the pred_ims function by a function, which gets predictions from your model given a batch of examples.

related work

  • CDEP (ICML 2020 pdf, github) - penalizes CD / ACD scores during training to make models generalize better
  • TRIM (ICLR 2020 workshop pdf, github) - using simple reparameterizations, allows for calculating disentangled importances to transformations of the input (e.g. assigning importances to different frequencies)
  • PDR framework (PNAS 2019 pdf) - an overarching framewwork for guiding and framing interpretable machine learning
  • DAC (arXiv 2019 pdf, github) - finds disentangled interpretations for random forests
  • Baseline interpretability methods - the file scores/score_funcs.py also contains simple pytorch implementations of integrated gradients and the simple interpration technique gradient * input

reference

  • feel free to use/share this code openly
  • if you find this code useful for your research, please cite the following:
@inproceedings{
   singh2019hierarchical,
   title={Hierarchical interpretations for neural network predictions},
   author={Chandan Singh and W. James Murdoch and Bin Yu},
   booktitle={International Conference on Learning Representations},
   year={2019},
   url={https://openreview.net/forum?id=SkEqro0ctQ},
}

Get A Weekly Email With Trending Projects For These Topics
No Spam. Unsubscribe easily at any time.
Python (1,143,903
Jupyter Notebook (242,165
Machine Learning (31,776
Deep Learning (23,747
Pytorch (11,619
Data Science (9,208
Neural Network (8,625
Artificial Intelligence (5,567
Ai (4,821
Statistics (4,340
Convolutional Neural Networks (3,727
Deep Neural Networks (2,866
Ml (1,577
Explainable Ai (258
Interpretability (231
Explainability (86
Feature Importance (55
Interpretation (41
Iclr (39
Acd (16
Related Projects