Introduction
The authors propose a framework for quantifying the interpretability of a network's latent representations by evaluating how individual hidden units align with semantic concepts. Given a CNN, each hidden unit of a chosen convolutional layer can be scored for its semantics. The semantic labels fall into six categories: objects, parts, scenes, textures, materials, and colors. The Broadly and Densely Labeled Dataset (Broden) collects images from several datasets with dense pixel-wise annotations, except that textures and scenes are annotated at the whole-image level. Some examples are shown below:
The paper reports several conclusions about interpretability:
- Interpretability is axis-aligned: if the representation is rotated, the network's interpretability drops while classification performance stays unchanged.
- Deeper architectures are more interpretable: ResNet > VGGNet > GoogLeNet > AlexNet.
- Regarding the training dataset, Places > ImageNet: a scene contains multiple objects, which favors the emergence of multiple object detectors for recognizing scenes.
- Regarding training conditions, more training epochs help, and the result is independent of initialization. Dropout increases interpretability, while batch normalization decreases it: the whitening operation smooths out scaling and rotates the axes of intermediate features.
Method
For a given CNN, the method measures how well individual units align with semantics; the figure above probes one unit of the last convolutional layer.

For a given input image $x$, the activation map of the $k$-th unit of a convolutional layer is $A_k(x)$. Let $\alpha_k$ denote the distribution of the unit's activations over every spatial location; the top-quantile threshold $T_k$ is determined such that $P(\alpha_k > T_k) = 0.005$. As shown in the figure above, the low-resolution activation map $A_k(x)$ is first upsampled to $S_k(x)$ at the input-image resolution, then compared with the pixel-wise annotation mask $L_c$, where $c$ is a semantic concept. $S_k(x)$ is binarized by thresholding: $M_k(x) \equiv S_k(x) \geq T_k$. The overlap between $M_k(x)$ and $L_c(x)$ is then scored with IoU, giving the score of unit $k$ for concept $c$:

$$IoU_{k,c}=\frac{\sum{|M_{k}(x)\cap L_{c}(x)|}}{\sum{|M_{k}(x)\cup L_{c}(x)|}}$$

Unit $k$ is considered a detector for concept $c$ if $IoU_{k,c}$ exceeds a threshold; the paper uses $0.04$. Note that a single unit may qualify as a detector for multiple concepts; in that case the top-ranked label is chosen. To quantify the interpretability of a layer, the number of distinct concepts that units align with is counted, called the number of unique detectors; the number of units in the layer that qualify as detectors is defined as the number of detectors.
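The scoring procedure can be sketched in a few lines. Below is a minimal NumPy sketch under simplifying assumptions: nearest-neighbor upsampling instead of the bilinear interpolation NetDissect uses, a threshold computed per call, and the helper name `dissect_unit` is invented for illustration, not taken from the repo:

```python
import numpy as np

def dissect_unit(act_maps, label_masks, quantile=0.005):
    """Score one unit k against one concept c via IoU.

    act_maps:    (N, h, w) low-resolution activation maps A_k(x)
    label_masks: (N, H, W) boolean pixel-wise concept masks L_c(x),
                 where H and W are integer multiples of h and w
    """
    # T_k: top-quantile threshold so that P(alpha_k > T_k) = quantile
    T_k = np.quantile(act_maps, 1.0 - quantile)
    N, H, W = label_masks.shape
    h, w = act_maps.shape[1:]
    inter = union = 0
    for A, L in zip(act_maps, label_masks):
        # upsample A_k(x) to the input resolution -> S_k(x)
        # (nearest-neighbor here for simplicity; NetDissect uses bilinear)
        S = np.repeat(np.repeat(A, H // h, axis=0), W // w, axis=1)
        M = S >= T_k                        # binary mask M_k(x)
        inter += np.logical_and(M, L).sum()
        union += np.logical_or(M, L).sum()
    return inter / union                    # IoU_{k,c}
```

A unit would then count as a detector for concept `c` when this score exceeds 0.04, summing intersections and unions over the whole dataset before dividing, as in the formula above.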
PyTorch implementation
The authors provide the code on GitHub: https://github.com/CSAILVision/NetDissect-Lite
This section focuses on how to dissect your own model; note that you first need a pre-trained weight file.
(1) Place the model module code under loader/, e.g. create a new file lenet.py:
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(LeNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3),
            nn.ReLU(inplace=True))
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x):
        out = self.features(x)
        out = F.adaptive_avg_pool2d(out, (1, 1))
        out = out.view(out.size(0), -1)
        out = self.fc(out)
        return out
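As a quick sanity check that the module is wired correctly (a hypothetical snippet, not part of the repo), a forward pass on a dummy batch should produce one logit vector per image; the class is repeated here so the snippet runs standalone:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):  # same module as in loader/lenet.py above
    def __init__(self, num_classes=1000):
        super(LeNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3),
            nn.ReLU(inplace=True))
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x):
        out = self.features(x)
        out = F.adaptive_avg_pool2d(out, (1, 1))
        out = out.view(out.size(0), -1)
        return self.fc(out)

model = LeNet(num_classes=10)
model.eval()
with torch.no_grad():
    logits = model(torch.randn(2, 3, 32, 32))  # dummy batch of two RGB images
print(logits.shape)  # -> torch.Size([2, 10])
```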
(2) Modify model_loader.py under loader/ to load the custom LeNet model:
Add an IS_TORCHVISION variable to settings.py to indicate whether the model is one of the architectures already provided by torchvision.
import settings
import torch
import torchvision
from .lenet import LeNet

def loadmodel(hook_fn):
    if settings.IS_TORCHVISION:
        if settings.MODEL_FILE is None:
            model = torchvision.models.__dict__[settings.MODEL](pretrained=True)
        else:
            checkpoint = torch.load(settings.MODEL_FILE)
            if type(checkpoint).__name__ == 'OrderedDict' or type(checkpoint).__name__ == 'dict':
                model = torchvision.models.__dict__[settings.MODEL](num_classes=settings.NUM_CLASSES)
                if settings.MODEL_PARALLEL:
                    state_dict = {str.replace(k, 'module.', ''): v for k, v in checkpoint[
                        'state_dict'].items()}  # the data parallel layer will add 'module' before each layer name
                else:
                    state_dict = checkpoint
                model.load_state_dict(state_dict)
            else:
                model = checkpoint
    else:
        checkpoint = torch.load(settings.MODEL_FILE)
        model = LeNet(num_classes=settings.NUM_CLASSES)
        param_dict = {}  # rebuild the pre-trained weights: loading the checkpoint directly may cause key-name mismatches
        for k, v in zip(model.state_dict().keys(), checkpoint['net'].keys()):
            param_dict[k] = checkpoint['net'][v]
        model.load_state_dict(param_dict)
    for name in settings.FEATURE_NAMES:
        model._modules.get(name).register_forward_hook(hook_fn)
    if settings.GPU:
        model.cuda()
    model.eval()
    return model
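loadmodel takes a hook_fn argument that it registers on every layer listed in FEATURE_NAMES; the feature-extraction code then receives intermediate activations through it. The mechanism can be sketched in isolation as follows (the tiny stand-in model and the features list are illustrative assumptions, not the repo's actual extractor):

```python
import torch
import torch.nn as nn

features = []  # filled by the hook on every forward pass

def hook_fn(module, inputs, output):
    # store the layer's output, as NetDissect's feature extractor does
    features.append(output.detach().cpu())

# stand-in model with a 'features' submodule, mirroring the LeNet above
model = nn.Sequential()
model.add_module('features', nn.Sequential(nn.Conv2d(3, 64, 3), nn.ReLU()))
model._modules.get('features').register_forward_hook(hook_fn)
model.eval()

with torch.no_grad():
    model(torch.randn(2, 3, 32, 32))

print(features[0].shape)  # activation maps: torch.Size([2, 64, 30, 30])
```

These captured maps are exactly the $A_k(x)$ that the dissection step thresholds and upsamples.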
(3) Configure the settings.py file as follows:
######### global settings #########
GPU = True # running on GPU is highly suggested
TEST_MODE = False # turning on the testmode means the code will run on a small dataset.
CLEAN = True # set to "True" if you want to clean the temporary large files after generating result
MODEL = 'lenet' # model arch: resnet18, alexnet, resnet50, densenet161
DATASET = 'imagenet' # model trained on: places365 or imagenet
QUANTILE = 0.005 # the threshold used for activation
SEG_THRESHOLD = 0.04 # the threshold used for visualization
SCORE_THRESHOLD = 0.04 # the threshold used for IoU score (in HTML file)
TOPN = 10 # to show top N image with highest activation for each unit
PARALLEL = 1 # how many process is used for tallying (Experiments show that 1 is the fastest)
CATAGORIES = ["object", "material", "part","scene","texture","color"] # concept categories that are chosen to detect: "object", "part", "scene", "material", "texture", "color"
OUTPUT_FOLDER = "result/pytorch_"+MODEL+"_"+DATASET # result will be stored in this folder
########### sub settings ###########
# In most of the case, you don't have to change them.
# DATA_DIRECTORY: where broaden dataset locates
# IMG_SIZE: image size, alexnet use 227x227
# NUM_CLASSES: how many labels in final prediction
# FEATURE_NAMES: the array of layer where features will be extracted
# MODEL_FILE: the model file to be probed, "None" means the pretrained model in torchvision
# MODEL_PARALLEL: some model is trained in multi-GPU, so there is another way to load them.
# WORKERS: how many workers are fetching images
# BATCH_SIZE: batch size used in feature extraction
# TALLY_BATCH_SIZE: batch size used in tallying
# INDEX_FILE: if you turn on the TEST_MODE, actually you should provide this file on your own
if MODEL != 'alexnet':
    DATA_DIRECTORY = 'dataset/broden1_224'
    IMG_SIZE = 224
else:
    DATA_DIRECTORY = 'dataset/broden1_227'
    IMG_SIZE = 227
NUM_CLASSES = 1000
IS_TORCHVISION = False
if MODEL == 'resnet50':
    FEATURE_NAMES = ['layer4']
    MODEL_FILE = None
    MODEL_PARALLEL = False
elif MODEL == 'densenet169':
    FEATURE_NAMES = ['features']
    MODEL_FILE = None
    MODEL_PARALLEL = False
elif MODEL == 'lenet':
    FEATURE_NAMES = ['features']
    MODEL_FILE = '/home/ws/winycg/mbagnet/checkpoint/LeNet.pth.tar'  # pre-trained weight file
if TEST_MODE:
    WORKERS = 1
    BATCH_SIZE = 4
    TALLY_BATCH_SIZE = 2
    TALLY_AHEAD = 1
    INDEX_FILE = 'index_sm.csv'
    OUTPUT_FOLDER += "_test"
else:
    WORKERS = 12
    BATCH_SIZE = 16
    TALLY_BATCH_SIZE = 16
    TALLY_AHEAD = 4
    INDEX_FILE = 'index.csv'