HW9 Explainable AI

最新推荐文章于 2025-02-19 22:42:08 发布

Raphael9900

最新推荐文章于 2025-02-19 22:42:08 发布

阅读量463

点赞数

文章标签：人工智能深度学习 python

本文链接：https://blog.csdn.net/Raphael9900/article/details/128482957

版权

文章目录

Topic I: CNN explanation

Topic I: CNN explanation

1、任务

模型：食物分类
●我们使用训练的分类器模型做一些解释
●模型结构： CNN模型
●数据集： 11类食物（HW3相同数据集）
○面包，日记产品，甜点、蛋、油炸食品，肉、面条/意大利面、大米、海鲜、汤和蔬菜/水果
●运行示例代码和完成20个问题（所有多项选择形式）
●我们将涵盖5解释方法：
○ Lime package
○ Saliency map
○ Smooth Grad
○ Filter Visualization
○ Integrated Gradients
●在这个家庭作业中，你只需要观察这10张图片。
●请确保你在你的代码中有这10张图片。
●在这些问题中，图像被标记为从0到9。
## 在这里插入图片描述

2、运行

（1）Lime

Lime是一个关于解释机器学习分类器正在做什么的包。我们可以先用它来观察模型。Lime主要方法是先训练一个线性模型，模型的输入为切块的图片，将其训练为输出与CNN模型类似，然后通过线性模型的权重来判断图片的哪一个位置比较重要。

def predict(input):
    # input: numpy array, (batches, height, width, channels)                                                                                                                                                     
    
    model.eval()                                                                                                                                                             
    input = torch.FloatTensor(input).permute(0, 3, 1, 2)                                                                                                            
    # pytorch tensor, (batches, channels, height, width)

    output = model(input.cuda())                                                                                                                                             
    return output.detach().cpu().numpy()                                                                                                                              
                                                                                                                                                                             
def segmentation(input):
    # split the image into 200 pieces with the help of segmentaion from skimage                                                                                                                   
    return slic(input, n_segments=200, compactness=1, sigma=1, start_label=1)                                                                                                              
                                                                                                                                                                             

fig, axs = plt.subplots(1, len(img_indices), figsize=(15, 8))                                                                                                                                                                 
# fix the random seed to make it reproducible
np.random.seed(16)                                                                                                                                                       
for idx, (image, label) in enumerate(zip(images.permute(0, 2, 3, 1).numpy(), labels)):                                                                                                                                             
    x = image.astype(np.double)

    explainer = lime_image.LimeImageExplainer()                                                                                                                              
    explaination = explainer.explain_instance(image=x, classifier_fn=predict, segmentation_fn=segmentation)

    lime_img, mask = explaination.get_image_and_mask(                                                                                                                         
                                label=label.item(),                                                                                                                           
                                positive_only=False,                                                                                                                         
                                hide_rest=False,                                                                                                                             
                                num_features=11,                                                                                                                              
                                min_weight=0.05                                                                                                                            
                            )
    
    axs[idx].imshow(lime_img)

plt.show()
plt.close()

在这里插入图片描述
从结果可以看到，有些图片的标记比较明显，有些的标记比较少。绿色的占大多数，红色占少数。

（2）Saliency Map

该方法对图片的每个像素求导，根据导数的绝对值大小来判断像素的重要性。具体做法是构造loss函数，输入是图片，输出是模型输入与gound truth的差异，然后对loss函数的输入（图片）求导。高亮显示输入图像中对分类任务贡献最大的像素的热图。
我们将一幅图像放入模型中，forward，然后参照标签计算损失。因此，损失涉及:图像、模型参数、标签。
一般来说，我们改变模型参数来适应“图像”和“标签”。backward时，我们计算损失对模型参数的偏微分值。

现在，我们来看另一个例子。当我们改变图像的像素值时，损失对图像的偏导数显示了损失的变化。我们可以说这意味着像素的重要性。我们可以将其可视化，以展示图像的哪一部分对模型的判断贡献最大。
什么是permute？
在pytorch中，图像张量各维度的含义是(通道，高度，宽度)
在matplotlib中，图像张量各维的含义是(高度，宽度，通道)
permute是一个变换张量维数的工具
例如，img.permute(1，2，0)表示，

0维是原张量的1维，是高度
1维是原始张量的2维，即宽度
2维是原始张量的0维，它是通道

def normalize(image):
  return (image - image.min()) / (image.max() - image.min())
  # return torch.log(image)/torch.log(image.max())

def compute_saliency_maps(x, y, model):
  model.eval()
  x = x.cuda()

  # we want the gradient of the input x
  x.requires_grad_()
  
  y_pred = model(x)
  loss_func = torch.nn.CrossEntropyLoss()
  loss = loss_func(y_pred, y.cuda())
  loss.backward()

  # saliencies = x.grad.abs().detach().cpu()
  saliencies, _ = torch.max(x.grad.data.abs().detach().cpu(),dim=1)

  # We need to normalize each image, because their gradients might vary in scale
  saliencies = torch.stack([normalize(item) for item in saliencies])
  return saliencies

# images, labels = train_set.getbatch(img_indices)
saliencies = compute_saliency_maps(images, labels, model)

# visualize
fig, axs = plt.subplots(2, len(img_indices), figsize=(15, 8))
for row, target in enumerate([images, saliencies]):
  for column, img in enumerate(target):
    if row==0:
      axs[row][column].imshow(img.permute(1, 2, 0).numpy())

    else:
      axs[row][column].imshow(img.numpy(), cmap=plt.cm.hot)
    
plt.show()
plt.close()

在这里插入图片描述
图中红色的点多集中在食物上。

（3）Smooth Grad

Smooth grad 的方法是，在圖片中隨機地加入 noise，然後得到不同的 heatmap，把這些 heatmap 平均起來就得到一個比較能抵抗 noisy gradient 的結果。

# Smooth grad

def normalize(image):
  return (image - image.min()) / (image.max() - image.min())

def smooth_grad(x, y, model, epoch, param_sigma_multiplier):
  model.eval()
  #x = x.cuda().unsqueeze(0)

  mean = 0
  sigma = param_sigma_multiplier / (torch.max(x) - torch.min(x)).item()
  smooth = np.zeros(x.cuda().unsqueeze(0).size())
  for i in range(epoch):
    # call Variable to generate random noise
    noise = Variable(x.data.new(x.size()).normal_(mean, sigma**2))
    x_mod = (x+noise).unsqueeze(0).cuda()
    x_mod.requires_grad_()

    y_pred = model(x_mod)
    loss_func = torch.nn.CrossEntropyLoss()
    loss = loss_func(y_pred, y.cuda().unsqueeze(0))
    loss.backward()

    # like the method in saliency map
    smooth += x_mod.grad.abs().detach().cpu().data.numpy()
  smooth = normalize(smooth / epoch) # don't forget to normalize
  # smooth = smooth / epoch # try this line to answer the question
  return smooth

# images, labels = train_set.getbatch(img_indices)
smooth = []
for i, l in zip(images, labels):
  smooth.append(smooth_grad(i, l, model, 500, 0.4))
smooth = np.stack(smooth)
# print(smooth.shape)

fig, axs = plt.subplots(2, len(img_indices), figsize=(15, 8))
for row, target in enumerate([images, smooth]):
  for column, img in enumerate(target):
    axs[row][column].imshow(np.transpose(img.reshape(3,128,128), (1,2,0)))

在这里插入图片描述

（4）Filter Explanation

這裡我們想要知道某一個 filter 到底認出了什麼。我們會做以下兩件事情：

Filter activation: 挑幾張圖片出來，看看圖片中哪些位置會 activate 該 filter
Filter visualization: 找出怎樣的 image 可以最大程度的 activate 該 filter

def normalize(image):
  return (image - image.min()) / (image.max() - image.min())

layer_activations = None
def filter_explanation(x, model, cnnid, filterid, iteration=100, lr=1):
  # x: input image
  # cnnid: cnn layer id
  # filterid: which filter
  model.eval()

  def hook(model, input, output):
    global layer_activations
    layer_activations = output
  
  hook_handle = model.cnn[cnnid].register_forward_hook(hook)
#当模型通过层[cnnid]转发时，需要先调用hook函数
# hook函数保存层的输出[cnnid]
#转发后，我们将丢失和激活层

#过滤器激活:x通过过滤器将生成激活图
  model(x.cuda()) # forward

#根据函数参数给定的filterid，选择特定过滤器的激活映射
#我们只需要绘制它，这样我们就可以从图形中分离出来并保存为cpu张量
  filter_activations = layer_activations[:, filterid, :, :].detach().cpu()
  
  # Filter visualization: find the image that can activate the filter the most
  x = x.cuda()
  x.requires_grad_()
  # input image gradient
  optimizer = Adam([x], lr=lr)
  #使用优化器修改输入图像以增强过滤器的激活
  for iter in range(iteration):
    optimizer.zero_grad()
    model(x)
    
    objective = -layer_activations[:, filterid, :, :].sum()
#我们希望最大化过滤器激活的总和
#所以我们加了一个负号
    
    objective.backward()
#计算滤波器激活对输入图像的偏差值
    optimizer.step()
    # Modify input image to maximize filter activation
  filter_visualizations = x.detach().cpu().squeeze()

  # Don't forget to remove the hook
  hook_handle.remove()
  # The hook will exist after the model register it, so you have to remove it after used
  # Just register a new hook if you want to use it

  return filter_activations, filter_visualizations
images, labels = train_set.getbatch(img_indices)
filter_activations, filter_visualizations = filter_explanation(images, model, cnnid=6, filterid=0, iteration=100, lr=0.1)

fig, axs = plt.subplots(3, len(img_indices), figsize=(15, 8))
for i, img in enumerate(images):
  axs[0][i].imshow(img.permute(1, 2, 0))
# Plot filter activations
for i, img in enumerate(filter_activations):
  axs[1][i].imshow(normalize(img))
# Plot filter visualization
for i, img in enumerate(filter_visualizations):
  axs[2][i].imshow(normalize(img.permute(1, 2, 0)))
plt.show()
plt.close()

在这里插入图片描述

images, labels = train_set.getbatch(img_indices)
filter_activations, filter_visualizations = filter_explanation(images, model, cnnid=23, filterid=0, iteration=100, lr=0.1)

# Plot filter activations
fig, axs = plt.subplots(3, len(img_indices), figsize=(15, 8))
for i, img in enumerate(images):
  axs[0][i].imshow(img.permute(1, 2, 0))
for i, img in enumerate(filter_activations):
  axs[1][i].imshow(normalize(img))
for i, img in enumerate(filter_visualizations):
  axs[2][i].imshow(normalize(img.permute(1, 2, 0)))
plt.show()
plt.close()

在这里插入图片描述

（5）Integrated Gradients

IG是在图片和空白图片之间做线性插值产生多个图片，然后通过图片的模型输出对图片求导（saliency map是loss函数求导，这两者的求导函数不一样），最后将导数的结果进行平均加权。

class IntegratedGradients():
    def __init__(self, model):
        self.model = model
        self.gradients = None
        # Put model in evaluation mode
        self.model.eval()

    def generate_images_on_linear_path(self, input_image, steps):
        # Generate scaled xbar images
        xbar_list = [input_image*step/steps for step in range(steps)]
        return xbar_list

    def generate_gradients(self, input_image, target_class):
        # We want to get the gradients of the input image
        input_image.requires_grad=True
        # Forward
        model_output = self.model(input_image)
        # Zero grads
        self.model.zero_grad()
        # Target for backprop
        one_hot_output = torch.FloatTensor(1, model_output.size()[-1]).zero_().cuda()
        one_hot_output[0][target_class] = 1
        # Backward
        model_output.backward(gradient=one_hot_output)
        self.gradients = input_image.grad
        # Convert Pytorch variable to numpy array
        # [0] to get rid of the first channel (1,3,128,128)
        gradients_as_arr = self.gradients.data.cpu().numpy()[0]
        return gradients_as_arr

    def generate_integrated_gradients(self, input_image, target_class, steps):
        # Generate xbar images
        xbar_list = self.generate_images_on_linear_path(input_image, steps)
        # Initialize an image composed of zeros
        integrated_grads = np.zeros(input_image.size())
        for xbar_image in xbar_list:
            # Generate gradients from xbar images
            single_integrated_grad = self.generate_gradients(xbar_image, target_class)
            # Add rescaled grads from xbar images
            integrated_grads = integrated_grads + single_integrated_grad/steps
        # [0] to get rid of the first channel (1,3,128,128)
        return integrated_grads[0]

def normalize(image):
  return (image - image.min()) / (image.max() - image.min())


# put the image to cuda
images, labels = train_set.getbatch(img_indices)
images = images.cuda()

IG = IntegratedGradients(model)
integrated_grads = []
for i, img in enumerate(images):
  img = img.unsqueeze(0)
  integrated_grads.append(IG.generate_integrated_gradients(img, labels[i], 10))
fig, axs = plt.subplots(2, len(img_indices), figsize=(15, 8))
for i, img in enumerate(images):
  axs[0][i].imshow(img.cpu().permute(1, 2, 0))
for i, img in enumerate(integrated_grads):
  axs[1][i].imshow(np.moveaxis(normalize(img),0,-1))
plt.show()
plt.close()

在这里插入图片描述

（6）Embedding Visualization

我們現在有一個預訓練好的模型，並且在閱讀理解上微調過。
閱讀理解需要四個步驟﹕（這些步驟並不按照順序排列)
1、將類似的文字分羣 (根據文字在文章中的關係)
2、提取答案
3、將類似的文字分羣 (根據文字的意思)
4、從文章中尋找與問題有關的資訊

# Tokenize and encode question and paragraph into model's input format
inputs = Tokenizer(questions[QUESTION-1], contexts[QUESTION-1], return_tensors='pt') 

# Get the [start, end] positions of [question, context] in encoded sequence for plotting
question_start, question_end = 1, inputs['input_ids'][0].tolist().index(102) - 1
context_start, context_end = question_end + 2, len(inputs['input_ids'][0]) - 2

outputs_hidden_states = torch.load(f"hw9_bert/output/model_q{QUESTION}")

#####遍历所有层的隐藏状态####
#“输出_隐藏_状态”是具有13个元素的元组，第1个元素是嵌入输出，其他12个元素是层1 - 12的注意隐藏状态
for layer_index, embeddings in enumerate(outputs_hidden_states[1:]): # 1st element is skipped
 
    # "embeddings" has shape [1, sequence_length, 768], where 768 is the dimension of BERT's hidden state
    # Dimension of "embeddings" is reduced from 768 to 2 using PCA (Principal Component Analysis)
    reduced_embeddings = PCA(n_components=2, random_state=0).fit_transform(embeddings[0])

    ##### Draw embedding of each token ##### 
    for i, token_id in enumerate(inputs['input_ids'][0]):
        x, y = reduced_embeddings[i] # Embedding has 2 dimensions, each corresponds to a point
        word = Tokenizer.decode(token_id) # Decode token back to word
        # Scatter points of answer, question and context in different colors
        if word in answers[QUESTION-1].split(): # Check if word in answer
            plt.scatter(x, y, color='blue', marker='d') 
        elif question_start <= i <= question_end:
            plt.scatter(x, y, color='red')
        elif context_start <= i <= context_end:
            plt.scatter(x, y, color='green')
        else: # skip special tokens [CLS], [SEP]
            continue
        plt.text(x + 0.1, y + 0.2, word, fontsize=12) # Plot word next to its point
    
    # Plot "empty" points to show labels
    plt.plot([], label='answer', color='blue', marker='d')  
    plt.plot([], label='question', color='red', marker='o')
    plt.plot([], label='context', color='green', marker='o')
    plt.legend(loc='best') # Display the area describing the elements in the plot
    plt.title('Layer ' + str(layer_index + 1)) # Add title to the plot
    plt.show() # Show the plot

在这里插入图片描述

（7）Embedding Analysis

这部分主要观察bert针对 ‘苹果’在不同语境下的意义区分（这个是bert与传统embedding的不同，传统embedding不考虑上下文信息，同样的单词无论在哪里都是一样的embedding）。

# Index of word selected for embedding comparison. E.g. For sentence "蘋果茶真難喝", if index is 0, "蘋 is selected"
# The first line is the indexes for 蘋; the second line is the indexes for 果
select_word_index = [4, 2, 0, 8, 2, 0, 0, 4, 0, 0]
# select_word_index = [5, 3, 1, 9, 3, 1, 1, 5, 1, 1]

def l2_norm(a):
    return (a**2).sum()**0.5
    
def euclidean_distance(a, b):
    # Compute euclidean distance (L2 norm) between two numpy vectors a and b
    return l2_norm(a - b)

def cosine_similarity(a, b):
    # Compute cosine similarity between two numpy vectors a and b
    return (a*b).sum()/(l2_norm(a)*l2_norm(b) + 1e-8)

# Metric for comparison. Choose from euclidean_distance, cosine_similarity
METRIC = euclidean_distance

def get_select_embedding(output, tokenized_sentence, select_word_index):
    # The layer to visualize, choose from 0 to 12
    LAYER = 12
    # Get selected layer's hidden state
    hidden_state = output.hidden_states[LAYER][0]
    # Convert select_word_index in sentence to select_token_index in tokenized sentence
    select_token_index = tokenized_sentence.word_to_tokens(select_word_index).start
    # Return embedding of selected word
    return hidden_state[select_token_index].numpy()


# Tokenize and encode sentences into model's input format
tokenized_sentences = [tokenizer(sentence, return_tensors='pt') for sentence in sentences]

# Input encoded sentences into model and get outputs 
with torch.no_grad():
    outputs = [model(**tokenized_sentence) for tokenized_sentence in tokenized_sentences]

# Get embedding of selected word(s) in sentences. "embeddings" has shape (len(sentences), 768), where 768 is the dimension of BERT's hidden state
embeddings = [get_select_embedding(outputs[i], tokenized_sentences[i], select_word_index[i]) for i in range(len(outputs))]

# Pairwse comparsion of sentences' embeddings using the metirc defined. "similarity_matrix" has shape [len(sentences), len(sentences)]
similarity_matrix = pairwise_distances(embeddings, metric=METRIC) 

##### Plot the similarity matrix #####
plt.rcParams['figure.figsize'] = [12, 10] # Change figure size of the plot
plt.imshow(similarity_matrix) # Display an image in the plot
plt.colorbar() # Add colorbar to the plot
plt.yticks(ticks=range(len(sentences)), labels=sentences, fontproperties=myfont) # Set tick locations and labels (sentences) of y-axis
plt.title('Comparison of BERT Word Embeddings') # Add title to the plot
for (i,j), label in np.ndenumerate(similarity_matrix): # np.ndenumerate is 2D version of enumerate
    plt.text(i, j, '{:.2f}'.format(label), ha='center', va='center') # Add values in similarity_matrix to the corresponding position in the plot
plt.show() # Show the plot

在这里插入图片描述