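Matrix Differentiation
Problem:
Given a 5×2×2 tensor A, a 2×2×2 tensor B, and a 5-dimensional vector M whose elements are each 0 or 1, we perform the following matrix multiplication:
pointwise_matmul(A[1,2,3,4,5],B[M])
That is, for i = 1:5, the 2×2 matrix A[i] is multiplied by the 2×2 matrix B[M[i]], one pair per slice. This yields five 2×2 result matrices, i.e. an output tensor Z of shape 5×2×2.
Given A, B, M, and the derivative dZ of Z, what are the derivatives dA, dB, and dM of A, B, and M?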
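To make the operation in the problem statement concrete, here is a minimal sketch of the forward pass using plain nested lists. The helper name `matmul2d` and the example data are hypothetical and not part of the original problem; the sketch also uses Python's 0-based indexing, whereas the problem counts i from 1 to 5.

```python
def matmul2d(x, y):
    """Multiply two matrices given as nested lists."""
    inner = len(y)
    assert all(len(row) == inner for row in x)
    return [[sum(x[i][k] * y[k][j] for k in range(inner))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def pointwise_matmul(A, B, M):
    """Return Z with Z[i] = A[i] @ B[M[i]] for every slice i."""
    assert len(A) == len(M)
    return [matmul2d(A[i], B[M[i]]) for i in range(len(A))]


# Hypothetical example data: five 2x2 slices and two shared 2x2 weights.
A = [[[i + 1, 0], [0, i + 1]] for i in range(5)]  # A[i] = (i + 1) * identity
B = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
M = [0, 1, 0, 1, 1]
Z = pointwise_matmul(A, B, M)
```

Here B[0] is used by two slices and B[1] by three, which is the sharing that matters for the gradient of B.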
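Solution:
This is a conceptual question. Anyone familiar with multivariable calculus can readily write down the formulas for matrix derivatives.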
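For reference, the standard rule for a single matrix product can be written out as follows. The scalar loss L and the generic names X and Y are notation introduced here for illustration; dZ denotes the upstream gradient of L with respect to Z.

```latex
Z = XY, \qquad Z_{ij} = \sum_k X_{ik} Y_{kj}

\frac{\partial L}{\partial X_{ik}} = \sum_j \frac{\partial L}{\partial Z_{ij}} Y_{kj}
\;\Longrightarrow\; dX = dZ \, Y^{\top}

\frac{\partial L}{\partial Y_{kj}} = \sum_i X_{ik} \frac{\partial L}{\partial Z_{ij}}
\;\Longrightarrow\; dY = X^{\top} dZ
```

In the problem above, X plays the role of A[i] and Y the role of B[M[i]] for each slice i.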
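In addition, B[M[i]] involves weight sharing, an operation commonly used with libraries such as TensorFlow. According to computational graph theory, the gradients of a shared weight accumulate: the contributions from every place the weight is used are summed.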
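The accumulation rule can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not necessarily the author's method: the names `matmul2d`, `transpose2d`, and `pointwise_matmul_backward` are hypothetical, the example data is made up, and returning `None` for dM reflects the assumption that an integer index vector has no meaningful gradient.

```python
def matmul2d(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose2d(x):
    """Transpose a matrix given as nested lists."""
    return [list(col) for col in zip(*x)]


def pointwise_matmul_backward(A, B, M, dZ):
    """Backward pass for Z[i] = A[i] @ B[M[i]]."""
    dA = []
    # One zero matrix per shared weight; gradients are summed into these.
    dB = [[[0.0] * len(b[0]) for _ in b] for b in B]
    for i, m in enumerate(M):
        # dA[i] = dZ[i] @ B[m]^T (each A[i] is used exactly once).
        dA.append(matmul2d(dZ[i], transpose2d(B[m])))
        # dB[m] += A[i]^T @ dZ[i] (B[m] may be used by several slices).
        contrib = matmul2d(transpose2d(A[i]), dZ[i])
        for r in range(len(contrib)):
            for c in range(len(contrib[0])):
                dB[m][r][c] += contrib[r][c]
    # Assumption: M is a discrete index, so no gradient is reported for it.
    dM = None
    return dA, dB, dM


# Hypothetical example data: identity slices and an all-ones upstream gradient.
A = [[[1, 0], [0, 1]] for _ in range(5)]
B = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
M = [0, 1, 0, 1, 1]
dZ = [[[1, 1], [1, 1]] for _ in range(5)]
dA, dB, dM = pointwise_matmul_backward(A, B, M, dZ)
```

With this data, B[0] is referenced by two slices and B[1] by three, so dB[0] sums two contributions and dB[1] sums three.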
With these two observations, the solution follows quickly.
#!/bin/python
# -*- coding: utf8 -*-
import sys
import os
import re
class Matrix:

    def __init__(self, shape, data: list):
        self.shape = shape
        self.data = data
        self.length = self.shape[0] * self.shape[1]

    def _verify(self, location):
        assert location[0] < self.shape[0] and location[1] < self.shape[1]

    def __getitem__(self, location):
        self._verify(location)
        return self.data[location[0] * self.shape[1] + location[1]]

    def __setitem__(self, location, value: int):
        self._verify(location)
        self.data[location[0] * self.shape[1] + location[1]] = value

    def transpose(self):
        transposed_data = [self.data[i * self.shape[1] + j] for j in range(self.shape[1]) for i in range(self.shape[0])]
        return Matrix((self.shape[1], self.shape[0]), transposed_data)

    def reshape(self, shape):
        self.shape = shape

    @staticmethod
    def matmul(ma, mb):
        assert ma.shape[1] == mb.shape[0]
        mc = Matrix.zeros((ma.shape[0], mb.shape[1]))
        for i in range(mc.shape[0]):
            for j in range(mc.shape[1]):
                mc[i, j] = sum([ma[i, k] * mb[k, j] for k in range(ma.shape[1])])
        return mc

    def __add__(self, other):
        if