Vectorized Operations

Conventions


In this article, "vector" refers to a one-dimensional array/tensor, and "matrix" refers to a two-dimensional array/tensor.

Preface


Vectors and matrices are ubiquitous in ML. Out of habit from writing C, I often think about vector and matrix operations from a scalar point of view, i.e., I implement them with for loops. In practice, the for-loop style is slower than Python's built-in operators or PyTorch's functions, and it is also more cumbersome to write.

Next, I will demonstrate the effect of vectorization with addition and multiplication.
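The snippets below assume that torch and time have already been imported; a minimal setup they rely on:

import time

import torch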

Addition


First, define a for-loop based tensor addition function for later use:

def tensor_add(a, b):
    '''
    Element-wise tensor addition implemented with Python for loops.
    Supports only 1-D and 2-D tensors; broadcasting is not supported.
    '''
    if a.dim() == 1 and b.dim() == 1: # both are vectors
        length = a.size(0)
        c = torch.zeros(length, dtype=a.dtype)
        for i in range(length):
            c[i] = a[i] + b[i]
        return c
    elif a.dim() == 2 and b.dim() == 2: # both are matrices
        rows, cols = a.size()
        c = torch.zeros(rows, cols, dtype=a.dtype)
        for i in range(rows):
            for j in range(cols):
                c[i, j] = a[i, j] + b[i, j]
        return c
    else:
        raise Exception('Unsupported tensor dimensions')
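Before timing it, a quick sanity check that tensor_add agrees with the built-in + on small inputs (the example values here are arbitrary):

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])
# The for-loop version should match the built-in element-wise addition exactly.
assert torch.equal(tensor_add(a, b), a + b)

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)
B = torch.ones(2, 3)
assert torch.equal(tensor_add(A, B), A + B)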

Then compare the performance of the for loop against Python's built-in + operator:

n = 1000
a = torch.ones(n)
b = torch.ones(n)

start_time = time.time()
c = tensor_add(a, b)
end_time = time.time()
print("Time taken(for-loop):", end_time - start_time)

start_time = time.time()
c = a + b
end_time = time.time()
print("Time taken(vectorized):", end_time - start_time)
Time taken(for-loop): 0.005979299545288086
Time taken(vectorized): 0.0001308917999267578

Matrix addition:

A = torch.ones((200, 200))
B = torch.ones((200, 200))

start_time = time.time()
C = tensor_add(A, B)
end_time = time.time()
print("Time taken(for-loop):", end_time - start_time)

start_time = time.time()
C = A + B
end_time = time.time()
print("Time taken(vectorized):", end_time - start_time)
Time taken(for-loop): 0.214155912399292
Time taken(vectorized): 0.0006866455078125

We can see that as the tensors get larger, the gap in time cost between the for loop and the vectorized operation also grows.
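To make the trend more explicit, here is a rough sketch that times both versions over a few matrix sizes (the absolute numbers will vary from machine to machine):

for n in (50, 100, 200, 400):
    X = torch.ones((n, n))
    Y = torch.ones((n, n))

    start_time = time.time()
    _ = tensor_add(X, Y)
    loop_time = time.time() - start_time

    start_time = time.time()
    _ = X + Y
    vec_time = time.time() - start_time

    print(f"n={n}: for-loop {loop_time:.6f}s, vectorized {vec_time:.6f}s")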

Multiplication


For multiplication, we likewise start by defining a function that implements it with for loops:

def tensor_multiply(a, b):
    '''
    Multiplication implemented with Python for loops.
    Supports only 1-D and 2-D tensors.
    For vector-vector, the outer product (not the cross product) is computed.
    For matrix-matrix, standard matrix multiplication is computed.
    Matrix-vector multiplication is not supported.
    '''
    if a.dim() == 1 and b.dim() == 1: # both are vectors
        if a.size(0) != b.size(0):
            raise Exception('Vector dimensions do not match')
        length = a.size(0)
        c = torch.zeros((length, length), dtype=a.dtype)
        for i in range(length):
            for j in range(length):
                c[i, j] = a[i] * b[j]
        return c
    elif a.dim() == 2 and b.dim() == 2: # both are matrices
        if a.size(1) != b.size(0):
            raise Exception('Matrix dimensions do not match')
        rows = a.size(0)
        cols = b.size(1)
        c = torch.zeros(rows, cols, dtype=a.dtype)
        for i in range(rows):
            for j in range(cols):
                for k in range(a.size(1)): # 3 loops for matrix multiplication
                    c[i, j] += a[i, k] * b[k, j]
        return c
    else:
        raise Exception('Unsupported tensor dimensions')

Then, compare the performance of the for-loop and vectorized approaches:

start_time = time.time()
c = tensor_multiply(a, b)
end_time = time.time()
print("Time taken(for-loop):", end_time - start_time)

start_time = time.time()
c = torch.outer(a, b) # outer product is different from cross product https://blog.csdn.net/Dust_Evc/article/details/127502272
end_time = time.time()
print("Time taken(vectorized):", end_time - start_time)
Time taken(for-loop): 3.7471888065338135
Time taken(vectorized): 0.00025653839111328125
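As the comment above points out, the outer product is not the cross product: torch.outer accepts vectors of any length and returns an n-by-n matrix of all pairwise products, while torch.cross is defined only for 3-element vectors. A small illustration with arbitrary values:

u = torch.tensor([1.0, 2.0, 3.0])
v = torch.tensor([4.0, 5.0, 6.0])

print(torch.outer(u, v).shape)   # torch.Size([3, 3]): matrix of all pairwise products u[i] * v[j]
print(torch.cross(u, v, dim=0))  # tensor([-3.,  6., -3.]): a single 3-element vector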

For matrices:

start_time = time.time()
C = tensor_multiply(A, B)
end_time = time.time()
print("Time taken(for-loop):", end_time - start_time)

start_time = time.time()
C = A@B
end_time = time.time()
print("Time taken(vectorized):", end_time - start_time)
Time taken(for-loop): 54.04433226585388
Time taken(vectorized): 0.011148929595947266
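Speed aside, it is worth confirming that the for-loop and vectorized results actually agree. A quick check on smaller inputs (so the for-loop version finishes quickly), using torch.allclose to allow for floating-point rounding differences:

a_small = torch.rand(100)
b_small = torch.rand(100)
assert torch.allclose(tensor_multiply(a_small, b_small), torch.outer(a_small, b_small))

A_small = torch.rand(50, 50)
B_small = torch.rand(50, 50)
assert torch.allclose(tensor_multiply(A_small, B_small), A_small @ B_small)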

Summary


As we can see, using Python's built-in operators or PyTorch functions can dramatically speed up vector and matrix operations.

Resources


The Jupyter notebook file for this post
