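Author: Chris_yg
There is always more to learn; discussion is welcome so we can improve together.
This post covers the 2D convolution formula, its properties, ways to compute it, and Python implementations.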
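1. 2D convolution: formula and properties
In image processing an image is made of discrete pixels, and convolution is typically used to express a weighted sum over the neighborhood of a pixel. The discrete form of 2D convolution is: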
$$g(m,n) = (f * h)(m,n) = \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} f(k,l)\, h(m-k,\, n-l)$$
Convolution satisfies the following properties:
- Commutativity: $f * h = h * f$
- Associativity: $f * (g * h) = (f * g) * h$
- Distributivity: $f * (g + h) = f * g + f * h$
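The three properties can be checked numerically on small random arrays. The sketch below is only an illustration: `conv2d_full` is a hypothetical helper written for this check (it is not part of the original post), and it assumes every array is zero outside its stored extent.

```python
import numpy as np


def conv2d_full(f, h):
    """Full 2D convolution by shift-and-add (hypothetical helper for this check)."""
    M1, N1 = f.shape
    M2, N2 = h.shape
    out = np.zeros((M1 + M2 - 1, N1 + N2 - 1))
    # Each entry h[k, l] contributes a copy of f shifted by (k, l).
    for k in range(M2):
        for l in range(N2):
            out[k:k + M1, l:l + N1] += h[k, l] * f
    return out


rng = np.random.default_rng(0)
f = rng.random((5, 6))
g = rng.random((3, 3))
h = rng.random((3, 3))

commutative = np.allclose(conv2d_full(f, h), conv2d_full(h, f))
associative = np.allclose(conv2d_full(f, conv2d_full(g, h)),
                          conv2d_full(conv2d_full(f, g), h))
distributive = np.allclose(conv2d_full(f, g + h),
                           conv2d_full(f, g) + conv2d_full(f, h))
print(commutative, associative, distributive)
```

The comparisons use `np.allclose` rather than exact equality because the two sides add the same products in a different order, so they can differ by floating-point rounding.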
2. Computing 2D convolution, with Python implementations
(1) Evaluate the definition directly, which takes four nested loops:
Let f have size $(M_1, N_1)$ and h have size $(M_2, N_2)$. Treating f as zero outside its extent, the convolution can be written as:
$$g(m,n) = f * h = h * f = \sum_{k=0}^{M_2-1} \sum_{l=0}^{N_2-1} h(k,l)\, f(m-k,\, n-l)$$
where $0 \le m < M_1 + M_2 - 1$ and $0 \le n < N_1 + N_2 - 1$.
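Evaluating this formula gives the region labelled full in the figure below. In image processing we usually want the same region instead, i.e. an output that keeps the size of the input image.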
import numpy as np


def conv_nested(image, kernel):
    """A naive implementation of convolution filter.

    This is a naive implementation of convolution using 4 nested for-loops.
    This function computes convolution of an image with a kernel and outputs
    the result that has the same shape as the input image.

    Args:
        image: numpy array of shape (Hi, Wi)
        kernel: numpy array of shape (Hk, Wk)

    Returns:
        out: numpy array of shape (Hi, Wi)
    """
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    out = np.zeros((Hi, Wi))
    temp_m = np.zeros((Hi + Hk - 1, Wi + Wk - 1))  # holds the "full" result
    for i in range(Hi + Hk - 1):
        for j in range(Wi + Wk - 1):
            temp = 0
            # The kernel is usually much smaller than the image, and
            # convolution is commutative, so to speed things up we loop
            # over the kernel (h*f) rather than over the image (f*h).
            for m in range(Hk):
                for n in range(Wk):
                    # Terms that fall outside the image count as zero.
                    if 0 <= i - m < Hi and 0 <= j - n < Wi:
                        temp += image[i - m][j - n] * kernel[m][n]
            temp_m[i][j] = temp
    # Crop the "same" region (output size equals input size).
    for i in range(Hi):
        for j in range(Wi):
            out[i][j] = temp_m[i + (Hk - 1) // 2][j + (Wk - 1) // 2]
    return out
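To make the relation between the full and same regions concrete, here is a small sketch. The helper names `full_convolution` and `crop_same` are hypothetical (introduced only for this illustration), and the kernel is an identity kernel with a single 1 at its center, so the same region should reproduce the input exactly.

```python
import numpy as np


def full_convolution(image, kernel):
    """Full-size convolution by shift-and-add (hypothetical helper)."""
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    full = np.zeros((Hi + Hk - 1, Wi + Wk - 1))
    for m in range(Hk):
        for n in range(Wk):
            # kernel[m, n] adds a copy of the image shifted by (m, n).
            full[m:m + Hi, n:n + Wi] += kernel[m, n] * image
    return full


def crop_same(full, image_shape, kernel_shape):
    """Cut the centered region with the input's size out of the full result."""
    Hi, Wi = image_shape
    Hk, Wk = kernel_shape
    top, left = (Hk - 1) // 2, (Wk - 1) // 2
    return full[top:top + Hi, left:left + Wi]


image = np.arange(12, dtype=float).reshape(3, 4)
identity_kernel = np.zeros((3, 3))
identity_kernel[1, 1] = 1.0  # single 1 at the kernel center

full = full_convolution(image, identity_kernel)
same = crop_same(full, image.shape, identity_kernel.shape)
print(full.shape)                   # (5, 6)
print(same.shape)                   # (3, 4)
print(np.array_equal(same, image))  # True
```

The crop offset `(Hk - 1) // 2` is the same one used when extracting the same region in `conv_nested`.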
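(2) Rotate the kernel by 180°, zero-pad the original image, then slide the kernel over it and take the weighted sum:
This is faster than the first method because the two inner loops are replaced by vectorized NumPy operations. The kernel rotation can be done with two flips (one along each of the x and y axes). The code is as follows: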
def zero_pad(image, pad_height, pad_width):
    """Zero-pad an image.

    Ex: a 1x1 image [[1]] with pad_height = 1, pad_width = 2 becomes:

        [[0, 0, 0, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]         of shape (3, 5)

    Args:
        image: numpy array of shape (H, W)
        pad_width: width of the zero padding (left and right padding)
        pad_height: height of the zero padding (bottom and top padding)

    Returns:
        out: numpy array of shape (H+2*pad_height, W+2*pad_width)
    """
    H, W = image.shape
    out = np.zeros((H + 2 * pad_height, W + 2 * pad_width))
    # Copy the image into the center of the zero canvas.
    out[pad_height:pad_height + H, pad_width:pad_width + W] = image
    return out
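As a cross-check, the same padding can also be obtained with NumPy's built-in `np.pad`. This is an alternative to the hand-written function above, not what the original code uses. One difference worth noting: `np.pad` keeps the input dtype, whereas `zero_pad` always returns a float array because it starts from `np.zeros`.

```python
import numpy as np

image = np.array([[1]])
pad_height, pad_width = 1, 2

# ((before, after) for rows, (before, after) for columns)
padded = np.pad(image,
                ((pad_height, pad_height), (pad_width, pad_width)),
                mode="constant", constant_values=0)
print(padded)
# [[0 0 0 0 0]
#  [0 0 1 0 0]
#  [0 0 0 0 0]]
print(padded.shape)  # (3, 5)
```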
def conv_fast(image, kernel):
    """An efficient implementation of convolution filter.

    This function uses element-wise multiplication and np.sum()
    to efficiently compute weighted sum of neighborhood at each
    pixel. It relies on zero_pad, np.flip() and np.sum(), with
    two nested for-loops over the output pixels.

    Args:
        image: numpy array of shape (Hi, Wi)
        kernel: numpy array of shape (Hk, Wk)

    Returns:
        out: numpy array of shape (Hi, Wi)
    """
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    out = np.zeros((Hi, Wi))
    # Pad by half the kernel size so the output keeps the input size.
    pad_height = Hk // 2
    pad_width = Wk // 2
    image_padding = zero_pad(image, pad_height, pad_width)
    # Two flips (rows, then columns) rotate the kernel by 180 degrees.
    kernel_flip = np.flip(np.flip(kernel, 0), 1)
    for i in range(Hi):
        for j in range(Wi):
            # Weighted sum of the Hk x Wk window whose top-left corner is (i, j).
            out[i][j] = np.sum(np.multiply(kernel_flip, image_padding[i:(i + Hk), j:(j + Wk)]))
    return out
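The double flip used for `kernel_flip` can be checked on a small asymmetric kernel. The sketch below compares it with two other standard ways of writing a 180° rotation in NumPy (`np.rot90` with `k=2` and reversed slicing); the example kernel values are arbitrary.

```python
import numpy as np

kernel = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Flip along axis 0 (rows), then along axis 1 (columns).
two_flips = np.flip(np.flip(kernel, 0), 1)
print(two_flips)
# [[9 8 7]
#  [6 5 4]
#  [3 2 1]]

# Equivalent ways to express the same 180-degree rotation.
same_as_rot90 = np.array_equal(two_flips, np.rot90(kernel, 2))
same_as_slicing = np.array_equal(two_flips, kernel[::-1, ::-1])
print(same_as_rot90, same_as_slicing)  # True True
```

If the flip were skipped, sliding the kernel over the image would compute a cross-correlation rather than a convolution; the two coincide only for kernels that are unchanged by a 180° rotation.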
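(3) Using the Fourier transform
This approach relies on the convolution theorem:
$$\mathcal{F}(f * h) = \mathcal{F}(f) \cdot \mathcal{F}(h)$$
$$f * h = \mathcal{F}^{-1}\big(\mathcal{F}(f) \cdot \mathcal{F}(h)\big)$$
where $\mathcal{F}$ denotes the Fourier transform and $\mathcal{F}^{-1}$ the inverse Fourier transform.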
I have not written the code for this yet; interested readers are welcome to implement it themselves.
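As one possible starting point, here is a minimal sketch built on NumPy's `np.fft.fft2` and `np.fft.ifft2`. It is not the author's implementation: the function name `conv_fft` is hypothetical, and it assumes real-valued inputs. Both arrays are zero-padded to the full output size before transforming, because the product of discrete Fourier transforms corresponds to a circular convolution, and the padding is what makes it match the linear convolution used in the rest of this post.

```python
import numpy as np


def conv_fft(image, kernel):
    """'Same'-size 2D convolution via the FFT (hypothetical sketch)."""
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    # Pad to the full output size so circular wrap-around cannot occur.
    shape = (Hi + Hk - 1, Wi + Wk - 1)
    spectrum = np.fft.fft2(image, s=shape) * np.fft.fft2(kernel, s=shape)
    # For real inputs the imaginary part is only rounding noise.
    full = np.real(np.fft.ifft2(spectrum))
    # Crop the centered "same" region.
    top, left = (Hk - 1) // 2, (Wk - 1) // 2
    return full[top:top + Hi, left:left + Wi]


image = np.arange(20, dtype=float).reshape(4, 5)
identity_kernel = np.zeros((3, 3))
identity_kernel[1, 1] = 1.0  # convolving with this should return the image

result = conv_fft(image, identity_kernel)
print(result.shape)                 # (4, 5)
print(np.allclose(result, image))   # True
```

The result is compared with `np.allclose` because the FFT round trip introduces small floating-point errors. For small kernels the sliding-window method may well be faster; the FFT route tends to pay off as the kernel grows.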