Python图像处理100问（1~25）

Rock Nut

已于 2022-12-16 12:42:25 修改

阅读量452

点赞数 1

分类专栏： Python 文章标签： python 图像处理

于 2022-12-16 12:41:34 首次发布

原文链接：https://github.com/gzr2017/ImageProcessing100Wen

版权

Python 专栏收录该内容

4 篇文章 1 订阅

订阅专栏

本文仅用作学习过程记录，转载自：
https://github.com/gzr2017/ImageProcessing100Wen

Q1: 通道交换

将RGB通道转为BGR通道

import numpy as np
import matplotlib.pyplot as plt
import cv2
from skimage import io

def rgb2bgr(img):
    return img[..., ::-1]

img_orig = io.imread('https://yoyoyo-yo.github.io/Gasyori100knock/dataset/images/imori_256x256.png')
img_bgr = rgb2bgr(img_orig)

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.title('input')
plt.imshow(img_orig)
plt.subplot(1, 2, 2)
plt.title('answer')
plt.imshow(img_bgr)
plt.show()

在这里插入图片描述

Q2: 灰度化

根据
$0.2126\ R + 0.7152\ G + 0.0722\ B$

# opencv
img_gray = cv2.cvtColor(img_orig, cv2.COLOR_RGB2GRAY)

# answer
def rgb2gray(img):
    _img = img.copy().astype(np.float32)
    gray = _img[..., 0] * 0.2126 + _img[..., 1] * 0.7152 + _img[..., 2] * 0.0722
    gray = np.clip(gray, 0, 255)
    return gray.astype(np.uint8)

img_gray = rgb2gray(img_orig)
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.title('input')
plt.imshow(img_orig)
plt.subplot(1, 2, 2)
plt.title('answer')
plt.imshow(img_gray, cmap='gray')
plt.show()

在这里插入图片描述

Q3: 二值化

根据
$\begin{cases} 0& (\text{if}\quad y < 128) \ 255& (\text{else}) \end{cases}$

# opencv
img_gray = cv2.cvtColor(img_orig, cv2.COLOR_RGB2GRAY)
th, img_bin = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)

# answer
def binary(img, th):
    _img = img.copy()
    _img = np.minimum(_img // th, 1) * 255
    return _img.astype(np.uint8)

img_gray = cv2.cvtColor(img_orig, cv2.COLOR_RGB2GRAY)
img_bin = binary(img_gray, 127)

plt.figure(figsize=(12, 3))
plt.subplot(1, 3, 1)
plt.title('input')
plt.imshow(img_orig)
plt.subplot(1, 3, 2)
plt.title('gray')
plt.imshow(img_gray, cmap='gray')
plt.subplot(1, 3, 3)
plt.title('answer')
plt.imshow(img_bin, cmap='gray')
plt.show()

在这里插入图片描述

Q4: 大津二值化算法（Otsu’s Method）

使用大津算法来二值化图像吧。

大津算法，也被称作最大类间方差法，是一种可以自动确定二值化中阈值的算法。

从类内方差和类间方差的比值计算得来：
在这里插入图片描述
即：

类内方差： ${S_w}^2=w_0\ {S_0}^2+w_1\ {S_1}^2$
类间方差： ${S_b}^2 = w_0 \ (M_0 - M_t)^2 + w_1\ (M_1 - M_t)^2 = w_0\ w_1\ (M_0 - M_1) ^2$
图像所有像素的方差： ${S_t}^2 = {S_w}^2 + {S_b}^2 = \text{常数}$
根据以上的式子，我们用以下的式子计算分离度 $X$ ：
在这里插入图片描述
也就是说： $\arg\max\limits_{t}\ X=\arg\max\limits_{t}\ {S_b}^2$ 换言之，如果使 ${S_b}^2={w_0}\ {w_1}\ (M_0 - M_1)^2$ 最大，就可以得到最好的二值化阈值 $t$ 。

# opencv
th, img_bin = cv2.threshold(img_gray, 0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
print('threshold >>', th)

def otsu_thresh(img):
    max_vari = -1
    max_th = 0
    for th in range(1, 254):
        m0 = img[img <= th].mean() # mean class 0
        m1 = img[img > th].mean() # mean class 1
        w0 = img[img <= th].size # pixel num class 0
        w1 = img[img > th].size # pixel num class 1
        vari = w0 * w1 / ((w0 + w1) ** 2) * ((m0 - m1) ** 2) # inter class variance
        if vari > max_vari:
            max_th = th
            max_vari = vari
            
    img = binary(img, max_th)
            
    return max_th, img

th, img_bin = otsu_thresh(img_gray)
print('threshold >>', th)

plt.figure(figsize=(12, 3))
plt.subplot(1, 3, 1)
plt.title('input')
plt.imshow(img_orig)
plt.subplot(1, 3, 2)
plt.title('gray')
plt.imshow(img_gray, cmap='gray')
plt.subplot(1, 3, 3)
plt.title('answer')
plt.imshow(img_bin, cmap='gray')
plt.show()

在这里插入图片描述

Q5: HSV变换

将使用 $\text{HSV}$ 表示色彩的图像的色相反转吧！

HSV即使用色相（Hue）、饱和度（Saturation）、明度（Value）来表示色彩的一种方式。
色相：将颜色使用 $0^{\circ}$ 到 $360^{\circ}$ 表示，就是平常所说的颜色名称，如红色、蓝色。色相与数值按下表对应：
红黄绿青色蓝色品红红
饱和度：是指色彩的纯度，饱和度越低则颜色越黯淡（ $0\leq S < 1$ ）；
明度：即颜色的明暗程度。数值越高越接近白色，数值越低越接近黑色（ $0\leq V < 1$ ）；
从 $\text{RGB}$ 色彩表示转换到 $\text{HSV}$ 色彩表示通过以下方式计算：
RGB的取值范围为 $[0, 1]$ ，令： $\text{Max}=\max(R,G,B)\ \text{Min}=\min(R,G,B)$ 色相： $H=\begin{cases} 0&(\text{if}\ \text{Min}=\text{Max})\ 60\ \frac{G-R}{\text{Max}-\text{Min}}+60&(\text{if}\ \text{Min}=B)\ 60\ \frac{B-G}{\text{Max}-\text{Min}}+180&(\text{if}\ \text{Min}=R)\ 60\ \frac{R-B}{\text{Max}-\text{Min}}+300&(\text{if}\ \text{Min}=G) \end{cases}$ 饱和度： $S=\text{Max}-\text{Min}$ 明度： $V=\text{Max}$ 从 $\text{HSV}$ 色彩表示转换到 $\text{RGB}$ 色彩表示通过以下方式计算： $S\ H' = \frac{H}{60}\ X = C\ (1 - |H' \mod 2 - 1|)\ (R,G,B)=(V-C)\ (1,1,1)+\begin{cases} (0, 0, 0)& (\text{if H is undefined})\ (C, X, 0)& (\text{if}\quad 0 \leq H' < 1)\ (X, C, 0)& (\text{if}\quad 1 \leq H' < 2)\ (0, C, X)& (\text{if}\quad 2 \leq H' < 3)\ (0, X, C)& (\text{if}\quad 3 \leq H' < 4)\ (X, 0, C)& (\text{if}\quad 4 \leq H' < 5)\ (C, 0, X)& (\text{if}\quad 5 \leq H' < 6) \end{cases}$ 请将色相反转（色相值加 $180$ ），然后再用 $\text{RGB}$ 色彩空间表示图片。

Q6: 减色处理

将图像的值从 $256^3$ 压缩至 $4^3$ ，即将 $\text{RGB}$ 的值只取 ${32, 96, 160, 224}$
在这里插入图片描述

def color_subtraction(img, div=4):
    th = 256 // div
    return np.clip(img // th * th + th // 2, 0, 255)

img_sub = color_subtraction(img_orig) # color subtract

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.title('input')
plt.imshow(img_orig)
plt.subplot(1, 2, 2)
plt.title('answer')
plt.imshow(img_sub)
plt.show()

在这里插入图片描述

Q7: 平均池化 (Average Pooling)

将图片按照固定大小网格分割，网格内的像素值取网格内所有像素的平均值。

我们将这种把图片使用均等大小网格分割，并求网格内代表值的操作称为池化（Pooling）。

池化操作是卷积神经网络（Convolutional Neural Network）中重要的图像处理方式。平均池化按照下式定义： $v=\frac{1}{|R|}\ \sum\limits_{i=1}^R\ v_i$ 请把大小为 $128\times128$ 的imori.jpg使用 $8\times8$ 的网格做平均池化

def pool_average(img, ksize_h=8, ksize_w=8):
    _img = img.copy().astype(np.float32)
    
    # padding
    h, w = img.shape[:2]
    outer_h = h % ksize_h
    pad_top = outer_h // 2
    pad_bottom = outer_h - pad_top
    outer_w = w % ksize_w
    pad_left = outer_w // 2
    pad_right = outer_w - pad_left
    
    _img = np.pad(_img, [(pad_top, pad_bottom), (pad_left, pad_right), (0, 0)], 'edge')
    out = np.zeros_like(_img)
    
    new_h, new_w = out.shape[:2]
    c = 1 if len(out.shape) == 2 else out.shape[2]
    
    # filtering
    for iy in range(0, new_h, ksize_h):
        for ix in range(0, new_w, ksize_w):
            for ic in range(c):
                out[iy : iy + ksize_h, ix : ix + ksize_w, ic] = _img[iy : iy + ksize_h, ix : ix + ksize_w, ic].mean()
            
    out = out[pad_top : pad_top + h, pad_left : pad_left + w]
    return np.clip(out, 0, 255).astype(np.uint8)

img_pool = pool_average(img_orig) # pooling

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.title('input')
plt.imshow(img_orig)
plt.subplot(1, 2, 2)
plt.title('answer')
plt.imshow(img_pool)
plt.show()

在这里插入图片描述

Q8: 最大池化 (Max Pooling)

网格内的值不取平均值，而是取网格内的最大值进行池化操作

def pool_max(img, ksize_h=8, ksize_w=8):
    _img = img.copy().astype(np.float32)
    
    # padding
    h, w = img.shape[:2]
    outer_h = h % ksize_h
    pad_top = outer_h // 2
    pad_bottom = outer_h - pad_top
    outer_w = w % ksize_w
    pad_left = outer_w // 2
    pad_right = outer_w - pad_left
    
    _img = np.pad(_img, [(pad_top, pad_bottom), (pad_left, pad_right), (0, 0)], 'edge')
    out = np.zeros_like(_img)
    
    new_h, new_w = out.shape[:2]
    c = 1 if len(out.shape) == 2 else out.shape[2]
    
    # filtering
    for iy in range(0, new_h, ksize_h):
        for ix in range(0, new_w, ksize_w):
            for ic in range(c):
                out[iy : iy + ksize_h, ix : ix + ksize_w, ic] = _img[iy : iy + ksize_h, ix : ix + ksize_w, ic].max()
            
    out = out[pad_top : pad_top + h, pad_left : pad_left + w]
    return np.clip(out, 0, 255).astype(np.uint8)

img_pool = pool_max(img_orig) # pooling

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.title('input')
plt.imshow(img_orig)
plt.subplot(1, 2, 2)
plt.title('answer')
plt.imshow(img_pool)
plt.show()

在这里插入图片描述

Q9: 高斯滤波

使用高斯滤波器（ $3\times3$ 大小，标准差 $\sigma=1.3$ ）来对imori_noise.jpg进行降噪处理吧！

高斯滤波器是一种可以使图像平滑的滤波器，用于去除噪声。可用于去除噪声的滤波器还有中值滤波器（参见问题十），平滑滤波器（参见问题十一）、LoG滤波器（参见问题十九）。

高斯滤波器将中心像素周围的像素按照高斯分布加权平均进行平滑化。这样的（二维）权值通常被称为卷积核（kernel）或者滤波器（filter）。

但是，由于图像的长宽可能不是滤波器大小的整数倍，因此我们需要在图像的边缘补 $0$ 。这种方法称作Zero Padding。并且权值 $g$ （卷积核）要进行归一化操作（ $\sum\ g = 1$ ）。

按下面的高斯分布公式计算权值： $g(x,y,\sigma)=\frac{1}{2\ \pi\ \sigma^2}\ e^{-\frac{x^2+y^2}{2\ \sigma^2}}$

标准差 $\sigma=1.3$ 的 $8 -$ 近邻高斯滤波器如下： $K=\frac{1}{16}\ \left[ \begin{matrix} 1 & 2 & 1 \ 2 & 4 & 2 \ 1 & 2 & 1 \end{matrix} \right]$

# OpenCV
img_noise_orig = io.imread('https://yoyoyo-yo.github.io/Gasyori100knock/dataset/images/imori_256x256_noise.png')
img_gau = cv2.GaussianBlur(img_noise_orig, (3,3), 1.3)

def filter_gaussian(img, ksize=(3, 3), sigma=1.3):
    _img = img.copy().astype(np.float32)
    ksize_h, ksize_w = ksize
    
    # padding
    h, w = img.shape[:2]
    pad_top, pad_bottom = ksize_h, ksize_h
    pad_left, pad_right = ksize_w, ksize_w
    
    _img = np.pad(_img, [(pad_top, pad_bottom), (pad_left, pad_right), (0, 0)], 'edge')
    out = np.zeros_like(_img)
    
    new_h, new_w = out.shape[:2]
    c = 1 if len(out.shape) == 2 else out.shape[2]
    
    # prepare kernel
    k = np.zeros([ksize_h, ksize_w])
    for iy in range(ksize_h):
        for ix in range(ksize_w):
            k[iy, ix] = 1 / (2 * np.pi * (sigma ** 2)) * np.exp(- ((ix - ksize_w // 2) ** 2 + (iy - ksize_h // 2) ** 2) / (2 * sigma ** 2))
 
    k /= k.sum()

    # filtering
    for iy in range(new_h - ksize_h):
        for ix in range(new_w - ksize_w):
            for ic in range(c):
                out[iy, ix, ic] = np.sum(_img[iy : iy + ksize_h, ix : ix + ksize_w, ic] * k)
            
    out = out[pad_top : pad_top + h, pad_left : pad_left + w]
    return np.clip(out, 0, 255).astype(np.uint8)

img_gau = filter_gaussian(img_noise_orig, (3, 3), 1.3)
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.title('input')
plt.imshow(img_noise_orig)
plt.subplot(1, 2, 2)
plt.title('answer')
plt.imshow(img_gau)
plt.show()

在这里插入图片描述

Q10: 中值滤波

中值滤波器是一种可以使图像平滑的滤波器。这种滤波器用滤波器范围内（在这里是 $3\times3$ ）像素点的中值进行滤波，请在这里也采用Zero Padding。

# OpenCV
img_med = cv2.medianBlur(img_noise_orig, 3) # median filtering

def filter_median(img, ksize=(3, 3)):
    _img = img.copy().astype(np.float32)
    ksize_h, ksize_w = ksize
    
    # padding
    h, w = img.shape[:2]
    pad_top, pad_bottom = ksize_h, ksize_h
    pad_left, pad_right = ksize_w, ksize_w
    
    _img = np.pad(_img, [(pad_top, pad_bottom), (pad_left, pad_right), (0, 0)], 'edge')
    out = np.zeros_like(_img)
    
    new_h, new_w = out.shape[:2]
    c = 1 if len(out.shape) == 2 else out.shape[2]

    # filtering
    for iy in range(new_h - ksize_h):
        for ix in range(new_w - ksize_w):
            for ic in range(c):
                out[iy, ix, ic] = np.median(_img[iy : iy + ksize_h, ix : ix + ksize_w, ic])
            
    out = out[pad_top : pad_top + h, pad_left : pad_left + w]
    return np.clip(out, 0, 255).astype(np.uint8)

img_med = filter_median(img_noise_orig, (3, 3))
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1)
plt.title('input')
plt.imshow(img_noise_orig)
plt.subplot(1, 2, 2)
plt.title('answer')
plt.imshow(img_med)
plt.show()

在这里插入图片描述

Q11: 均值滤波器

使用 $3\times3$ 的均值滤波器来进行滤波吧！
均值滤波器使用网格内像素的平均值

# opencv
img_smoth = cv2.blur(img_noise, (5, 5)) # smoothing filtering

# answer
def filter_smooth(img, ksize=(3, 3)):
    _img = img.copy().astype(np.float32)
    ksize_h, ksize_w = ksize
    
    # padding
    h, w = img.shape[:2]
    pad_top, pad_bottom = ksize_h, ksize_h
    pad_left, pad_right = ksize_w, ksize_w
    
    _img = np.pad(_img, [(pad_top, pad_bottom), (pad_left, pad_right), (0, 0)], 'edge')
    out = np.zeros_like(_img)
    
    new_h, new_w = out.shape[:2]
    c = 1 if len(out.shape) == 2 else out.shape[2]

    # filtering
    for iy in range(new_h - ksize_h):
        for ix in range(new_w - ksize_w):
            for ic in range(c):
                out[iy, ix, ic] = np.mean(_img[iy : iy + ksize_h, ix : ix + ksize_w, ic])
            
    out = out[pad_top : pad_top + h, pad_left : pad_left + w]
    return np.clip(out, 0, 255).astype(np.uint8)

img_smoth = filter_smooth(img_noise, (5, 5))
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_noise)
plt.subplot(1, 2, 2); plt.title('answer'); plt.imshow(img_smoth)
plt.show()

在这里插入图片描述

Q12: Motion Filter

使用 $3\times3$ 的Motion Filter来进行滤波吧！

Motion Filter取对角线方向的像素的平均值，像下式这样定义： $\left[ \begin{matrix} \frac{1}{3}&0&0\ 0&\frac{1}{3}&0\ 0 & 0& \frac{1}{3} \end{matrix} \right]$

# answer 1
def filter_motion(img, k_size=(3, 3)):
    kernel = np.zeros(k_size) 
    kernel[range(k_size[0]), range(k_size[0])] = 1 / k_size[0]
    return cv2.filter2D(img, -1, kernel) 
  
img_motion = filter_motion(img_orig, k_size=(5, 5))

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_orig)
plt.subplot(1, 2, 2); plt.title('answer'); plt.imshow(img_motion)
plt.show()

在这里插入图片描述

# answer 2
def filter_motion(img, k_size=(3, 3)):
    _img = img.copy().astype(np.float32)
    ksize_h, ksize_w = k_size
    
    # padding
    h, w = img.shape[:2]
    pad_top, pad_bottom = ksize_h, ksize_h
    pad_left, pad_right = ksize_w, ksize_w
    
    _img = np.pad(_img, [(pad_top, pad_bottom), (pad_left, pad_right), (0, 0)], 'edge')
    out = np.zeros_like(_img)
    
    new_h, new_w = out.shape[:2]
    c = 1 if len(out.shape) == 2 else out.shape[2]

    # define kernel
    k = np.zeros(k_size) 
    k[range(k_size[0]), range(k_size[0])] = 1 / k_size[0]

    # filtering
    for iy in range(new_h - ksize_h):
        for ix in range(new_w - ksize_w):
            for ic in range(c):
                out[iy, ix, ic] = np.sum(_img[iy : iy + ksize_h, ix : ix + ksize_w, ic] * k)
            
    out = out[pad_top : pad_top + h, pad_left : pad_left + w]
    return np.clip(out, 0, 255).astype(np.uint8)

img_motion = filter_motion(img_orig, (5, 5))
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_orig)
plt.subplot(1, 2, 2); plt.title('answer'); plt.imshow(img_motion)
plt.show()

在这里插入图片描述

Q13: MAX-MIN滤波器

使用MAX-MIN滤波器来进行滤波吧。

MAX-MIN滤波器使用网格内像素的最大值和最小值的差值对网格内像素重新赋值。通常用于边缘检测。

边缘检测用于检测图像中的线。像这样提取图像中的信息的操作被称为特征提取。

边缘检测通常在灰度图像上进行。

# answer
def filter_max_min(img, ksize=(5, 5)):
    _img = img.copy().astype(np.float32)
    ksize_h, ksize_w = ksize
    
    # padding
    h, w = img.shape[:2]
    pad_top, pad_bottom = ksize_h, ksize_h
    pad_left, pad_right = ksize_w, ksize_w
    
    if len(_img.shape) == 2:
        _img = np.expand_dims(_img, axis=-1)

    _img = np.pad(_img, [(pad_top, pad_bottom), (pad_left, pad_right), (0, 0)], 'edge')
    out = np.zeros_like(_img)
    
    new_h, new_w = out.shape[:2]
    c = 1 if len(out.shape) == 2 else out.shape[2]

    # filtering
    for iy in range(new_h - ksize_h):
        for ix in range(new_w - ksize_w):
            for ic in range(c):
                out[iy, ix, ic] = _img[iy : iy + ksize_h, ix : ix + ksize_w, ic].max() - _img[iy : iy + ksize_h, ix : ix + ksize_w, ic].min()
            
    out = out[pad_top : pad_top + h, pad_left : pad_left + w]
    return np.clip(out, 0, 255).astype(np.uint8)

img_mm = filter_max_min(img_gray, (5, 5))
plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_gray, cmap='gray')
plt.subplot(1, 2, 2); plt.title('answer'); plt.imshow(img_mm[..., 0], cmap='gray')
plt.show()

在这里插入图片描述

Q14: 差分滤波器

使用 $3\times3$ 的差分滤波器来进行滤波吧。

差分滤波器对图像亮度急剧变化的边缘有提取效果，可以获得邻接像素的差值。
在这里插入图片描述

纵向： $K=\left[ \begin{matrix} 0&-1&0\ 0&1&0\ 0&0&0 \end{matrix} \right]$ 横向： $K=\left[ \begin{matrix} 0&0&0\ -1&1&0\ 0&0&0 \end{matrix} \right]$

# answer
def filter_diff(img, mode='y'):
    kernel = np.array([[0, 0, 0], [-1, 1, 0], [0, 0, 0]])
    if mode == 'y':
        kernel = np.array([[0, -1, 0], [0, 1, 0], [0, 0, 0]])
    return cv2.filter2D(img, -1, kernel) 
  
img_diff_x = filter_diff(img_gray)
img_diff_y = filter_diff(img_gray, mode='y')

plt.figure(figsize=(12, 3))
plt.subplot(1, 3, 1); plt.title('input'); plt.imshow(img_orig, cmap='gray')
plt.subplot(1, 3, 2); plt.title('answer x'); plt.imshow(img_diff_x, cmap='gray')
plt.subplot(1, 3, 3); plt.title('answer y'); plt.imshow(img_diff_y, cmap='gray')
plt.show()

在这里插入图片描述

Q15: Sobel滤波器

使用 $3\times3$ 的Sobel滤波器来进行滤波吧。

Sobel滤波器可以提取特定方向（纵向或横向）的边缘，滤波器按下式定义：
在这里插入图片描述

纵向： $K=\left[ \begin{matrix} 1&2&1\ 0&0&0\ -1&-2&-1 \end{matrix} \right]$ 横向： $K=\left[ \begin{matrix} 1&0&-1\ 2&0&-2\ 1&0&-1 \end{matrix} \right]$

# opencv
img_sobel_x = cv2.Sobel(img_gray, cv2.CV_32F, 1, 0, ksize=3)
img_sobel_x -= img_sobel_x.min()  # normalize > [0, 1]
img_sobel_x /= img_sobel_x.max()
img_sobel_y = cv2.Sobel(img_gray, cv2.CV_32F, 0, 1, ksize=3)
img_sobel_y -= img_sobel_y.min()  # normalize > [0, 1]
img_sobel_y /= img_sobel_y.max()

Q16: Prewitt滤波器

使用 $3\times3$ 的Prewitt滤波器来进行滤波吧。

Prewitt滤波器是用于边缘检测的一种滤波器，使用下式定义：
在这里插入图片描述

纵向： $K=\left[ \begin{matrix} -1&-1&-1\ 0&0&0\ 1&1&1 \end{matrix} \right]$ 横向： $K=\left[ \begin{matrix} -1&0&-1\ -1&0&1\ -1&0&1 \end{matrix} \right]$

# answer
def filter_prewitt(img, k_size=(3, 3), mode='x'):
    k = np.zeros(k_size)
    if mode == 'x':
        k[:, 0] = 1
        k[:, -1] = -1
    else:
        k[0] = 1
        k[-1] = -1
    return cv2.filter2D(img, -1, k) 
  
img_prewitt_x = filter_prewitt(img_gray)
img_prewitt_y = filter_prewitt(img_gray, mode='y')

plt.figure(figsize=(12, 3))
plt.subplot(1, 3, 1); plt.title('input'); plt.imshow(img_orig, cmap='gray')
plt.subplot(1, 3, 2); plt.title('answer x'); plt.imshow(img_prewitt_x, cmap='gray')
plt.subplot(1, 3, 3); plt.title('answer y'); plt.imshow(img_prewitt_y, cmap='gray')
plt.show()

在这里插入图片描述

Q17: Laplacian滤波器

使用Laplacian滤波器来进行滤波吧。

Laplacian滤波器是对图像亮度进行二次微分从而检测边缘的滤波器。由于数字图像是离散的， $x$ 方向和 $y$ 方向的一次微分分别按照以下式子计算： $I_x(x,y)=\frac{I(x+1,y)-I(x,y)}{(x+1)-x}=I(x+1,y)-I(x,y)\ I_y(x,y) =\frac{I(x, y+1) - I(x,y)}{(y+1)-y}= I(x, y+1) - I(x,y)$ 因此二次微分按照以下式子计算： $\begin{align*} &I_{xx}(x,y) \ =& \frac{I_x(x,y) - I_x(x-1,y)}{(x+1)-x} \ =& I_x(x,y) - I_x(x-1,y)\ =&[I(x+1, y) - I(x,y)] - [I(x, y) - I(x-1,y)]\ =& I(x+1,y) - 2\ I(x,y) + I(x-1,y) \end{align*}$ 同理： $I_{yy}(x,y)=I(x,y+1)-2\ I(x,y)+I(x,y-1)$ 特此，Laplacian 表达式如下： $\begin{align*} &\nabla^2\ I(x,y)\ =&I_{xx}(x,y)+I_{yy}(x,y)\ =&I(x-1,y) + I(x,y-1) - 4 * I(x,y) + I(x+1,y) + I(x,y+1) \end{align*}$ 如果把这个式子表示为卷积核是下面这样的： $\left[ \begin{matrix} 0&1&0\ 1&-4&1\ 0&1&0 \end{matrix} \right]$

# opencv
img_lapl = cv2.Laplacian(img_gray, cv2.CV_32F)
img_lapl -= img_lapl.min()  # normalize > [0, 1]
img_lapl /= img_lapl.max()

# answer
def filter_laplacian(img):
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])
    out = cv2.filter2D(img.astype(np.float32), -1, k)
    out -= out.min() # normalize > [0, 1]
    out /= out.max() 
    return out

img_lapl = filter_laplacian(img_gray)

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_gray, cmap='gray')
plt.subplot(1, 2, 2); plt.title('answer'); plt.imshow(img_lapl, cmap='gray')
plt.show()

在这里插入图片描述

Q18: Emboss滤波器

使用Emboss滤波器来进行滤波吧。

Emboss滤波器可以使物体轮廓更加清晰，按照以下式子定义： $\left[ \begin{matrix} -2&-1&0\ -1&1&1\ 0&1&2 \end{matrix} \right]$

# answer
def filter_emboss(img):
    k = np.array([[-2, -1, 0], [-1, 1, 1], [0, 1, 2]])
    out = cv2.filter2D(img.astype(np.float32), -1, k)
    out -= out.min() # normalize > [0, 1]
    out /= out.max() 
    return out

img_emboss = filter_emboss(img_gray)

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_gray, cmap='gray')
plt.subplot(1, 2, 2); plt.title('answer'); plt.imshow(img_emboss, cmap='gray')
plt.show()

在这里插入图片描述

Q19: LoG滤波器

使用LoG 滤波器，来对imori_noise.jpg检测边缘吧！

LoG即高斯-拉普拉斯（Laplacian of Gaussian）的缩写，使用高斯滤波器使图像平滑化之后再使用拉普拉斯滤波器使图像的轮廓更加清晰。

为了防止拉普拉斯滤波器计算二次微分会使得图像噪声更加明显，所以我们首先使用高斯滤波器来抑制噪声。

LoG 滤波器使用以下式子定义： $\text{LoG}(x,y)=\frac{x^2 + y^2 - s^2}{2 \ \pi \ s^6} \ e^{-\frac{x^2+y^2}{2\ s^2}}$

# answer
def filter_LoG(img, k_size=(3, 3), sigma=1.2):
    # kernel
    k = np.zeros(k_size, dtype=np.float32)
    pad_x = k_size[0] // 2
    pad_y = k_size[1] // 2
    for x in range(-pad_y, -pad_y + k_size[1]):
        for y in range(-pad_x, -pad_x + k_size[0]):
            k[y + pad_y, x + pad_y] = (x ** 2 + y ** 2 - 2 * (sigma ** 2)) * np.exp( - (x ** 2 + y ** 2) / (2 * (sigma ** 2)))
    k /= (2 * np.pi * (sigma ** 6))
    k /= k.sum()
 
    out = cv2.filter2D(img.astype(np.float32), -1, k)
    out -= out.min() # normalize > [0, 1]
    out /= out.max() 
    return out

img_log = filter_LoG(img_noise, k_size=(5, 5), sigma=3)

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_noise, cmap='gray')
plt.subplot(1, 2, 2); plt.title('answer'); plt.imshow(img_log, cmap='gray')
plt.show()

在这里插入图片描述

Q20: 直方图

使用Matplotlib来绘制imori_dark.jpg的直方图吧！

直方图显示了不同数值的像素出现的次数。在Matplotlib中有hist()函数提供绘制直方图的接口。

plt.hist(img_dark.ravel(), bins=255, rwidth=0.8, range=(0, 255))
plt.show()

在这里插入图片描述

Q21: 直方图归一化

来归一化直方图吧。

有时直方图会偏向一边。

比如说，数据集中在 $0$ 处（左侧）的图像全体会偏暗，数据集中在 $255$ 处（右侧）的图像会偏亮。

如果直方图有所偏向，那么其**动态范围（ dynamic range ）**就会较低。

为了使人能更清楚地看见图片，让直方图归一化、平坦化是十分必要的。

这种归一化直方图的操作被称作灰度变换（Grayscale Transformation）。像素点取值范围从 $[c, d]$ 转换到 $[a, b]$ 的过程由下式定义。这回我们将imori_dark.jpg的灰度扩展到 $[0, 255]$ 范围：
在这里插入图片描述

# answer
def hist_normalize(img, a, b):
    c, d = img.min(), img.max()
    # if c <= xin < d
    out = (b - a) / (d - c) * (img - c) + a
    # if xin < c
    out[img < c] = a
    # if xin > d
    out[img > d] = b
    return np.clip(out, 0, 255).astype(np.uint8)    

img_dark_hist_norm = hist_normalize(img_dark, a=0, b=255) # smoothing filtering

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_dark)
plt.subplot(1, 2, 2); plt.title('hist normalize'); plt.imshow(img_dark_hist_norm)
plt.show()

在这里插入图片描述

# histogram をみると画素が分散した
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1); plt.hist(img_dark.ravel(), bins=255); plt.xlim([0, 255])
plt.subplot(1, 2, 2); plt.hist(img_dark_hist_norm.ravel(), bins=255); plt.xlim([0, 255])
plt.show()

在这里插入图片描述

Q22: 直方图平坦化

让直方图的平均值 $m_0=128$ ，标准差 $s_0=52$ 吧！

这里并不是变更直方图的动态范围，而是让直方图变得平坦。

可以使用下式将平均值为 $m$ 标准差为 $s$ 的直方图变成平均值为 $m_0$ 标准差为 $s_0$ 的直方图： $x_{out}=\frac{s_0}{s}\ (x_{in}-m)+m_0$

# answer
def hist_scaleshift(img, m, s):
    m0, s0 = img.mean(), img.std()
    out = s / s0 * (img - m0) + m
    return np.clip(out, 0, 255).astype(np.uint8)
    
img_dark_hist_scaleshift = hist_scaleshift(img_dark, m=128, s=50) # smoothing filtering

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_dark)
plt.subplot(1, 2, 2); plt.title('hist scale shift'); plt.imshow(img_dark_hist_scaleshift)
plt.show()

在这里插入图片描述

# histogram をみると画素が分散した
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1); plt.hist(img_dark.ravel(), bins=255); plt.xlim([0, 255])
plt.subplot(1, 2, 2); plt.hist(img_dark_hist_norm.ravel(), bins=255); plt.xlim([0, 255])
plt.show()

在这里插入图片描述

Q23: 直方图均衡化

来让均匀化直方图吧！

直方图均衡化是使直方图变得平坦的操作，是不需要计算上面的问题中的平均值、标准差等数据使直方图的值变得均衡的操作。

均衡化操作由以下式子定义。 $S$ 是总的像素数； $Z_{max}$ 是像素点的最大取值（在这里是 $255$ ）； $h (z)$ 表示取值为 $z$ 的累积分布函数： $\frac{Z_{max}}{S} \ \sum\limits_{i=0}^z\ h(i)$

# opencv # opencvのは1チャンネルしか受け付けないらしい
img_dark_hist_equ = cv2.equalizeHist(img_dark_gray)

# answer
def hist_equ(img):
    out = np.zeros_like(img, dtype=np.float32)

    s = img.size
    x_max = 255
    h = 0

    for i in range(256):
        h += (img == i).sum()
        out[img == i] = x_max / s * h

    return np.clip(out, 0, 255).astype(np.uint8)

img_dark_gray_hist_equ = hist_equ(img_dark_gray)

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_dark_gray, cmap='gray')
plt.subplot(1, 2, 2); plt.title('answer'); plt.imshow(img_dark_gray_hist_equ, cmap='gray')
plt.show()

对于RGB图像

img_dark_hist_equ = hist_equ_rgb(img_dark)

plt.figure(figsize=(12, 3))
plt.subplot(1, 2, 1); plt.title('input'); plt.imshow(img_dark)
plt.subplot(1, 2, 2); plt.title('hist equalize'); plt.imshow(img_dark_hist_equ)
plt.show()

在这里插入图片描述

# histogram をみると画素が分散した
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1); plt.hist(img_dark.ravel(), bins=255); plt.xlim([0, 255])
plt.subplot(1, 2, 2); plt.hist(img_dark_hist_equ.ravel(), bins=255); plt.xlim([0, 255])
plt.show()

在这里插入图片描述

Q24: 伽马校正

对imori_gamma.jpg进行伽马校正（ $c = 1$ ， $g = 2.2$ ）吧！

伽马校正用来对照相机等电子设备传感器的非线性光电转换特性进行校正。如果图像原样显示在显示器等上，画面就会显得很暗。伽马校正通过预先增大 RGB 的值来排除显示器的影响，达到对图像修正的目的。

由于下式引起非线性变换，在该式中， $x$ 被归一化，限定在 $[0, 1]$ 范围内。 $c$ 是常数， $g$ 为伽马变量（通常取 $2.2$ ）： $x' = c\ {I_{in}}^ g$ 因此，使用下面的式子进行伽马校正： $I_{out} ={\frac{1}{c}\ I_{in}} ^ {\frac{1}{g}}$

# answer 
def gamma_corr(img, c, g):
    out = (1 / c * (img / 255)) ** (1 / g)
    return np.clip(out * 255, 0, 255).astype(np.uint8)
    
# RGBだと画像がめっちゃきれいになる
img_gamma_corr = gamma_corr(img_gamma, c=1, g=2.2)

plt.figure(figsize=(12, 3))
plt.subplot(1, 3, 1); plt.title('input'); plt.imshow(img_gamma)
plt.subplot(1, 3, 2); plt.title('gamma correction'); plt.imshow(img_gamma_corr)
plt.subplot(1, 3, 3); plt.title('orig'); plt.imshow(img_orig)
plt.show()

在这里插入图片描述

Q25: 最近邻插值

使用最邻近插值将图像放大 $1.5$ 倍吧！

最近邻插值在图像放大时补充的像素取最临近的像素的值。由于方法简单，所以处理速度很快，但是放大图像画质劣化明显。

使用下面的公式放大图像吧！ $I^{'}$ 为放大后图像， $I$ 为放大前图像， $a$ 为放大率，方括号是四舍五入取整操作：

# opencv
img_nn = cv2.resize(img_orig, (int(img_orig.shape[1] * 1.5), int(img_orig.shape[0] * 1.5)), 
                    interpolation=cv2.INTER_NEAREST)

# answer
def nn_inter(img, a, b):
    out_h = int(img.shape[0] * a)
    out_w = int(img.shape[1] * b)
    out = np.zeros([out_h, out_w, img.shape[2]], dtype=np.uint8)

    xs, ys = np.meshgrid(range(out_h), range(out_w))
    out[ys, xs] = img[np.round(ys / a).astype(int), np.round(xs / b).astype(int)]
    return out