Programming Computer Vision with Python, Chapter 1

asdaasddsa

Last modified: 2024-08-07 16:17:55

Tags: computer vision, artificial intelligence

First published: 2024-08-04 18:31:46

Article link: https://blog.csdn.net/zxsdss/article/details/140717497

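I. PIL

PIL (the Python Imaging Library) provides general image handling and many useful basic image operations. Its most important module is Image. The first step is to open an image file with PIL.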

from PIL import Image
from matplotlib import pyplot as plt
img = Image.open(r'D:\test\pil.png')
img2 = Image.open(r'D:\test\pil.png').convert('L')
# img2.show()  # display the image in the default viewer
plt.subplot(121), plt.imshow(img), plt.title('Original image'), plt.axis('off')
plt.subplot(122), plt.imshow(img2, plt.cm.gray), plt.title('Grayscale image'), plt.axis('off')
plt.show()

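PIL has to be installed first, using the command below. In current Python versions PIL is provided by the Pillow package, so the package to install is pillow. In addition, the pyplot module of matplotlib is used to show the two images side by side in one figure.

pip install pillow -i https://pypi.tuna.tsinghua.edu.cn/simple

When saving, PIL determines the image format from the file extension.

Some commonly used PIL methods and what they do:

open(): opens an image file and creates a PIL image object

save(): saves the image to a file

convert(): converts the image to another color mode, for example 'L' for grayscale

thumbnail(): takes a tuple and shrinks the image in place so that it fits within the given size

crop(): cuts a region, given as a (left, upper, right, lower) box, out of an image

resize(): returns a copy of the image with the given size

rotate(): returns a copy of the image rotated counterclockwise by the given angle in degrees

The following experiment tries these methods (PIL is installed as described above):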

from PIL import Image
from matplotlib import pyplot as plt
img = Image.open(r'D:\test\pil.png')  # open the image
img1 = Image.open(r'D:\test\pil.png').convert('L')  # convert to grayscale
img2 = Image.open(r'D:\test\pil.png')
img2.thumbnail((50, 100))  # shrink in place to fit within 50x100 pixels
img3 = img.crop((0, 0, 100, 100))  # crop the region (left, upper, right, lower)
img4 = img.rotate(45)  # rotate 45 degrees counterclockwise
img5 = img.resize((512, 512))  # resize to 512x512
# img2.show()  # display the image in the default viewer
plt.subplot(231), plt.imshow(img), plt.title('Original image'), plt.axis('off')
plt.subplot(232), plt.imshow(img1, plt.cm.gray), plt.title('Grayscale image'), plt.axis('off')
plt.subplot(233), plt.imshow(img2), plt.title('Thumbnail'), plt.axis('off')
plt.subplot(234), plt.imshow(img3), plt.title('Cropped image'), plt.axis('off')
plt.subplot(235), plt.imshow(img4), plt.title('Rotated image'), plt.axis('off')
plt.subplot(236), plt.imshow(img5), plt.title('Resized image'), plt.axis('off')
plt.show()

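An error occurred the first time this code was run. The cause was writing img2 = img.thumbnail((50, 100)): thumbnail() modifies the image in place and returns None, so img2 was None. Matplotlib's imshow function expects image data that can be converted to floating point, but it received data of dtype object and raised an error. The fix is to call thumbnail((50, 100)) on the image as a statement of its own and then display that image.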
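The pitfall described above can be reproduced without any image file. The following is a minimal sketch, assuming a blank 200x200 test image created with Image.new (the size and color are arbitrary choices for illustration). It shows that thumbnail() returns None and changes the image in place, while resize() returns a new image and leaves the original alone.

```python
from PIL import Image

# Hypothetical test image; the size and color are arbitrary.
img = Image.new('RGB', (200, 200), 'white')

# thumbnail() shrinks the image in place and returns None.
work = img.copy()                      # work on a copy to keep the original
returned = work.thumbnail((50, 100))
print(returned)                        # the return value of thumbnail()
print(work.size)                       # fits inside 50x100, aspect ratio kept

# resize() returns a new image of exactly the requested size.
resized = img.resize((50, 100))
print(img.size, resized.size)
```

Because thumbnail() keeps the aspect ratio, the square test image ends up square, whereas resize() stretches it to the requested size.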

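II. Matplotlib

Matplotlib is a good library for doing mathematics, plotting graphs, and drawing points, lines, and curves on images. Its plotting capabilities are more powerful than those of PIL. It is installed in the same way as PIL, using either pip or conda.

Most computer vision tasks need only a few plotting commands. What matters most is being able to mark things on an image using points and lines. For example: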

from PIL import Image
from pylab import *

img = array(Image.open(r'D:\test\pil.png'))

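imshow(img)  # display the image
# coordinates of the points
x = [100, 100, 400, 400]
y = [200, 500, 200, 500]

plot(x, y, 'r*')  # plot the points with red star markers
axis('off')  # hide the axes
plot(x[:2], y[:2])  # draw a line connecting the first two points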

title('plotting:"pil.png"') 
show()

Image contours and histograms

The experiment code is:

from PIL import Image
import numpy as np
from pylab import plt

img = Image.open(r'D:\test\pil.png')
img = img.convert('L')
img = np.array(img)

plt.figure(figsize=(10, 5))

plt.subplot(121)
plt.gray()
plt.contour(img, origin='image')
plt.title('contour Image')
plt.axis('off')

plt.subplot(122)
plt.hist(img.flatten(),128)
plt.title('Histogram')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency')
plt.xlim([0, 256])
plt.show()

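The code reads the image with PIL and converts it to grayscale, then uses the contour and hist functions to draw the contour plot and the histogram of the image. contour draws the iso-contours (level lines) of the image, and hist takes a one-dimensional array, which is why the image is flattened first.

Interactive annotation

The experiment code is: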

from PIL import Image
import numpy as np
from pylab import plt
img = Image.open(r'D:\test\pil.png')
img = np.array(img)
plt.imshow(img)
print('Please click 2 points')
x = plt.ginput(2)
print('You clicked:', x)
plt.show()

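This experiment uses ginput for interactive annotation: the coordinates of the points clicked in the figure are printed in the console window.

III. NumPy

NumPy provides array objects and linear algebra functions. It supports matrix products, transposition, solving systems of equations, vector products, and normalization, which form the basis for image warping, modeling variations, image classification, and image clustering. It is installed with pip in the same way as above.

1. Array image representation

NumPy arrays are multi-dimensional and can represent vectors, matrices, and images. All elements of an array must have the same data type. Unless a type is specified when the array is created, it is set automatically from the data.

The following code prints the shape and data type of an image array: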

from PIL import Image
import numpy as np
from pylab import plt
img = Image.open(r'D:\test\pil.png')
img1 = img.convert('L')
img = np.array(img)
img1 = np.array(img1,'f')
print(img.shape, img.dtype)
print(img1.shape, img1.dtype)

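In each line of output, the first tuple is the shape of the image array (rows, columns, color channels) and the following string is the data type of the array elements. After conversion to grayscale the shape tuple has only two values, because there is no color information. The grayscale array was created with the extra 'f' argument, so its elements are floating point rather than the unsigned 8-bit integers that images are usually loaded as.

Array elements are accessed by index, and several elements at once can be accessed with slicing, which returns a view of the array over the given index ranges and step. Negative indices count from the end of the array. For a grayscale image, slices can select single rows or columns, rectangular regions, or every n-th pixel.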
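A few slicing patterns on a grayscale array can be sketched as follows. The 6x8 array is a hypothetical stand-in for an image, chosen so that the values are easy to follow by hand.

```python
import numpy as np

# Hypothetical 6x8 "grayscale image" holding the values 0..47.
img = np.arange(48, dtype=np.uint8).reshape(6, 8)

pixel = img[2, 3]            # single pixel at row 2, column 3
row = img[1, :].copy()       # independent copy of the second row
img[:, 0] = 0                # set every value in the first column to 0
block = img[:3, :4]          # first three rows and first four columns (a view)
last_row = img[-1]           # negative index: the last row
subsampled = img[::2, ::2]   # every second row and every second column

print(pixel)
print(block.sum())
print(subsampled.shape)
```

Since basic slices are views, writing into block would also change img. Use copy() when an independent array is needed.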

2. Graylevel transforms

The experiment code is:

from PIL import Image
from matplotlib import pyplot as plt
import numpy as np

img = Image.open(r'D:\test\pil.png')
img = img.convert('L')
img = np.array(img)

img1 = 255 - img 
img2 = (100.0/255)*img + 100 
img3 = 255.0*(img/255.0)**2

plt.subplot(141), plt.imshow(img, plt.cm.gray), plt.title('Original grayscale image'), plt.axis('off')
plt.subplot(142), plt.imshow(img1, plt.cm.gray), plt.title('Inverted'), plt.axis('off')
plt.subplot(143), plt.imshow(img2, plt.cm.gray), plt.title('Mapped to 100...200'), plt.axis('off')
plt.subplot(144), plt.imshow(img3, plt.cm.gray), plt.title('Squared'), plt.axis('off')
plt.show()

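The three transforms are: inverting the image, mapping the intensities into the interval 100 to 200, and applying a quadratic function that lowers the values of the darker pixels. Comparing the minimum and maximum pixel values of the original image and of each transformed image confirms the effect of each transform. The reverse of the array() conversion is done with PIL's fromarray() function.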
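The minimum and maximum values, and the conversion back to a PIL image, can be checked with a short sketch. The 2x3 array below is a hypothetical test image covering the full 0 to 255 range. The transforms produce float arrays, so the sketch casts to uint8 before calling fromarray().

```python
import numpy as np
from PIL import Image

# Hypothetical 2x3 grayscale image covering the full 0..255 range.
img = np.array([[0, 64, 128], [192, 224, 255]], dtype=np.uint8)

inverted = 255 - img                    # invert the image
mapped = (100.0 / 255) * img + 100      # map intensities into 100..200
squared = 255.0 * (img / 255.0) ** 2    # quadratic transform

for name, arr in [('original', img), ('inverted', inverted),
                  ('mapped', mapped), ('squared', squared)]:
    print(name, arr.min(), arr.max())

# Back to a PIL image: cast the float array to uint8 first.
pil_img = Image.fromarray(np.uint8(mapped))
print(pil_img.mode, pil_img.size)
```

Note that PIL reports size as (width, height), the reverse of the (rows, columns) order of the array shape.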

3. Image resizing and histogram equalization

Image resizing is done mainly with the resize method of PIL.
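Because NumPy arrays have no resize-as-image operation of their own, one option is a small helper that goes through PIL. The helper name imresize and the test array below are assumptions for illustration.

```python
import numpy as np
from PIL import Image

def imresize(im, sz):
    """Resize an image array using PIL; sz is (width, height)."""
    pil_im = Image.fromarray(np.uint8(im))
    return np.array(pil_im.resize(sz))

# Hypothetical grayscale test array with 40 rows and 60 columns.
im = np.zeros((40, 60), dtype=np.uint8)
small = imresize(im, (30, 20))   # width 30, height 20
print(small.shape)               # array shape is (rows, columns)
```

The size tuple passed to resize() is (width, height), while the shape of the returned array is (rows, columns), so the two appear in opposite order.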

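Histogram equalization is a useful example of a graylevel transform. It flattens the graylevel histogram of an image so that all intensities are as equally common as possible. This normalizes the image intensities and increases the contrast. The transform function is the cumulative distribution function (cdf) of the pixel values in the image, normalized to map the range of pixel values to the desired range. The implementation is: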

from PIL import Image
from matplotlib import pyplot as plt
import numpy as np

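def histeq(img, nbr_bins=256):
    # histogram of the flattened image; density=True replaces the removed normed=True
    imhist, bins = np.histogram(img.flatten(), nbr_bins, density=True)
    cdf = imhist.cumsum()  # cumulative distribution function
    cdf = 255 * cdf / cdf[-1]  # normalize to the range 0..255
    # map each pixel value through the cdf by linear interpolation
    img_eq = np.interp(img.flatten(), bins[:-1], cdf)
    return img_eq.reshape(img.shape), cdf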

img = Image.open(r'D:\test\pil.png')
img = img.convert('L')
img1 = np.array(img)
img_eq,cdf = histeq(img1)


plt.subplot(221), plt.imshow(img, plt.cm.gray), plt.title('Original grayscale image'), plt.axis('off')
plt.subplot(222), plt.imshow(img_eq, plt.cm.gray), plt.title('Equalized image'), plt.axis('off')
plt.subplot(223), plt.hist(img1.flatten(),128), plt.title('Histogram of the original image'), plt.xlabel('Pixel Intensity'), plt.ylabel('Frequency'), plt.xlim([0, 256])
plt.subplot(224), plt.hist(img_eq.flatten(),128), plt.title('Histogram of the equalized image'), plt.xlabel('Pixel Intensity'), plt.ylabel('Frequency'), plt.xlim([0, 256])
plt.show()

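A problem came up during implementation: flatten() is a method of NumPy arrays, not of PIL images, so calling it on the image object fails. The image has to be converted to an array with np.array() before flatten() is called.

4. Averaging images

Averaging images reduces image noise and is also often used for artistic effects. The average image is computed by simply summing the images and dividing by their number, which assumes that all images have the same size.

The implementation is: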

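import numpy as np
from PIL import Image

def compute_average(imgs):
    # open the first image as a float array so the sum does not overflow
    averageimg = np.array(Image.open(imgs[0]), 'f')
    for imgname in imgs[1:]:
        try:
            averageimg += np.array(Image.open(imgname))
        except (OSError, ValueError):
            print(imgname + ' ...skipped')  # unreadable file or size mismatch
    averageimg /= len(imgs)
    return np.array(averageimg, 'uint8')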

The average image can also be computed with the mean() function, which requires all images to be stacked into a single array.
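The mean() variant can be sketched as follows, using three hypothetical constant 2x2 arrays in place of real images so that the result is easy to verify.

```python
import numpy as np

# Three hypothetical 2x2 "images" of identical size.
imgs = [np.full((2, 2), v, dtype=np.uint8) for v in (10, 20, 60)]

# Stack into one array of shape (number of images, rows, columns).
# Converting to float keeps the intermediate values from overflowing uint8.
stack = np.stack(imgs).astype('f')
average = stack.mean(axis=0)       # average over the image axis
average_img = np.uint8(average)    # back to 8-bit for display or saving

print(stack.shape)
print(average_img)
```

Stacking needs memory for all images at once, so for a large collection the running sum in compute_average may be the more practical choice.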

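5. Principal component analysis (PCA) of images

PCA is a dimensionality reduction technique that represents the variability of the training data with as few dimensions as possible. The projection matrix produced by PCA can be seen as a change of coordinates to a coordinate system in which the coordinates are in descending order of importance. Each image is first converted to a one-dimensional vector with flatten(), and the vectors are stacked as the rows of a matrix that is centered by subtracting the mean row. The principal components are usually computed with singular value decomposition (SVD). When the dimensionality is much larger than the number of images, SVD is expensive, and the eigenvectors of the much smaller matrix $XX^T$ are used instead.

The PCA code is: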

import os
import numpy as np
from scipy import linalg
from PIL import Image
from matplotlib import pyplot as plt

def get_imlist(path):

    return [os.path.join(path,f) for f in os.listdir(path) if f.endswith('.jpg')]
def pca(X):
    num_data,dim = X.shape
    mean_X = X.mean(axis=0)
    X = X - mean_X
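    if dim>num_data:
        # compact trick: eigen-decompose the small num_data x num_data matrix
        M = np.dot(X,X.T)
        e,EV = linalg.eigh(M)
        tmp = np.dot(X.T,EV).T  # transpose so that each row is one component
        V = tmp[::-1]  # eigh returns ascending eigenvalues, so reverse the order
        S = np.sqrt(e)[::-1]
        for i in range(V.shape[1]):
            V[:,i] /= S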
    else:
        U,S,V = linalg.svd(X)
        V = V[:num_data]
    return V,S,mean_X

imglist = get_imlist('D:/BaiduNetdiskDownload/a_selected_thumbs/')
img = np.array(Image.open(imglist[0]))
m,n = img.shape[0:2]  # rows and columns; PIL's img.size would give (width, height)
imgnbr = len(imglist)
imgmatrix = np.array([np.array(Image.open(name)).flatten() for name in imglist], 'f')

V,S,mean_X = pca(imgmatrix)

plt.figure()
plt.gray()
plt.subplot(2,4,1)
plt.axis('off')
plt.imshow(mean_X.reshape(m,n))
for i in range(7):
    plt.subplot(2,4,i+2)
    plt.imshow(V[i].reshape(m,n))
    plt.axis('off')
plt.show()

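6. The pickle module

The pickle module is used to save results or data for later use. It can take almost any Python object and convert it to a byte-string representation, a process called pickling. Reconstructing the object from that representation is called unpickling. The most basic usage is: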

import pandas as pd  
import pickle  
import numpy as np


path = r'D:\idmDownload\glove.6B\1234.csv'
df = pd.read_csv(path)
data_list = df.values.tolist()
data = np.array(data_list)

f = open(r'D:\test\test.pkl', 'wb')
pickle.dump(data, f)
f.close()

The code above stores the contents of a CSV file in test.pkl. The next experiment reads the .pkl file back:

import pandas as pd  
import pickle  
import numpy as np


path = r'D:\idmDownload\glove.6B\1234.csv'
df = pd.read_csv(path)
data_list = df.values.tolist()

data = np.array(data_list)

f = open(r'D:\test\test.pkl', 'rb')
data1 = pickle.load(f)
print(data1)
f.close()

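To read the file, the mode passed to open() is changed from 'wb' to 'rb'. File reading and writing is usually done with a with statement, which closes the file automatically even if an error occurs. The file handling in the two examples above then becomes: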

with open(r'D:\test\test.pkl', 'wb') as file:  
    pickle.dump(data, file)

with open(r'D:\test\test.pkl', 'rb') as file:  
    data1 = pickle.load(file)

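NumPy also has simple functions for reading and writing text files, which work well when the data has no complicated structure. An array x is saved to a file with savetxt() and read back with loadtxt().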
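A round trip through savetxt() and loadtxt() can be sketched as follows. The array contents, the file name test.txt, and the use of a temporary directory are assumptions made so that the example runs anywhere.

```python
import os
import tempfile
import numpy as np

x = np.array([[1.5, 2.0, 3.25], [4.0, 5.5, 6.0]])

# Use a temporary directory that is deleted when the block ends.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test.txt')
    np.savetxt(path, x, fmt='%.4f')   # one array row per line, 4 decimals
    y = np.loadtxt(path)              # read the values back as floats

print(np.allclose(x, y))
```

The fmt argument controls how each value is written. With too few decimals the values read back will differ from the originals.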

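IV. SciPy

SciPy is built on top of NumPy and provides routines for numerical computation.

1. Blurring images

Gaussian blurring is a classic example of image convolution: the grayscale image $I$ is convolved with a Gaussian kernel $G_\sigma$ with standard deviation $\sigma$ to give the blurred image $I_\sigma = I * G_\sigma$. Gaussian blurring is also a building block of other image processing operations, such as interpolation and interest point computation.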
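To make the convolution concrete, the following is a minimal NumPy-only sketch of a separable Gaussian blur. It is an illustration under stated assumptions, not SciPy's implementation: the helper names gaussian_kernel and blur, the kernel radius, and the zero padding at the borders are all choices made for this example.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian sampled at integer offsets -radius..radius."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def blur(img, sigma, radius):
    """Separable blur: convolve every row, then every column."""
    k = gaussian_kernel(sigma, radius)
    rows = np.apply_along_axis(np.convolve, 1, img, k, mode='same')
    return np.apply_along_axis(np.convolve, 0, rows, k, mode='same')

# Hypothetical test image: a single bright pixel in the center.
img = np.zeros((15, 15))
img[7, 7] = 1.0
out = blur(img, sigma=1.5, radius=4)

print(out.sum())         # total brightness after blurring
print(out[7, 7] < 1.0)   # the peak has been spread over its neighbors
```

Filtering rows and then columns with a 1-D kernel gives the same result as one 2-D Gaussian convolution at a much lower cost, which is the separation the SciPy filters rely on.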

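SciPy provides filtering in the scipy.ndimage.filters module, which computes these convolutions using a fast 1D separation. In recent SciPy releases the filters submodule is deprecated and the same functions can be imported directly from scipy.ndimage. The experiment code is: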

from PIL import Image
from numpy import *
from scipy.ndimage import filters
from matplotlib import pyplot as plt

img = array(Image.open(r'D:\test\pil.png').convert('L'))
img1 = filters.gaussian_filter(img, 5)

plt.subplot(121), plt.imshow(img, plt.cm.gray), plt.title('Original grayscale image'), plt.axis('off')
plt.subplot(122), plt.imshow(img1, plt.cm.gray), plt.title('Blurred image'), plt.axis('off')

plt.show()

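The second argument of gaussian_filter() is the standard deviation, which controls the amount of blur. To blur a color image, apply the Gaussian blur to each color channel. The code above can be changed to: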

img = array(Image.open(r'D:\test\pil.png').convert('RGB'))  # force exactly three channels
img2 = zeros(img.shape)
for i in range(3):
    # blur each color channel of the original image separately
    img2[:, :, i] = filters.gaussian_filter(img[:, :, i], 5)
img2 = uint8(img2)

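2. Image derivatives

Changes in image intensity are described by the x and y derivatives $I_x$ and $I_y$ of the grayscale image $I$. The image gradient is the vector $\nabla I = \left[ I_x, I_y \right]^T$. It has two important properties: the gradient magnitude $|\nabla I| = \sqrt{I_x^2 + I_y^2}$, which describes how strong the intensity change is, and the gradient angle $\alpha = \operatorname{arctan2}(I_y, I_x)$, which gives the direction of the largest intensity change at each pixel. The derivatives can be computed by convolution with discrete approximations. Two common choices are the Prewitt filter and the Sobel filter, both available through the standard convolution functions of the scipy.ndimage.filters module. The code is: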

from PIL import Image
from numpy import *
from scipy.ndimage import filters
from matplotlib import pyplot as plt

img = Image.open(r'D:\test\pil.png')
img1 = array(img.convert('L'))
imgx = zeros(img1.shape)
filters.sobel(img1, 1, imgx)
imgy = zeros(img1.shape)
filters.sobel(img1, 0, imgy)
magnitude = sqrt(imgx ** 2 + imgy ** 2)

plt.subplot(141), plt.imshow(img1, plt.cm.gray), plt.title('Original grayscale image'), plt.axis('off')
plt.subplot(142), plt.imshow(imgx, plt.cm.gray), plt.title('x derivative'), plt.axis('off')
plt.subplot(143), plt.imshow(imgy, plt.cm.gray), plt.title('y derivative'), plt.axis('off')
plt.subplot(144), plt.imshow(magnitude, plt.cm.gray), plt.title('Gradient magnitude'), plt.axis('off')

plt.show()

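The drawback of this approach is that the derivatives are taken at the scale determined by the image resolution. Gaussian derivative filters are more robust to image noise and can compute derivatives at any scale. Only the code that computes the x and y derivatives has to be changed: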

sigma = 5
imgx = zeros(img1.shape)
filters.gaussian_filter(img1, (sigma,sigma), (0,1), imgx)
imgy = zeros(img1.shape)
filters.gaussian_filter(img1, (sigma,sigma), (1,0), imgy)
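The gradient magnitude and angle described in this section can be checked on a synthetic edge. This is a sketch under stated assumptions: the 8x8 test image is hypothetical, and sobel is imported from scipy.ndimage directly rather than from the filters submodule.

```python
import numpy as np
from scipy import ndimage

# Hypothetical test image: dark left half, bright right half (a vertical edge).
img = np.zeros((8, 8))
img[:, 4:] = 1.0

imgx = ndimage.sobel(img, axis=1)   # derivative along x (across columns)
imgy = ndimage.sobel(img, axis=0)   # derivative along y (across rows)

magnitude = np.sqrt(imgx ** 2 + imgy ** 2)   # strength of the intensity change
angle = np.arctan2(imgy, imgx)               # direction of the largest change

print(magnitude.max())       # response at the edge
print(np.abs(imgy).max())    # no change in the vertical direction
print(angle[4, 4])           # the gradient at the edge is horizontal
```

The image only changes from left to right, so the y derivative vanishes everywhere and the gradient at the edge points along the x axis.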

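3. Morphology: counting objects

Morphology is a framework and a collection of image processing methods for measuring and analyzing basic shapes. It is usually applied to binary images, in which each pixel takes only two values, usually 0 and 1. Binary images are often the result of thresholding an image, for example when counting objects or measuring their size.

The morphology module of scipy.ndimage implements the morphological operations, and its measurements module provides counting and measurement functions for binary images. In recent SciPy releases both submodules are deprecated and the same functions can be imported directly from scipy.ndimage. The experiment code is: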

from matplotlib import pyplot as plt
from scipy.ndimage import measurements,morphology
from PIL import Image
from numpy import *

img = Image.open(r'D:\test\pil.png')
img = img.convert('L')
img = array(img)
img = 1*(img<128)
label,num = measurements.label(img)

img_open = morphology.binary_opening(img,ones((9,5)),iterations=2)
labels_open, num_open = measurements.label(img_open)
plt.subplot(221), plt.imshow(img, plt.cm.gray), plt.title('Binary image'), plt.axis('off')
plt.subplot(222), plt.imshow(label, plt.cm.gray), plt.title('Label array'), plt.axis('off')
plt.subplot(223), plt.imshow(img_open, plt.cm.gray), plt.title('Image after opening'), plt.axis('off')
plt.subplot(224), plt.imshow(labels_open, plt.cm.gray), plt.title('Label array after opening'), plt.axis('off')
plt.show()

print("Number of objects before opening:", num)
print("Number of objects after opening:", num_open)

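The code thresholds the image to make sure it is binary, then uses label() to find the individual objects and assign an integer label to every pixel according to the object it belongs to. binary_opening() performs the morphological opening. Its second argument is the structuring element: ones((9,5)) uses 9 pixels in the y direction and 5 pixels in the x direction, and iterations sets how many times the operation is applied.

binary_closing() performs the morphological closing in the same way.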
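The effect of opening on the object count can be seen on a small synthetic image. This is a minimal sketch: the image with two squares and a thin bridge is hypothetical, and label and binary_opening are imported from scipy.ndimage directly.

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary image: two 6x6 squares joined by a one-pixel-wide bridge.
img = np.zeros((12, 20), dtype=int)
img[3:9, 2:8] = 1
img[3:9, 12:18] = 1
img[5, 8:12] = 1

labels, num = ndimage.label(img)

# Opening (erosion followed by dilation) with a 3x3 structuring element
# removes the thin bridge and restores the two squares.
opened = ndimage.binary_opening(img, structure=np.ones((3, 3)))
labels_open, num_open = ndimage.label(opened)

print('objects before opening:', num)
print('objects after opening:', num_open)
```

The bridge is narrower than the structuring element, so erosion deletes it and dilation cannot bring it back. This is why opening is useful for separating objects that touch only slightly.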

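4. Other modules

The io module provides savemat() and loadmat() for writing and reading .mat files. In older SciPy versions the misc module provided imsave() for saving an array as an image file and the lena test image; both have been removed from recent SciPy releases.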
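Saving and loading a .mat file can be sketched as follows. The variable name x, the file name test.mat, and the temporary directory are assumptions chosen for illustration.

```python
import os
import tempfile
import numpy as np
from scipy import io

x = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test.mat')
    io.savemat(path, {'x': x})   # dictionary keys become MATLAB variable names
    data = io.loadmat(path)      # dictionary mapping variable names to arrays

print(data['x'])
```

loadmat() returns a dictionary that also contains a few metadata entries whose names start with double underscores, so the saved arrays are looked up by their variable names.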

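5. Image denoising

Image denoising removes image noise while preserving as much image detail and structure as possible. The Rudin-Osher-Fatemi (ROF) model makes the processed image smoother while preserving edges and structures. The total variation (TV) of a grayscale image $I$ is defined as the sum of the gradient norm, $J(I) = \sum_{\mathbf{x}} |\nabla I|$ in the discrete setting. In essence, the ROF model makes the denoised image vary smoothly within regions but allows jumps across region boundaries. The experiment code is: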

from matplotlib import pyplot as plt
from PIL import Image
from numpy import *
from numpy import random
from scipy.ndimage import filters
def denoise(img, U_init, tolerance=0.1, tau=0.125, K=100):
    m,n = img.shape
    U = U_init
    px = img
    py = img
    error = 1

    while(error > tolerance):
        Uold = U
        Gradx = roll(U,-1,axis=1) - U
        Grady = roll(U,-1,axis=0) - U
        pxNew = px + (tau/K)*Gradx
        pyNew = py + (tau/K)*Grady
        NormNew = maximum(1, sqrt(pxNew**2 + pyNew**2))

        px = pxNew/NormNew
        py = pyNew/NormNew

        Rxpx = roll(px,1,axis=1)
        Rypy = roll(py,1,axis=0)
        DivP = (px-Rxpx) + (py-Rypy)
        U = img + K*DivP
        error = linalg.norm(U-Uold)/sqrt(m*n)
    return U,img - U

# img = zeros((500,500))
# img[100:400,100:400] = 128
# img[200:300,200:300] = 255
# img = img + 30*random.standard_normal((500,500))
img = Image.open(r'D:\test\pil.png').convert('L')
img = array(img, dtype=float)  # float avoids uint8 wrap-around in the differences


U,T = denoise(img, img)
G = filters.gaussian_filter(img, 10)

plt.subplot(131), plt.imshow(img, plt.cm.gray), plt.title('Original image'), plt.axis('off')
plt.subplot(132), plt.imshow(G, plt.cm.gray), plt.title('After Gaussian blur'), plt.axis('off')
plt.subplot(133), plt.imshow(U, plt.cm.gray), plt.title('After ROF denoising'), plt.axis('off')
plt.show()
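The total variation that the ROF model penalizes can also be computed directly, which makes it easy to see why noise raises it. The following is a minimal sketch: the helper name total_variation, the forward-difference gradient, and the synthetic step image are assumptions made for illustration.

```python
import numpy as np

def total_variation(img):
    """Sum of gradient magnitudes, using forward differences."""
    img = np.asarray(img, dtype=float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, :-1] = np.diff(img, axis=1)   # horizontal differences
    gy[:-1, :] = np.diff(img, axis=0)   # vertical differences
    return np.sqrt(gx ** 2 + gy ** 2).sum()

# Hypothetical test images: a clean step edge and a noisy copy of it.
clean = np.zeros((4, 6))
clean[:, 3:] = 10.0
rng = np.random.default_rng(0)
noisy = clean + rng.standard_normal(clean.shape)

print(total_variation(clean))
print(total_variation(noisy) > total_variation(clean))
```

The clean image only pays for its single edge, while the noisy image pays for small jumps at every pixel. Minimizing total variation therefore removes the noise while leaving the sharp edge in place.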

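Most of the code used in these experiments comes from the book, with some modifications.

Summary

This chapter introduces the basic libraries needed for computer vision, including what they are for and how to use them, with detailed examples. Working through these examples hands-on gives a better understanding of the libraries and makes it easier to use them and their functions flexibly in real code.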
