[Computer Vision] CS131 Study Notes #0

鹏程不会飞

Last modified 2023-03-30 21:27:47

First published 2021-08-14 20:39:28

Article link: https://blog.csdn.net/qq_56199570/article/details/119705626

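1. Getting Started with NumPy

Image recognition and processing is, at its core, matrix computation, and Python's NumPy library is built for exactly this kind of computation, so learning NumPy is a necessary first step before studying image processing.

NumPy is conventionally imported with import numpy as np.

1.1 General ways to create a matrix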

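Unlike a Python list, an array is not created empty and then grown: its size is fixed when it is created
General way to create an array: y = np.array([[1,2,3,4,5], [6,7,8,9,10]])
Read its size: y.shape
Create a zero matrix: np.zeros((3,3)) # creates a 3x3 matrix of zeros
Create an identity matrix: identity = np.identity(3)
Create a matrix of all ones: ones = np.ones((2,2))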
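The creation functions listed above can be tried together in one short script. This is only a minimal sketch; the variable names are chosen here for illustration.

```python
import numpy as np

# Build one array from nested lists and three arrays from shape arguments
y = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
zeros = np.zeros((3, 3))      # note the plural: np.zeros, not np.zero
identity = np.identity(3)
ones = np.ones((2, 2))

print(y.shape)       # shape is a tuple of (rows, columns)
print(zeros.dtype)   # zeros/ones/identity default to floating point
print(identity)
```

Note that the shape is passed as a single tuple, which is why the calls have doubled parentheses.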

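1.2 Broadcasting and using np.mean

import numpy as np
# Suppose we want to shift every row of a matrix so that its mean is 0:
matrix = 10*np.random.rand(4,5)
row_means = matrix.mean(axis = 1).reshape((4,1))
matrix = matrix - row_means  # broadcasting: (4,5) - (4,1)
print(matrix)
# axis not set: average all m*n entries and return a single number
# axis = 0: collapse the rows, i.e. take the mean of each column
# axis = 1: collapse the columns, i.e. take the mean of each row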
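As a complement, here is a small sketch that centers both rows and columns and checks the result. It assumes the keepdims argument of mean as an alternative to the explicit reshape, and it uses a seeded generator (the seed value is arbitrary) so the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
matrix = 10 * rng.random((4, 5))

# keepdims=True keeps the reduced axis with length 1, so no reshape is needed
row_means = matrix.mean(axis=1, keepdims=True)   # shape (4, 1)
col_means = matrix.mean(axis=0, keepdims=True)   # shape (1, 5)

row_centered = matrix - row_means   # (4, 5) - (4, 1): stretched across columns
col_centered = matrix - col_means   # (4, 5) - (1, 5): stretched across rows

print(row_means.shape, col_means.shape)
print(np.allclose(row_centered.mean(axis=1), 0))
print(np.allclose(col_centered.mean(axis=0), 0))
```

Broadcasting works here because each pair of dimensions is either equal or one of them is 1.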

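1.3 Using numpy.random

Using numpy.random.randint

# Three parameters: low, high, size. high defaults to None; if only low is given, the range is [0, low). If high is given, the range is [low, high).
# Returns random integers from the half-open interval [low, high).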
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0])

>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1],
       [3, 2, 2, 0]])

Using numpy.random.rand

# Returns one sample or an array of samples drawn from the uniform distribution on [0, 1); the value 1 is never returned.
>>> np.random.rand(3,2)
array([[ 0.14022471,  0.96360618],  
       [ 0.37601032,  0.25528411],  
       [ 0.49313049,  0.94909878]])

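Using numpy.random.randn

# randn returns one sample or an array of samples drawn from the standard normal distribution.
>>> np.random.randn(2,4)
array([[ 0.27795239, -2.57882503,  0.3817649 ,  1.42367345],
       [-1.16724625, -0.22408299,  0.63006614, -0.41714538]])
# The standard normal distribution (also called the u-distribution or z-distribution)
# is the normal distribution with mean 0 and standard deviation 1, written N(0, 1).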
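The ranges and distributions described above can be checked empirically. The sketch below is an illustration only; the seed and the sample sizes are arbitrary choices, not values from the course material.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so repeated runs draw the same numbers

ints = np.random.randint(2, 7, size=1000)   # integers in [2, 7), so 2..6
unif = np.random.rand(1000)                 # floats in [0, 1)
norm = np.random.randn(100000)              # standard normal samples

print(ints.min(), ints.max())               # stays inside the half-open range
print(unif.min() >= 0, unif.max() < 1)
print(norm.mean(), norm.std())              # close to 0 and 1 respectively
```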

1.4 Using boolean masks

Basic comparisons

import numpy as np
array = np.array(range(20)).reshape((4,5))  # 4x5 matrix holding 0 through 19
print(array)

output = array > 10
output
#out:
array([[False, False, False, False, False],
       [False, False, False, False, False],
       [False,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True]])

array[output]
#out:
array([11, 12, 13, 14, 15, 16, 17, 18, 19])

# several conditions can be combined
mask = (array < 5) | (array > 15)
# the parentheses are required: | binds tighter than < and >, so
# array < 5 | array > 15 is parsed as array < (5 | array) > 15 and raises a ValueError
mask
#out:
array([[ True,  True,  True,  True,  True],
       [False, False, False, False, False],
       [False, False, False, False, False],
       [False,  True,  True,  True,  True]])

Practical use

#Given a matrix, change all of the negative values to zero
matrix = 2*np.random.rand(5, 5) - 1  # random matrix, uniform on [-1, 1)
### SOLUTION ###
mask = matrix < 0
print(mask)
matrix[mask] = 0  # set every element selected by the mask to 0
print(matrix)
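For comparison, the same "negative values to zero" task can be written in a few equivalent ways. This sketch uses a small hand-written matrix (an assumption made here so the result is easy to read) and checks that mask assignment, np.maximum and np.where agree.

```python
import numpy as np

matrix = np.array([[-0.5, 0.2], [0.7, -0.1]])

masked = matrix.copy()
masked[masked < 0] = 0                       # boolean-mask assignment, in place

clipped = np.maximum(matrix, 0)              # element-wise maximum with 0
selected = np.where(matrix < 0, 0, matrix)   # pick 0 where the mask is True

print(masked)
print(np.array_equal(masked, clipped), np.array_equal(masked, selected))
```

Mask assignment modifies the array in place, while np.maximum and np.where return new arrays and leave the input untouched.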

1.5 Using reshape

#when you reshape, by default you fill the new array by rows
x = np.linspace(1, 12, 6)
print(x)
#[ 1.   3.2  5.4  7.6  9.8 12. ]

x = x.reshape((3,2)) #does not reshape in place!
print(x)
#[[ 1.   3.2]
# [ 5.4  7.6]
# [ 9.8 12. ]]

print(x.reshape(-1))  # -1 means "infer this dimension": NumPy computes it from the total size
#[ 1.   3.2  5.4  7.6  9.8 12. ]

print(x.reshape(2,-1))
#[[ 1.   3.2  5.4]
# [ 7.6  9.8 12. ]]
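To make the "fill by rows" default concrete, the sketch below contrasts it with column-wise filling through the order argument of reshape. The array contents are chosen here only to make the fill order visible.

```python
import numpy as np

x = np.arange(6)                         # [0 1 2 3 4 5]
by_rows = x.reshape((2, 3))              # default order='C': fill row by row
by_cols = x.reshape((2, 3), order='F')   # order='F': fill column by column

print(by_rows)
print(by_cols)
print(x.reshape(3, -1).shape)            # -1 is inferred as 2, since 6 / 3 = 2
```

With the default order the first row is 0, 1, 2; with order='F' the first column is 0, 1.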

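1.6 Deep copies in NumPy

Assigning a NumPy array to another name with = does not copy any data: both names refer to the same object at the same address. For example: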

array = np.linspace(1, 10, 10)
array
#out
#array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

dup = array
dup
#out
#array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

array[0] = 100
dup
#out
#array([100.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.,  10.])

print(id(array))
print(id(dup))
#out
#120645422176
#120645422176

As shown, after assignment with =, array and dup refer to the same address, so modifying one also changes the other. To avoid this, make a deep copy of the array.

#using copy
import copy
array = np.linspace(1, 10, 10)
dup = copy.deepcopy(array)
#this can also be written as dup = np.copy(array) or dup = array.copy()
print(id(array))
print(id(dup))
array[0] = 100
dup

120649253152
120664256640
array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

Incorrect approach: using the slicing syntax [:]

#slicing
array = np.linspace(1, 10, 10)
dup = array[:]
print(id(array))
print(id(dup))
array[0] = 100
dup

2552119240816
2552119240336
[100.   2.   3.   4.   5.   6.   7.   8.   9.  10.]

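Although the two ids differ, dup and array still change together: slicing returns a view, which is a new array object that shares the same underlying data.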
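Because id() alone does not reveal whether two arrays share data, a more direct check is np.shares_memory. The following sketch compares the three cases discussed in this section; the variable names are illustrative.

```python
import numpy as np

array = np.linspace(1, 10, 10)
alias = array          # plain assignment: same object
view = array[:]        # slicing: new object, same data buffer
copied = array.copy()  # copy: new object, new data buffer

print(alias is array)
print(np.shares_memory(array, view))
print(np.shares_memory(array, copied))

array[0] = 100
print(view[0], copied[0])   # the view follows the change, the copy does not
```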

2. Getting Started with Pyplot

2.1 pyplot

import matplotlib.pyplot as plt

x = np.arange(10)**2
print(x)
plt.plot(x)
plt.show()

The resulting plot:

[Figure: line plot of x, the squares of 0 through 9]

Of course, more details can be added:

plt.figure(figsize = (15,15))
plt.plot(x)
plt.title("This is a graph")
plt.xlabel("this is the x label")
plt.ylabel("this is the y label")
plt.show()

[Figure: the same plot with a title and axis labels]

2.2 Scatter plots

x = np.concatenate((np.linspace(1, 5, 10).reshape(10, 1), np.ones(10).reshape(10, 1)), axis = 1)
print(x)
y = x[:,0].copy() + 2*np.random.rand(10) - 0.5
print(y)
plt.scatter(x[:,0], y)  # scatter plot
plt.show()

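3. Reading Images

3.1 Basic composition of an image

A color image is made up of three color layers, R, G and B, so an image can be represented by an array of shape (h, w, 3). Here h and w are the height and width of the image, and 3 is the number of basic color channels. Each number stored in a channel's matrix is the intensity (gray level) of that color of light at that pixel, and the pixels formed by the three intensities together make up the full-color image.

A gray level here is not a literal "black and white" value; it is the brightness of one particular color. For example, one layer of an image, of shape (400, 300, 1), is the red channel matrix, and it stores the red intensity values.

Each color channel stores its own intensity values, and the three channels are blended, much like mixing paint, to produce the desired color from the different intensities of the three primaries.

Take any point in the image: to display it, the red intensity of that point goes into the R channel, the green intensity into the G channel, and the blue intensity into the B channel, and the three values mix to give the corresponding color.

In short, a channel holds one color component (there are also special channels, such as the alpha channel, which stores transparency), and the intensity value gives the brightness of that color.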
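The (h, w, 3) layout can be illustrated without loading a file. The sketch below builds a tiny synthetic image by hand; the sizes and colors are made up for illustration, and intensities are assumed to be floats in [0, 1].

```python
import numpy as np

h, w = 2, 3
img = np.zeros((h, w, 3))        # black image: every channel is 0

img[:, :, 0] = 1.0               # set the R channel everywhere: pure red
img[0, 0] = [1.0, 1.0, 0.0]      # one pixel with R and G at full intensity: yellow

red = img[:, :, 0]               # a single channel is a 2-D (h, w) matrix
print(img.shape, red.shape)
print(img[0, 0])                 # the three intensities of one pixel
```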

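3.2 Reading an image in code

import numpy as np
import matplotlib.pyplot as plt
from skimage import io

def display(img):
    plt.figure(figsize = (5,5))
    plt.imshow(img)  # show the image
    plt.axis('off')  # hide the axes
    plt.show()

def load(image_path):
    out = io.imread(image_path)
    # read the image; the second parameter (as_gray) defaults to False, and True loads it as grayscale
    out = out.astype(np.float64) / 255  # scale pixel values to [0, 1]
    return out

img = load('image1.jpg')
display(img)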

def rgb_exclusion(image, channel):
    out = image.copy()
    if channel == 'R':
        out[:, :, 0] = 0
    elif channel == 'G':
        out[:, :, 1] = 0
    elif channel == 'B':
        out[:, :, 2] = 0
    return out  # copy of the image with one RGB channel zeroed out
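The same idea can be tried on a synthetic array, so no image file is needed. The sketch below is a variant written for illustration, not the course's reference solution: the helper name exclude_channel and the CHANNEL_INDEX lookup table are hypothetical.

```python
import numpy as np

# Hypothetical lookup table mapping a channel letter to its index in the last axis
CHANNEL_INDEX = {'R': 0, 'G': 1, 'B': 2}

def exclude_channel(image, channel):
    """Return a copy of image with the given channel set to 0."""
    out = image.copy()                       # leave the input untouched
    out[:, :, CHANNEL_INDEX[channel]] = 0    # zero one channel for every pixel
    return out

image = np.ones((4, 4, 3))                   # white test image
no_green = exclude_channel(image, 'G')
print(no_green[0, 0])                        # green intensity is now 0
print(image[0, 0])                           # original is unchanged
```

One difference in behavior: an unknown channel name raises a KeyError here, whereas the if/elif version silently returns an unmodified copy.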

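Note: scikit-image is an image processing package built on scipy. It handles images as numpy arrays and is a very good tool for digital image processing. It deserves further study later; the table below is for reference.

Submodule	Main functionality
io	Reading, saving and displaying images or video
data	Test images and sample data
color	Color space conversion
filters	Image enhancement, edge detection, rank filters, automatic thresholding, etc.
draw	Basic shape drawing on numpy arrays, including lines, rectangles, circles and other shapes
transform	Geometric and other transforms, such as rotation, stretching and the Radon transform
morphology	Morphological operations, such as opening, closing and skeletonization
exposure	Intensity adjustment, such as brightness adjustment and histogram equalization
feature	Feature detection and extraction
measure	Measurement of image properties, such as similarity or contours
segmentation	Image segmentation
restoration	Image restoration
util	General utility functions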

References

https://zhuanlan.zhihu.com/p/360220467

https://www.jianshu.com/p/be7af337ffcd

4. Linear Algebra

4.1 Solving linear equations:

For example, say we wanted to solve the linear system
$A x = b$

A = np.array([[1, 1], [2, 1]])
b = np.array([[1], [0]])
#This function takes parameters A, b, and returns x such that Ax =b. 
x = np.linalg.solve(A, b)
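A quick way to confirm the result is to substitute x back into the system. This sketch repeats the same A and b and checks A x = b numerically.

```python
import numpy as np

A = np.array([[1, 1], [2, 1]])
b = np.array([[1], [0]])

x = np.linalg.solve(A, b)      # solves x1 + x2 = 1 and 2*x1 + x2 = 0

print(x)
print(np.allclose(A @ x, b))   # substitute back to verify the solution
```

For this system the solution is x1 = -1, x2 = 2.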

4.2 Finding the line of best fit:

Linear regression finds the “line of best fit” by minimizing the residual sum of squares.

If we have n datapoints $\{(x_1, y_1), ... ,(x_n, y_n)\}$, the objective function takes the form $\sum_{i = 1}^n (y_i - f(x_i))^2$, where $f(x_i) = \theta_0 + \theta_1 x_{i,1} + ... + \theta_d x_{i,d}$ for a datapoint $x_i$ with d features $x_{i,1}, ..., x_{i,d}$.

It turns out the parameters such that the loss function is minimized are given by the closed form solution $\theta = (X^T X)^{-1} X^T y$

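To understand this formula, let us review the method of least squares from linear algebra:

For the error $E(x)=||b-Ax||^2$, find the x that minimizes E, where A is a matrix with full column rank and p is the projection of b onto the column space of A.

Since $b-p$ is orthogonal to the column space of A and $Ax-p$ lies in it, the Pythagorean theorem gives:

$||Ax-p||^2+||b-p||^2=||b-Ax||^2$

So for any x:

$||b-Ax||^2 \geq ||b-p||^2$

Therefore E attains its minimum if and only if x satisfies $A x = p$. Because A has full column rank, this equation has a unique solution: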

$\hat{x} = (A^TA)^{-1}A^Tb$
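One way to see where this closed form comes from, sketched under the same assumptions as above (A has full column rank, so $A^TA$ is invertible): the residual $b - A\hat{x} = b - p$ is orthogonal to the column space of A, hence to every column of A, which can be written as a single matrix equation and solved for $\hat{x}$.

```latex
\begin{aligned}
A^T (b - A\hat{x}) &= 0 \\
A^T A \hat{x} &= A^T b \\
\hat{x} &= (A^T A)^{-1} A^T b
\end{aligned}
```

The middle line is usually called the normal equations.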

Next, let us try this out in Python.

First, generate some points:

x = np.concatenate((np.linspace(1, 5, 10).reshape(10, 1), np.ones(10).reshape(10, 1)), axis = 1)  # axis=1 joins the two columns side by side
print(x)
y = x[:,0].copy() + 2*np.random.rand(10) - 0.5
print(y)
plt.scatter(x[:,0], y)
plt.show()

[Figure: scatter plot of the generated points]

Solve for the coefficients $\theta$:

theta = np.linalg.lstsq(x, y, rcond=None)[0]
# least-squares solution using the built-in function
print(theta)

[0.72037691 1.55604653]

Alternatively:

theta = np.linalg.inv(x.T.dot(x)).dot(x.T).dot(y)
# solve least squares with the closed-form formula
print(theta)

This gives the same result: [0.72037691 1.55604653]
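The agreement between the two approaches can also be checked programmatically. The sketch below regenerates similar data with a seeded generator (the seed is an arbitrary assumption, so the numbers differ from those printed above) and solves the normal equations with np.linalg.solve instead of forming the inverse explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, only for repeatability

# Design matrix: first column holds the x values, second column is all ones
X = np.column_stack((np.linspace(1, 5, 10), np.ones(10)))
y = X[:, 0] + 2 * rng.random(10) - 0.5

theta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
theta_normal = np.linalg.solve(X.T @ X, X.T @ y)   # solves (X^T X) theta = X^T y

print(theta_lstsq, theta_normal)
print(np.allclose(theta_lstsq, theta_normal))
```

With this column order, theta[0] is the slope and theta[1] is the intercept.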

Finally, plot the line:

plt.scatter(x[:,0], y)
plt.plot(x[:,0], x[:,0]*theta[0] + theta[1])

[Figure: scatter plot of the points with the fitted line]
