SRDP Study Notes

白_12138

Article link: https://blog.csdn.net/weixin_49343078/article/details/110957536

SRDP Study Notes: Week 2

This week: a first look at PyTorch and at image processing in Python.

I. What is PyTorch?

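PyTorch is a Python library that provides two main high-level features:

GPU-accelerated tensor computation;
deep neural networks built on a reverse-mode automatic differentiation system.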
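
The two features can be illustrated with a minimal sketch. The function y = x^2 + 2x and the variable names are only illustrative choices, and the device line falls back to the CPU when no GPU is available.

```python
import torch

# Feature 1: tensors can live on a GPU when one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
ones = torch.ones(2, 3, device=device)

# Feature 2: reverse-mode automatic differentiation
# requires_grad=True asks PyTorch to record operations on x
x = torch.tensor(3.0, requires_grad=True)

# y = x^2 + 2x, so dy/dx = 2x + 2
y = x ** 2 + 2 * x

# backward() fills x.grad with dy/dx evaluated at x = 3
y.backward()
print(x.grad)  # tensor(8.)
```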

1. Defining data

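Data is usually defined as a torch.Tensor. A tensor is the general term for numeric data of any dimensionality: a scalar, a vector, a matrix, or a higher-dimensional array.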

import torch

# It can be a single number (a scalar)
x = torch.tensor(12138)
print(x)

tensor(12138)

# It can be a one-dimensional array (a vector)
x = torch.tensor([1,2,1,3,8])
print(x)

tensor([1, 2, 1, 3, 8])

# It can be a two-dimensional array (a matrix)
x = torch.ones(2,3)
print(x)

tensor([[1., 1., 1.],
[1., 1., 1.]])

# It can be an array of any number of dimensions (a tensor)
x = torch.ones(2,3,4)
print(x)

tensor([[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]],
[[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]]])

# Create an uninitialized tensor (its values are whatever happened to be in memory)
x = torch.empty(5,3)
print(x)

tensor([[2.4035e-35, 0.0000e+00, 3.3631e-44],
[0.0000e+00, nan, 2.3694e-38],
[1.1578e+27, 1.1362e+30, 7.1547e+22],
[4.5828e+30, 1.2121e+04, 7.1846e+22],
[9.2198e-39, 7.0374e+22, 0.0000e+00]])

# Create a tensor of random values drawn uniformly from [0, 1)
x = torch.rand(5,3)
print(x)

tensor([[0.6881, 0.8338, 0.2896],
[0.3478, 0.9676, 0.1412],
[0.1889, 0.8596, 0.5601],
[0.1458, 0.2747, 0.0177],
[0.4060, 0.1154, 0.1707]])

# Create an all-zero tensor whose dtype is long
x = torch.zeros(5,3,dtype=torch.long)
print(x)

tensor([[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0]])


# Create a new tensor based on an existing one,
# reusing properties of the original such as its dtype and device
y = x.new_ones(5,3)   # new_* methods keep the dtype and device of the original tensor
print(y)

tensor([[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])

z = torch.randn_like(x, dtype=torch.float)    # keeps the size of the original tensor but overrides the dtype
print(z)

tensor([[ 0.0188, 0.8351, -0.3130],
[ 2.5916, 1.6422, -0.1717],
[-0.6523, -1.0881, -0.9645],
[-0.9857, -0.3897, 1.2709],
[ 0.9610, 3.4165, -0.3595]])
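
To check which properties are actually inherited, the tensors can be inspected directly. This is a small sketch that reuses the same x, y, and z as above; nothing else is assumed.

```python
import torch

x = torch.zeros(5, 3, dtype=torch.long)
y = x.new_ones(5, 3)                         # inherits dtype and device from x
z = torch.randn_like(x, dtype=torch.float)   # inherits shape from x, overrides dtype

print(x.dtype, y.dtype, z.dtype)  # torch.int64 torch.int64 torch.float32
print(z.shape, z.device == x.device)
```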

2. Defining operations

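Every operation performed on a Tensor is a Function. In the end the computation itself runs on Tensors, and it falls into three groups:

Basic operations: addition, subtraction, multiplication, division, powers, and remainders;
Boolean operations: greater than, less than, maximum, and minimum;
Linear algebra operations: matrix multiplication, norms, and determinants.

Basic operations include abs/sqrt/div/exp/fmod/pow, trigonometric and hyperbolic functions such as cos/sin/asin/atan2/cosh, and rounding functions such as ceil/round/floor/trunc.

Boolean operations include gt/lt/ge/le/eq/ne, topk, sort, and max/min.

Linear algebra operations include trace, diag, mm/bmm, t, dot/cross, inverse, and svd.

There is no need to go through them all here; look each one up when you need it. The examples below work through some concrete code.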
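
As a quick orientation, here is a sketch that calls one or two functions from each of the three groups on a small 2x2 tensor. The tensor a and the variable names are illustrative only.

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# Basic element-wise operations
squared = torch.pow(a, 2)       # [[1, 4], [9, 16]]
remainder = torch.fmod(a, 2)    # [[1, 0], [1, 0]]

# Boolean and selection operations
mask = torch.gt(a, 2)                                    # [[False, False], [True, True]]
top_values, top_indices = torch.topk(a.flatten(), k=2)   # values [4, 3]

# Linear algebra operations
gram = torch.mm(a, a.t())       # [[5, 11], [11, 25]]
trace = torch.trace(a)          # 1 + 4 = 5
inverse = torch.inverse(a)      # a @ inverse is (approximately) the identity

print(squared, mask, gram, trace, sep="\n")
```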

# Create a 2x4 tensor
m = torch.Tensor([[2, 5, 3, 7],[4, 2, 1, 9]])

print(m.size(0), m.size(1), m.size(), sep=' -- ')

2 -- 4 -- torch.Size([2, 4])

# Return the number of elements in m
print(m.numel())

# Return the element at row 0, column 2
print(m[0][2])

tensor(3.)

# Return all elements of column 1
print(m[:, 1])

tensor([5., 2.])

# Return all elements of row 0
print(m[0, :])

tensor([2., 5., 3., 7.])

# Create a tensor of numbers starting at 1 and stopping before 5
# Note that the result is 1 to 4; the end value 5 is excluded
v = torch.arange(1.0, 5)
print(v)

tensor([1., 2., 3., 4.])

# Matrix-vector product: the dot product of each row of m with v
m @ v

tensor([49., 47.])

# Calculated by 1*2 + 2*5 + 3*3 + 4*7
m[[0], :] @ v

tensor([49.])
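
One detail worth spelling out: the product above works because m and v are both floating-point tensors. The sketch below contrasts torch.Tensor (default float type) with torch.tensor (dtype inferred from the data) and converts an integer vector before multiplying. As far as I know, matrix multiplication does not mix dtypes automatically, so the explicit conversion is the safe route; v_int and result are illustrative names.

```python
import torch

m = torch.Tensor([[2, 5, 3, 7], [4, 2, 1, 9]])   # torch.Tensor: default float type
t = torch.tensor([[2, 5, 3, 7], [4, 2, 1, 9]])   # torch.tensor: dtype inferred from the data
print(m.dtype, t.dtype)   # torch.float32 torch.int64

v_int = torch.arange(1, 5)        # integer arguments give an int64 tensor
result = m @ v_int.to(m.dtype)    # convert to m's dtype before multiplying
print(result)   # tensor([49., 47.])
```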

# Add a random tensor of size 2x4 to m
m + torch.rand(2, 4)

tensor([[2.9538, 5.6675, 3.3406, 7.7079],
[4.3208, 2.2670, 1.3296, 9.1788]])

# Transpose: 2x4 becomes 4x2
print(m.t())

# transpose gives the same result here; its two arguments are the dimensions to swap
print(m.transpose(0, 1))

tensor([[2., 4.],
[5., 2.],
[3., 1.],
[7., 9.]])
tensor([[2., 4.],
[5., 2.],
[3., 1.],
[7., 9.]])

# Return a 1D tensor of steps=20 equally spaced points from start=3 to end=8, both included
torch.linspace(3, 8, 20)

tensor([3.0000, 3.2632, 3.5263, 3.7895, 4.0526, 4.3158, 4.5789, 4.8421, 5.1053,
5.3684, 5.6316, 5.8947, 6.1579, 6.4211, 6.6842, 6.9474, 7.2105, 7.4737,
7.7368, 8.0000])
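
The spacing of 0.2632 in the output follows from both endpoints being included: 20 points leave 19 gaps, so each gap is (8 - 3) / 19. A short check, with illustrative variable names:

```python
import torch

start, end, steps = 3, 8, 20
points = torch.linspace(start, end, steps)

# Both endpoints are included, so there are steps - 1 gaps between points
spacing = (end - start) / (steps - 1)
print(spacing)                  # about 0.2632
print(points[1] - points[0])    # the same gap, as a tensor
```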

from matplotlib import pyplot as plt

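# matplotlib expects NumPy-style data, so the tensor is converted with .numpy() before plotting
# Note that randn draws random numbers from a distribution with mean 0 and variance 1
# The line below generates 1000 random numbers and plots their histogram with 100 bins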
plt.hist(torch.randn(1000).numpy(), 100);


# With a very large number of samples, the shape of the normal distribution becomes very clear
plt.hist(torch.randn(10**6).numpy(), 100);
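
The same effect can be seen numerically without a plot: the sample mean and standard deviation move closer to 0 and 1 as the sample size grows. This is a minimal sketch; the fixed seed is only there to make the run repeatable.

```python
import torch

torch.manual_seed(0)   # fixed seed so the run is repeatable

for n in (1000, 10**6):
    samples = torch.randn(n)
    # Print the sample size, sample mean, and sample standard deviation
    print(n, samples.mean().item(), samples.std().item())
```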


# Create two 1x4 tensors
a = torch.Tensor([[1, 2, 3, 4]])
b = torch.Tensor([[5, 6, 7, 8]])

# Concatenate along dim 0 (the Y direction, stacking rows), which gives a 2x4 matrix
print( torch.cat((a,b), 0))

tensor([[1., 2., 3., 4.],
[5., 6., 7., 8.]])


# Concatenate along dim 1 (the X direction, joining columns), which gives a 1x8 matrix
print( torch.cat((a,b), 1))

tensor([[1., 2., 3., 4., 5., 6., 7., 8.]])
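
For comparison, torch.cat grows an existing dimension, while torch.stack adds a new one. The sketch below prints the resulting shapes for the same a and b; torch.stack is not used in the notes above and is shown only for contrast.

```python
import torch

a = torch.Tensor([[1, 2, 3, 4]])
b = torch.Tensor([[5, 6, 7, 8]])

rows = torch.cat((a, b), dim=0)        # grows the existing dim 0
cols = torch.cat((a, b), dim=1)        # grows the existing dim 1
stacked = torch.stack((a, b), dim=0)   # inserts a new leading dim

print(rows.shape, cols.shape, stacked.shape)
# torch.Size([2, 4]) torch.Size([1, 8]) torch.Size([2, 1, 4])
```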

II. Image processing in Python

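We will write some very simple code to learn image processing in Python. The main topics are understanding image types and performing some basic image segmentation.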

1. Downloading and displaying an image

!wget https://scpic.chinaz.net/files/pic/pic9/201905/bpic11895.jpg

--2020-12-10 11:58:09-- https://scpic.chinaz.net/files/pic/pic9/201905/bpic11895.jpg
Resolving scpic.chinaz.net (scpic.chinaz.net)... 112.90.135.151
Connecting to scpic.chinaz.net (scpic.chinaz.net)|112.90.135.151|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 116545 (114K) [image/jpeg]
Saving to: ‘bpic11895.jpg.2’

bpic11895.jpg.2 100%[===================>] 113.81K 73.2KB/s in 1.6s

2020-12-10 11:58:12 (73.2 KB/s) - ‘bpic11895.jpg.2’ saved [116545/116545]

import matplotlib
import numpy as np
import matplotlib.pyplot as plt

import skimage
from skimage import data
from skimage import io

colony = io.imread('bpic11895.jpg')
print(type(colony))
print(colony.shape)

<class 'numpy.ndarray'>
(433, 650, 3)

# Plot all channels of a real image
plt.subplot(121)
plt.imshow(colony[:,:,:])
plt.title('3-channel image')
plt.axis('off')

# Plot one channel only
plt.subplot(122)
plt.imshow(colony[:,:,0])
plt.title('1-channel image')
plt.axis('off');
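
The indexing used above can be tried on a tiny synthetic array. This sketch assumes a hypothetical 2x3 RGB image rather than the downloaded file, and shows that selecting one channel drops the last axis.

```python
import numpy as np

# Hypothetical 2x3 RGB image, uint8, with shape (rows, columns, channels)
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[:, :, 0] = 200   # red channel
image[:, :, 1] = 100   # green channel
image[:, :, 2] = 50    # blue channel

red = image[:, :, 0]   # one channel is a plain 2D array
print(image.shape, red.shape)   # (2, 3, 3) (2, 3)
```

Because a single channel is a 2D array, plt.imshow maps it through its default colormap unless a colormap such as 'gray' is passed, which is why a 1-channel plot can look falsely colored.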


2. Reading and modifying pixel values

# Get the pixel value at row 10, column 20
camera = data.camera()
print(camera[10, 20])

# Set a region to black
camera[30:100, 10:100] = 0
plt.imshow(camera, 'gray')

153
<matplotlib.image.AxesImage at 0x7f0d4302b1d0>

# Set the first ten lines to black
camera = data.camera()
camera[:10] = 0
plt.imshow(camera, 'gray')

<matplotlib.image.AxesImage at 0x7f0d43491240>

# Set to "white" (255) pixels where mask is True
camera = data.camera()
mask = camera < 80
camera[mask] = 255
plt.imshow(camera, 'gray')

<matplotlib.image.AxesImage at 0x7f0d43464be0>
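
Boolean-mask assignment is easier to follow on an array small enough to read. The sketch below uses a hypothetical 3x3 grayscale array in place of data.camera() and applies the same threshold of 80.

```python
import numpy as np

# Hypothetical 3x3 grayscale image standing in for data.camera()
image = np.array([[10, 90, 200],
                  [70, 80, 30],
                  [255, 0, 120]], dtype=np.uint8)

mask = image < 80        # boolean array with the same shape as image
result = image.copy()    # copy so the original stays untouched
result[mask] = 255       # assign only where mask is True

print(mask.sum())   # 4 pixels are below 80
print(result)
```

The slice and mask assignments in the examples above modify the array in place, which is presumably why data.camera() is loaded again before each one.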

# Change the color for real images
cat = data.chelsea()
plt.imshow(cat)

<matplotlib.image.AxesImage at 0x7f0d43448518>

# Set pixels whose red channel is above 160 to pure red
red_cat = cat.copy()
reddish = cat[:, :, 0] > 160
red_cat[reddish] = [255, 0, 0]
plt.imshow(red_cat)

<matplotlib.image.AxesImage at 0x7f0d44989cc0>

# Change RGB color to BGR for openCV
BGR_cat = cat[:, :, ::-1]
plt.imshow(BGR_cat)

<matplotlib.image.AxesImage at 0x7f0d434e05c0>
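
What the `::-1` slice does to each pixel can be seen on a hypothetical two-pixel image: the channel order is reversed, so pure red becomes pure blue and vice versa. Since matplotlib still interprets the array as RGB, the displayed colors are swapped.

```python
import numpy as np

# Hypothetical 1x2 RGB image: one pure red pixel, one pure blue pixel
rgb = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

bgr = rgb[:, :, ::-1]         # reverse the last (channel) axis
print(bgr[0, 0], bgr[0, 1])   # [  0   0 255] [255   0   0]

# The slice is a view with a negative stride; make a contiguous copy
# if a library needs one
bgr_contiguous = np.ascontiguousarray(bgr)
```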

3. Converting image data types

img_as_float Convert to 64-bit floating point.
img_as_ubyte Convert to 8-bit uint.
img_as_uint Convert to 16-bit uint.
img_as_int Convert to 16-bit int.

from skimage import img_as_float, img_as_ubyte
float_cat = img_as_float(cat)
uint_cat = img_as_ubyte(float_cat)
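
As I understand it, for uint8 input these conversions rescale the value range as well as change the dtype: 0..255 maps to 0.0..1.0 and back. The sketch below is a rough NumPy equivalent under that assumption, not the skimage implementation, and the pixel array is hypothetical.

```python
import numpy as np

# Hypothetical uint8 pixels covering the full 0..255 range
pixels = np.arange(256, dtype=np.uint8).reshape(16, 16)

as_float = pixels.astype(np.float64) / 255                   # values now in [0, 1]
back_to_ubyte = np.round(as_float * 255).astype(np.uint8)    # rescale and round back

print(as_float.min(), as_float.max())   # 0.0 1.0
print(np.array_equal(back_to_ubyte, pixels))
```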

img = data.coffee()
plt.hist(img.ravel(), bins=256, histtype='step', color='black');

4. Displaying an image histogram

img = data.camera()
plt.hist(img.ravel(), bins=256, histtype='step', color='black');
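
The histogram is just a count of how many pixels take each intensity value. The sketch below computes the same counts without plotting, on a hypothetical 2x3 image; np.bincount is used here as a stand-in for plt.hist with 256 bins.

```python
import numpy as np

# Hypothetical 2x3 uint8 image
image = np.array([[0, 0, 1], [255, 1, 0]], dtype=np.uint8)

# One bin per possible uint8 value: counts[v] is the number of pixels equal to v
counts = np.bincount(image.ravel(), minlength=256)
print(counts[0], counts[1], counts[255])   # 3 2 1
```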


5. Image segmentation

# Use colony image for segmentation
colony = io.imread('bpic11895.jpg')

# Plot histogram
img = skimage.color.rgb2gray(colony)
plt.hist(img.ravel(), bins=256, histtype='step', color='black');


# Use thresholding
plt.imshow(img>0.5)
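
The threshold 0.5 makes sense because the grayscale image holds floats in [0, 1]. The sketch below reproduces the two steps on a hypothetical three-pixel image. The luminance weights are what I believe rgb2gray uses, so treat them as an assumption.

```python
import numpy as np

# Hypothetical 1x3 RGB image as floats in [0, 1]: white, black, pure green
rgb = np.array([[[1.0, 1.0, 1.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])

# Luminance weights for R, G, B (assumed to match rgb2gray)
weights = np.array([0.2125, 0.7154, 0.0721])
gray = rgb @ weights    # weighted sum over the channel axis, shape (1, 3)

binary = gray > 0.5     # True marks pixels brighter than the threshold
print(binary)           # [[ True False  True]]
```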

<matplotlib.image.AxesImage at 0x7f0d43264a58>

6. Edge detection with the Canny operator

from skimage.feature import canny
from scipy import ndimage as ndi
img_edges = canny(img)
img_filled = ndi.binary_fill_holes(img_edges)

# Plot
plt.figure(figsize=(18, 12))
plt.subplot(121)
plt.imshow(img_edges, 'gray')
plt.subplot(122)
plt.imshow(img_filled, 'gray')
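
The fill step only works where the detected edges form a closed contour. The sketch below isolates that step with a hand-drawn square outline standing in for Canny output (an assumption made so the result is easy to verify), and shows that a single gap in the outline prevents filling.

```python
import numpy as np
from scipy import ndimage as ndi

# Hand-drawn closed contour standing in for Canny output (illustrative only)
edges = np.zeros((7, 7), dtype=bool)
edges[1, 1:6] = edges[5, 1:6] = True   # top and bottom sides
edges[1:6, 1] = edges[1:6, 5] = True   # left and right sides

filled = ndi.binary_fill_holes(edges)
print(edges.sum(), filled.sum())       # 16 25

# Open the contour by removing one edge pixel: the interior is no longer a hole
edges_open = edges.copy()
edges_open[1, 3] = False
filled_open = ndi.binary_fill_holes(edges_open)
print(edges_open.sum(), filled_open.sum())   # 15 15
```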
