PyTorch Notes (Part 3)

LV little white

Article link: https://blog.csdn.net/qq_35446336/article/details/105010822

Data processing

scikit-image: image I/O and transforms
pandas: easier handling of CSV files

Suppressing warnings

import warnings
warnings.filterwarnings('ignore')
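
A global filterwarnings('ignore') hides every warning for the rest of the program. When only one noisy call needs silencing, a scoped filter may be preferable. The sketch below uses only the standard library; noisy() is a hypothetical helper standing in for library code that emits a warning.

```python
import warnings


def noisy():
    # Hypothetical helper: stands in for library code that emits a warning.
    warnings.warn("this API is deprecated", DeprecationWarning)
    return 42


# Suppress warnings only inside this block; the previous filters
# are restored when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    value = noisy()

# Warnings can also be recorded instead of printed, for inspection.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy()

print(value, len(caught), caught[0].category.__name__)
```
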

plt

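When working with matplotlib you often need to draw many figures, but by default a script cannot show several of them at once.
After plt.show() the program blocks at that call and does not continue until the figure window is closed.
To show animated plots or several windows, call plt.ion() to switch matplotlib into interactive mode. In this mode the script keeps running even when it reaches plt.show().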

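import matplotlib.pyplot as plt
plt.ion()    # turn on interactive mode
# i1 and i2 are image arrays loaded elsewhere
# open two windows at the same time
plt.figure()    # figure 1
plt.imshow(i1)
plt.figure()    # figure 2
plt.imshow(i2)
# turn interactive mode off before the final show
plt.ioff()
plt.show()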

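In interactive mode:
1. plt.plot(x) or plt.imshow(x) draws the figure immediately, without plt.show().
2. If a script turns interactive mode on with ion() and never turns it off with ioff(), the figure flashes by and disappears when the script ends. To keep it on screen, call ioff() before plt.show().

In blocking mode:
1. A window must be closed before the next one can open, so by default you cannot keep many windows open side by side for comparison as in Matlab.
2. plt.plot(x) or plt.imshow(x) does not display anything by itself; the figure appears only after plt.show().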
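
The mode switch itself can be checked from code. The following is a small sketch, assuming the non-GUI "Agg" backend so that it runs without a display (an assumption made for this example, not something the notes above use); on a desktop backend the same calls open real windows.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

plt.close("all")       # start from a clean state

plt.ion()                              # interactive mode on
interactive_on = plt.isinteractive()

plt.figure()                           # figure 1
plt.figure()                           # figure 2
open_figures = len(plt.get_fignums())  # both figures exist at the same time

plt.ioff()                             # back to blocking mode
interactive_off = plt.isinteractive()

plt.close("all")
print(interactive_on, open_figures, interactive_off)
```
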

pd

When indexing by row position, prefer iloc; when indexing by label, use loc. Avoid ix, which was deprecated and then removed in pandas 1.0.

image_name,part_0_x,part_0_y,part_1_x,part_1_y,part_2_x, ... ,part_67_x,part_67_y
0805personali01.jpg,27,83,27,98, ... 84,134
1084239450_e76e00b7e7.jpg,70,236,71,257, ... ,128,312

import pandas as pd

landmarks_frame = pd.read_csv('data/faces/face_landmarks.csv')

n = 65
img_name = landmarks_frame.iloc[n, 0]  # row 65, column 0
# as_matrix() was removed in pandas 1.0; to_numpy() is its replacement
landmarks = landmarks_frame.iloc[n, 1:].to_numpy()
landmarks = landmarks.astype('float').reshape(-1, 2)

print('Image name: {}'.format(img_name))
print('Landmarks shape: {}'.format(landmarks.shape))
print('First 4 Landmarks: {}'.format(landmarks[:4]))

Image name: person-7.jpg
Landmarks shape: (68, 2)
First 4 Landmarks: [[32. 65.]
 [33. 76.]
 [34. 86.]
 [34. 97.]]
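
To make the iloc/loc distinction concrete, here is a minimal sketch on a hypothetical three-row frame. The file names, coordinates and index labels are made up for illustration. The index labels are deliberately different from the row positions, so the two accessors only agree when each is given the right kind of key. The sketch uses to_numpy(), since as_matrix() no longer exists in pandas 1.0 and later.

```python
import pandas as pd

# Hypothetical toy frame: index labels (10, 20, 30) differ from positions (0, 1, 2).
df = pd.DataFrame(
    {
        "image_name": ["a.jpg", "b.jpg", "c.jpg"],
        "part_0_x": [27, 70, 32],
        "part_0_y": [83, 236, 65],
    },
    index=[10, 20, 30],
)

by_position = df.iloc[1, 0]           # second row, first column (by position)
by_label = df.loc[20, "image_name"]   # row labeled 20, column "image_name"

# Same pattern as the landmark code: take the coordinate columns of one row
# and reshape them into (x, y) pairs.
landmarks = df.iloc[1, 1:].to_numpy().astype("float").reshape(-1, 2)

print(by_position, by_label, landmarks)
```
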

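Dataset class

torch.utils.data.Dataset is an abstract class representing a dataset. A custom dataset should inherit from Dataset and override the following methods:
__len__, so that len(dataset) returns the size of the dataset.
__getitem__, to support indexing so that dataset[i] returns the i-th sample.

A transform can be written as a callable class: implement __call__ to define the operation it applies to a sample.
torch.utils.data.DataLoader iterates over a dataset; its collate_fn argument customizes how individual samples are merged into a batch.
ImageFolder is one of the more general-purpose datasets available in torchvision.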
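
The pieces above fit together roughly as follows. This is a minimal sketch, not the face-landmarks dataset itself: ToyDataset, Scale and collate_as_dict are hypothetical names invented for the example, and the data is generated in memory.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class Scale:
    """Hypothetical transform: a callable object that scales a sample's features."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, sample):
        features, label = sample
        return features * self.factor, label


class ToyDataset(Dataset):
    """Hypothetical in-memory dataset of (features, label) pairs."""

    def __init__(self, n_samples, transform=None):
        self.features = torch.arange(n_samples * 2, dtype=torch.float32).reshape(n_samples, 2)
        self.labels = torch.arange(n_samples) % 2
        self.transform = transform

    def __len__(self):
        # Makes len(dataset) work.
        return len(self.features)

    def __getitem__(self, i):
        # Makes dataset[i] work; the transform is applied per sample.
        sample = (self.features[i], self.labels[i])
        if self.transform is not None:
            sample = self.transform(sample)
        return sample


def collate_as_dict(batch):
    # batch is a list of (features, label) tuples returned by __getitem__.
    features = torch.stack([f for f, _ in batch])
    labels = torch.stack([l for _, l in batch])
    return {"features": features, "labels": labels}


dataset = ToyDataset(5, transform=Scale(10.0))
loader = DataLoader(dataset, batch_size=2, shuffle=False, collate_fn=collate_as_dict)
batches = list(loader)
print(len(dataset), len(batches), batches[0]["features"].shape)
```

Without collate_fn, the DataLoader would return each batch as a (features, labels) pair; the custom function here only changes the container to a dict.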

Examples

Implementing a network with numpy

# -*- coding: utf-8 -*-
import numpy as np

# N is the batch size; D_in is the input dimension
# H is the hidden dimension; D_out is the output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)

# Randomly initialize the weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute the prediction y_pred
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)

    # Compute and print the loss
    loss = np.square(y_pred - y).sum()
    print(t, loss)

    # Backward pass: compute the gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # Update the weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
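
A quick way to gain confidence in hand-written backward formulas like these is to compare them with a finite-difference estimate. The sketch below repeats the same forward and backward computation on a much smaller, hypothetical problem size and checks one entry of grad_w1; the sizes, the seed and the step eps are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D_in, H, D_out = 4, 5, 3, 2   # small sizes chosen for the check
x = rng.standard_normal((N, D_in))
y = rng.standard_normal((N, D_out))
w1 = rng.standard_normal((D_in, H))
w2 = rng.standard_normal((H, D_out))


def loss_fn(w1, w2):
    # Same forward pass and loss as the training loop.
    h_relu = np.maximum(x.dot(w1), 0)
    return np.square(h_relu.dot(w2) - y).sum()


# Analytic gradients, using the same formulas as the training loop.
h = x.dot(w1)
h_relu = np.maximum(h, 0)
grad_y_pred = 2.0 * (h_relu.dot(w2) - y)
grad_w2 = h_relu.T.dot(grad_y_pred)
grad_h = grad_y_pred.dot(w2.T)
grad_h[h < 0] = 0
grad_w1 = x.T.dot(grad_h)

# Central finite difference for a single entry of w1.
eps = 1e-6
w1_plus, w1_minus = w1.copy(), w1.copy()
w1_plus[0, 0] += eps
w1_minus[0, 0] -= eps
numeric = (loss_fn(w1_plus, w2) - loss_fn(w1_minus, w2)) / (2 * eps)

print(numeric, grad_w1[0, 0])
```
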

Autograd

We can define a custom autograd function by subclassing torch.autograd.Function and implementing its forward and backward passes on tensors.
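
As a minimal sketch of such a subclass, here is a hand-written ReLU. The class name MyReLU and the test tensor are illustrative; the structure (static forward and backward methods, a ctx object for saving tensors, and calling through apply) follows the torch.autograd.Function interface.

```python
import torch


class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        # Save the input so backward can see where it was negative.
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        # Pass the incoming gradient through where input >= 0, zero it elsewhere.
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input


x = torch.tensor([-1.0, 2.0, 3.0], requires_grad=True)
y = MyReLU.apply(x)      # custom Functions are called through .apply
y.sum().backward()
print(y, x.grad)
```
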

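Differences from TensorFlow

Dynamic-graph frameworks correspond to imperative programming.
Static-graph frameworks correspond to symbolic programming.

Imperative: computation happens as the code runs; most Python code is imperative.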

  import numpy as np
  a = np.ones(10)
  b = np.ones(10) * 2
  c = b * a
  d = c + 1

When the program reaches c = b * a, it performs the corresponding numerical computation right away.

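Symbolic: functions are defined in terms of placeholders for values, and the function is compiled and evaluated only once real inputs are supplied. In the pseudocode below, Variable, Constant and compile stand for a symbolic framework's API.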

    A = Variable('A')
    B = Variable('B')
    C = B * A
    D = C + Constant(1)
    # compiles the function
    f = compile(D)
    d = f(A=np.ones(10), B=np.ones(10)*2)

Imperative programming is more flexible.
Symbolic programming is more efficient.
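
The flexibility of the imperative style shows up when control flow depends on values computed at run time. In the sketch below, the number of loop iterations is decided by the data, and autograd still differentiates through whatever path was actually taken. The function name, the threshold of 10 and the cap of 5 iterations are arbitrary choices for illustration.

```python
import torch


def forward(x, w):
    h = x @ w
    doublings = 0
    # Ordinary Python control flow: how many times the loop runs
    # depends on the value of h at run time.
    while h.norm() < 10 and doublings < 5:
        h = h * 2
        doublings += 1
    return h.sum(), doublings


x = torch.ones(1, 3)
w = torch.ones(3, 1, requires_grad=True)
out, doublings = forward(x, w)
out.backward()   # gradients follow the path that was actually executed
print(out.item(), doublings, w.grad)
```

Here h starts at 3 and is doubled twice before its norm reaches 10, so the output is 4 times x @ w and each entry of w.grad is 4.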

TorchScript

TorchScript is a statically analyzable and optimizable subset of Python. PyTorch uses it to run deep learning programs without depending on the Python interpreter.
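
A minimal sketch of compiling a module to TorchScript with torch.jit.script is shown below. The Gate module is a hypothetical example chosen because it contains a data-dependent branch, which the script compiler preserves.

```python
import torch


class Gate(torch.nn.Module):
    """Hypothetical module: flips the sign of inputs whose sum is not positive."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Data-dependent branch, kept as a real branch in the compiled code.
        if x.sum() > 0:
            return x
        return -x


scripted = torch.jit.script(Gate())          # compile the module to TorchScript
result = scripted(torch.tensor([-1.0, -2.0]))
source = scripted.code                       # TorchScript source of forward()
print(result)
print(source)
```

The scripted module can then be written to disk with scripted.save(path) and loaded again with torch.jit.load(path), which is the route typically used for running a model outside of Python.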

TensorBoard

Viewing images

from torch.utils.tensorboard import SummaryWriter

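# default `log_dir` is "runs" - we'll be more specific here
writer = SummaryWriter('runs/fashion_mnist_experiment_1')

# write to tensorboard; img_grid is an image grid tensor built elsewhere,
# e.g. with torchvision.utils.make_grid(images)
writer.add_image('four_fashion_mnist_images', img_grid)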

Command line

tensorboard --logdir=runs

Then navigate to http://localhost:6006/

Viewing the network

writer.add_graph(net, images)
writer.close()

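After refreshing TensorBoard you should now see a "Graphs" tab.
Double-click "Net" to expand it and see a detailed view of the individual operations that make up the model.

Viewing data

The add_embedding method lets us visualize a low-dimensional representation of high-dimensional data.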

import torch

# helper function
def select_n_random(data, labels, n=100):
    '''
    Selects n random datapoints and their corresponding labels from a dataset
    '''
    assert len(data) == len(labels)

    perm = torch.randperm(len(data))
    return data[perm][:n], labels[perm][:n]

# select random images and their target indices
images, labels = select_n_random(trainset.data, trainset.targets)

# get the class labels for each image
class_labels = [classes[lab] for lab in labels]

# log embeddings
features = images.view(-1, 28 * 28)
writer.add_embedding(features,
                    metadata=class_labels,
                    label_img=images.unsqueeze(1))
writer.close()

Viewing the loss

writer.add_scalar('training loss',
                  running_loss / 1000,
                  epoch * len(trainloader) + i)

matplotlib figures

writer.add_figure('predictions vs. actuals',
                  plot_classes_preds(net, inputs, labels),
                  global_step=epoch * len(trainloader) + i)

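Viewing metrics

TensorBoard now shows a "PR Curves" tab containing the precision-recall curve for each class. Click around: on some classes the model's "area under the curve" is close to 100%, while on others it is lower.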
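
Each point on such a curve is a (precision, recall) pair computed for one class at one probability threshold. The sketch below computes two such points with numpy to show what is being plotted; the labels and probabilities are made-up values, and treating precision as 1.0 when nothing is predicted positive is a convention chosen for this example. In TensorBoard itself the curves are logged with SummaryWriter.add_pr_curve, which takes the binary labels and predicted probabilities and sweeps the thresholds internally.

```python
import numpy as np


def precision_recall(labels, probs, threshold):
    """Precision and recall for one class at one probability threshold."""
    predicted = probs >= threshold
    actual = labels.astype(bool)
    tp = np.sum(predicted & actual)     # predicted positive, actually positive
    fp = np.sum(predicted & ~actual)    # predicted positive, actually negative
    fn = np.sum(~predicted & actual)    # predicted negative, actually positive
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return float(precision), float(recall)


# Made-up binary labels ("is this sample of the class?") and predicted probabilities.
labels = np.array([1, 0, 1, 1, 0])
probs = np.array([0.9, 0.8, 0.6, 0.4, 0.2])

curve = [precision_recall(labels, probs, t) for t in (0.5, 0.85)]
print(curve)
```

Raising the threshold from 0.5 to 0.85 trades recall for precision, which is the trade-off the curve visualizes.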
