(5) fastai Applications

_helen_520

Last modified: 2022-10-18 15:57:19

Category: fastai study notes. Tags: deep learning, python, pytorch

First published: 2022-07-28 19:15:48

Article link: https://blog.csdn.net/haronchou/article/details/126031631

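Current status: the code for fastai lessons 8 through 11 has been refactored.

The MNIST dataset is fairly simple: every image has the same 28×28 pixel size, the backgrounds are clean, and the task is classification, so a simple network can handle it.
- Because the dataset is so simple, the effect of some basic operations cannot be observed on it, so the later experiments switch to the Imagenette dataset.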

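0. Debugging on the MNIST dataset

0.1 A single Linear() layer, 1 epoch, gradient descent

The learned weights visibly take the shape of the digit 0, which is striking: as soon as the linear layer has been trained, the weight image for class 0 looks like a 0. The layer has 784*10 = 7840 weights (plus 10 biases).
With two linear layers, 784*50 and 50*10, the two weight matrices are multiplied together. Stacking the linear layers gives 784*50 + 50*10 = 39700 weights.
One linear layer and two stacked linear layers differ very little. As long as there is no nonlinearity between them, they are essentially the same, because the product of two weight matrices is again a single linear map.
Adam, however, speeds up convergence, and the final weight image is somewhat different from the one above.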
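The parameter counts and the claim that stacked linear layers collapse into one can be checked by hand. The following is a minimal plain-Python sketch, not the notebook's code: the helper names `matmul` and `n_weights` are made up for this example, biases are left out, and tiny 3 -> 2 -> 2 matrices stand in for the real 784 -> 50 -> 10 layers.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def n_weights(*shapes):
    """Count the weights of a stack of bias-free linear layers."""
    return sum(n_in * n_out for n_in, n_out in shapes)

single = n_weights((784, 10))               # one Linear(784, 10)
stacked = n_weights((784, 50), (50, 10))    # Linear(784, 50) followed by Linear(50, 10)

# Toy sizes (3 -> 2 -> 2) standing in for 784 -> 50 -> 10.
x = [[1.0, 2.0, 3.0]]                       # one input row
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # first layer, 3x2
w2 = [[2.0, 1.0], [0.0, 3.0]]               # second layer, 2x2

two_step = matmul(matmul(x, w1), w2)        # apply the layers one after the other
one_step = matmul(x, matmul(w1, w2))        # apply the single merged matrix
print(single, stacked, two_step, one_step)
```

Both routes give the same output, so the extra layer only changes how the same linear map is parameterized. Once a ReLU sits between the two layers, as in the later `nn.Sequential` model, this equivalence no longer holds.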

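The same network, after switching the training to 1cycle.

As the number of training iterations grows, the weights keep changing and drift further and further away from the shape of the digit.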

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai.vision import *
import warnings
warnings.filterwarnings("ignore")

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2" 

from fastai.basics import *
path = Path('/home/gdyanfa1/zhouhairong_py/fastai_dataset')

with gzip.open(path/'mnist.pkl.gz','rb') as f:
    ((x_train, y_train), (x_valid, y_valid), _) = pickle.load(f, encoding='latin-1')

x_train,y_train,x_valid,y_valid = map(torch.tensor, (x_train,y_train,x_valid,y_valid))
bs = 128
train_ds = TensorDataset(x_train, y_train)
valid_ds = TensorDataset(x_valid, y_valid)
# Pass bs explicitly; otherwise DataBunch.create falls back to its default batch size
data = DataBunch.create(train_ds, valid_ds, bs=bs)

class Mnist_Logistic(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(784,10,bias=True)
    
    def forward(self, xb):
        return self.lin(xb)
# Instantiate the model and move it to the GPU
model = Mnist_Logistic().cuda()

lr=2e-2
loss_func = nn.CrossEntropyLoss()

def update(x,y,lr):
    wd = 1e-5
    y_hat = model(x) # calling the module runs forward(), because nn.Module.__call__ dispatches to it
    w2 = 0.
    for p in model.parameters():
        w2 += (p**2).sum()
    loss = loss_func(y_hat, y) + w2*wd
    # Because w2*wd is part of the loss, autograd includes the weight-decay term in every gradient automatically.
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p.sub_(lr*p.grad)
            p.grad.zero_()

    return loss.item()

losses = [update(x,y,lr) for x,y in data.train_dl]
plt.plot(losses);

# Train for one more epoch, then view the weight row of output unit 0 as a 28x28 image
losses = [update(x,y,lr) for x,y in data.train_dl]
t = model.lin.weight.detach().cpu()
import matplotlib.pyplot as plt
plt.imshow(t[0,:].view(28,28))


# Use a single linear layer and look at the result
learn = Learner(data, Mnist_Logistic(), loss_func=loss_func, metrics=accuracy)
learn.fit_one_cycle(1, 1e-2)

# Show the weight rows of the first 9 output units as 28x28 images
t = learn.model.lin.weight.detach().cpu()
import matplotlib.pyplot as plt
fig, axes = plt.subplots(3,3)
for i, ax in enumerate(axes.flatten()):
    ax.imshow(t[i,:].view(28,28))
    ax.set_title(i)
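Going back to the hand-written update() function earlier in this section: adding `w2*wd` to the loss means every parameter receives an extra gradient of `2*wd*p`, which is the same as shrinking the weight slightly before the ordinary SGD step. The sketch below checks this on a single scalar weight. The function names are made up for the example, and the weight and gradient values are arbitrary.

```python
def step_with_l2_penalty(p, grad, lr, wd):
    """SGD step when wd * p**2 is added to the loss (its derivative is 2 * wd * p)."""
    return p - lr * (grad + 2 * wd * p)

def step_with_weight_decay(p, grad, lr, wd):
    """The same step written as a shrink of the weight followed by a plain SGD step."""
    return p * (1 - 2 * lr * wd) - lr * grad

p, grad = 0.5, 0.1      # arbitrary weight and data-loss gradient
lr, wd = 2e-2, 1e-5     # the values used in update()
a = step_with_l2_penalty(p, grad, lr, wd)
b = step_with_weight_decay(p, grad, lr, wd)
print(a, b)
```

The two results agree up to floating-point rounding. The factor 2 comes from differentiating `p**2`; formulations that write the penalty as `wd/2 * p**2` end up with the shrink factor `1 - lr*wd` instead.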

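1. Debugging pets.ipynb

https://nbviewer.org/github/fastai/course-v3/blob/master/nbs/dl1/lesson1-pets.ipynb
I debug my own copy, hosted on Gitee.
Debugging against a local copy of the fastai v1 library makes it much easier to see what is going on.

Training set and validation set: 80% and 20%, with the randomness fixed by a seed.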
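A fixed 80/20 split simply means shuffling the item indices with a seeded random generator and cutting the list. fastai does this internally with NumPy; the function below is a hypothetical plain-Python stand-in that only illustrates the idea, with `split_indices` and its defaults chosen for this example.

```python
import random

def split_indices(n_items, valid_pct=0.2, seed=2):
    """Return (train_idx, valid_idx) from a shuffled range, reproducible for a given seed."""
    rng = random.Random(seed)      # private generator, so the global random state is untouched
    idx = list(range(n_items))
    rng.shuffle(idx)
    cut = int(n_items * valid_pct)
    return idx[cut:], idx[:cut]

train_idx, valid_idx = split_indices(100)
print(len(train_idx), len(valid_idx))
```

Calling the function again with the same seed returns the same split, which is what makes validation results comparable between runs.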

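Image transforms: which data augmentations are applied?
- crop_pad: random crop, with the gaps filled by reflection padding
- flip_lr: horizontal mirror flip
- warp: perspective transform
- rotation
- zoom
- contrast adjustment
- brightness adjustment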
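0.1 resnet34 pretrained model

Which layers are trainable at the start? The BN layers. Why?
- The last two Linear layers and all BN layers are trainable. Why?
- The last two linear layers form the classifier; the final one has 512*37+37 = 18981 parameters.
Freezing layers: learn.freeze() is essentially freeze_to(-1), i.e. only the final custom_head is trained. This is like training only the YOLO layer or the recognition layer.
- None of the BN layers are frozen; freeze() only freezes the convolutional layers (in fastai v1 this is controlled by the Learner's train_bn flag, which is True by default). freeze() does not freeze the last layer group either.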

Sequential
======================================================================
Layer (type)         Output Shape         Param #    Trainable 
======================================================================
Conv2d               [64, 112, 112]       9,408      False     
______________________________________________________________________
BatchNorm2d          [64, 112, 112]       128        True      
______________________________________________________________________
ReLU                 [64, 112, 112]       0          False     
______________________________________________________________________
MaxPool2d            [64, 56, 56]         0          False     
______________________________________________________________________
Conv2d               [64, 56, 56]         36,864     False     
______________________________________________________________________
BatchNorm2d          [64, 56, 56]         128        True      
______________________________________________________________________
ReLU                 [64, 56, 56]         0          False     
______________________________________________________________________
Conv2d               [64, 56, 56]         36,864     False     
______________________________________________________________________
BatchNorm2d          [64, 56, 56]         128        True      
______________________________________________________________________
Conv2d               [64, 56, 56]         36,864     False     
______________________________________________________________________
BatchNorm2d          [64, 56, 56]         128        True      
______________________________________________________________________
ReLU                 [64, 56, 56]         0          False     
______________________________________________________________________
Conv2d               [64, 56, 56]         36,864     False     
______________________________________________________________________
BatchNorm2d          [64, 56, 56]         128        True      
______________________________________________________________________
Conv2d               [64, 56, 56]         36,864     False     
______________________________________________________________________
BatchNorm2d          [64, 56, 56]         128        True      
______________________________________________________________________
ReLU                 [64, 56, 56]         0          False     
______________________________________________________________________
Conv2d               [64, 56, 56]         36,864     False     
______________________________________________________________________
BatchNorm2d          [64, 56, 56]         128        True      
______________________________________________________________________
Conv2d               [128, 28, 28]        73,728     False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
ReLU                 [128, 28, 28]        0          False     
______________________________________________________________________
Conv2d               [128, 28, 28]        147,456    False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
Conv2d               [128, 28, 28]        8,192      False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
Conv2d               [128, 28, 28]        147,456    False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
ReLU                 [128, 28, 28]        0          False     
______________________________________________________________________
Conv2d               [128, 28, 28]        147,456    False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
Conv2d               [128, 28, 28]        147,456    False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
ReLU                 [128, 28, 28]        0          False     
______________________________________________________________________
Conv2d               [128, 28, 28]        147,456    False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
Conv2d               [128, 28, 28]        147,456    False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
ReLU                 [128, 28, 28]        0          False     
______________________________________________________________________
Conv2d               [128, 28, 28]        147,456    False     
______________________________________________________________________
BatchNorm2d          [128, 28, 28]        256        True      
______________________________________________________________________
Conv2d               [256, 14, 14]        294,912    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
ReLU                 [256, 14, 14]        0          False     
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
Conv2d               [256, 14, 14]        32,768     False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
ReLU                 [256, 14, 14]        0          False     
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
ReLU                 [256, 14, 14]        0          False     
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
ReLU                 [256, 14, 14]        0          False     
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
ReLU                 [256, 14, 14]        0          False     
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
ReLU                 [256, 14, 14]        0          False     
______________________________________________________________________
Conv2d               [256, 14, 14]        589,824    False     
______________________________________________________________________
BatchNorm2d          [256, 14, 14]        512        True      
______________________________________________________________________
Conv2d               [512, 7, 7]          1,179,648  False     
______________________________________________________________________
BatchNorm2d          [512, 7, 7]          1,024      True      
______________________________________________________________________
ReLU                 [512, 7, 7]          0          False     
______________________________________________________________________
Conv2d               [512, 7, 7]          2,359,296  False     
______________________________________________________________________
BatchNorm2d          [512, 7, 7]          1,024      True      
______________________________________________________________________
Conv2d               [512, 7, 7]          131,072    False     
______________________________________________________________________
BatchNorm2d          [512, 7, 7]          1,024      True      
______________________________________________________________________
Conv2d               [512, 7, 7]          2,359,296  False     
______________________________________________________________________
BatchNorm2d          [512, 7, 7]          1,024      True      
______________________________________________________________________
ReLU                 [512, 7, 7]          0          False     
______________________________________________________________________
Conv2d               [512, 7, 7]          2,359,296  False     
______________________________________________________________________
BatchNorm2d          [512, 7, 7]          1,024      True      
______________________________________________________________________
Conv2d               [512, 7, 7]          2,359,296  False     
______________________________________________________________________
BatchNorm2d          [512, 7, 7]          1,024      True      
______________________________________________________________________
ReLU                 [512, 7, 7]          0          False     
______________________________________________________________________
Conv2d               [512, 7, 7]          2,359,296  False     
______________________________________________________________________
BatchNorm2d          [512, 7, 7]          1,024      True      
______________________________________________________________________
AdaptiveAvgPool2d    [512, 1, 1]          0          False     
______________________________________________________________________
AdaptiveMaxPool2d    [512, 1, 1]          0          False     
______________________________________________________________________
Flatten              [1024]               0          False     
______________________________________________________________________
BatchNorm1d          [1024]               2,048      True      
______________________________________________________________________
Dropout              [1024]               0          False     
______________________________________________________________________
Linear               [512]                524,800    True      
______________________________________________________________________
ReLU                 [512]                0          False     
______________________________________________________________________
BatchNorm1d          [512]                1,024      True      
______________________________________________________________________
Dropout              [512]                0          False     
______________________________________________________________________
Linear               [37]                 18,981     True      
______________________________________________________________________

Total params: 21,831,525
Total trainable params: 563,877
Total non-trainable params: 21,267,648
Optimized with 'torch.optim.adam.Adam', betas=(0.9, 0.99)
Using true weight decay as discussed in https://www.fast.ai/2018/07/02/adam-weight-decay/ 
Loss function : FlattenedLoss
======================================================================
Callbacks functions applied
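The trainable total in the summary above can be reproduced by hand: a Linear layer has n_in*n_out weights plus n_out biases, and a batch-norm layer has one scale and one shift per channel. The sketch below does the arithmetic in plain Python. The layer counts per channel width are read off the summary table, and the helper names are made up for this example.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def bn_params(channels):
    """Learnable scale and shift of a batch-norm layer."""
    return 2 * channels

# BatchNorm2d layers in the frozen body, counted from the summary: (channels, number of layers)
body_bn = [(64, 7), (128, 9), (256, 13), (512, 7)]
body_bn_total = sum(bn_params(c) * n for c, n in body_bn)

# Custom head: BatchNorm1d(1024), Linear(1024, 512), BatchNorm1d(512), Linear(512, 37)
head = [bn_params(1024), linear_params(1024, 512), bn_params(512), linear_params(512, 37)]

trainable = body_bn_total + sum(head)
print(body_bn_total, sum(head), trainable)
```

The sum matches the "Total trainable params" line, which confirms that in the frozen state only the batch-norm layers and the head are being updated.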

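0.2 Training procedure

First freeze the network and train only the final custom_head (the linear layers), i.e. the classifier, or the YOLO layer in a detector.
Then unfreeze the network and train the earlier convolutional layers together with the head. The result got worse.
After unfreeze(), search for a new learning rate.
Training again brings the error back down. The likely reason is that the early and late layers are now trained with different learning rates: the learning rate of the early layers must not be too high, while the later layers can take a higher one.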
- learn.unfreeze()
  learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-4))
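To my understanding, fastai v1 turns `slice(1e-6,1e-4)` into one learning rate per layer group, spaced by a constant multiplier from the first group to the last. The sketch below re-implements that spacing in plain Python for illustration only. The function name `geometric_lrs` is invented here, and the choice of 3 layer groups is an assumption for the example.

```python
def geometric_lrs(start, stop, n_groups):
    """Return n_groups learning rates from start to stop, each a constant multiple of the previous one."""
    if n_groups == 1:
        return [stop]
    step = (stop / start) ** (1 / (n_groups - 1))
    return [start * step ** i for i in range(n_groups)]

lrs = geometric_lrs(1e-6, 1e-4, 3)   # 3 layer groups is an assumption
for group, lr in enumerate(lrs):
    print(f"layer group {group}: lr = {lr:.1e}")
```

With these numbers the earliest layers get roughly 1e-6, the middle group 1e-5, and the head 1e-4, which is the "low for early layers, higher for later layers" pattern described above.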

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2" # the machine shows 4 GPUs; use the one with index 2

import sys
sys.path.insert(0,'/home/gdyanfa1/zhouhairong_py/course-v3/nbs/dl1/fastai1')

from fastai.vision import *
from fastai.metrics import error_rate
from fastai import *

import warnings
warnings.filterwarnings("ignore")

bs = 64
path = Path('/home/gdyanfa1/zhouhairong_py/fastai_dataset/oxford-iiit-pet')

path_anno = path/'annotations'
path_img = path/'images'
fnames = get_image_files(path_img)

np.random.seed(2)
pat = r'/([^/]+)_\d+.jpg$'
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs
                                  ).normalize(imagenet_stats)


learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(4)
learn.save('stage-1')

interp = ClassificationInterpretation.from_learner(learn)

losses,idxs = interp.top_losses()

len(data.valid_ds)==len(losses)==len(idxs)

interp.most_confused(min_val=2)

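# Unfreeze everything and train one epoch with the default learning rate (this made the result worse)
learn.unfreeze()
learn.fit_one_cycle(1)
# Restore the stage-1 weights, then search for a better learning rate
learn.load('stage-1')
learn.lr_find()
# Retrain with different learning rates: small for the early layers, larger for the later ones
learn.unfreeze()
learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-4))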
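The regular expression `pat` used above extracts the class label from each file name: it captures everything between the last `/` and the trailing `_<number>.jpg`. A quick check with Python's `re` module, using made-up paths in the style of the pets image folder:

```python
import re

pat = r'/([^/]+)_\d+.jpg$'

# Hypothetical file paths, used only to exercise the pattern
fnames = ['/data/images/great_pyrenees_173.jpg', '/data/images/Bengal_12.jpg']
labels = [re.search(pat, f).group(1) for f in fnames]
print(labels)
```

Breed names that themselves contain underscores still work, because the greedy `[^/]+` gives back only as much as is needed for the final `_\d+` part to match.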

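1. Using a linear model

Reference: "Summary of parameter initialization methods in PyTorch" (ys1305's blog on CSDN, keyword reset_parameters)

model = nn.Sequential(nn.Linear(m, nh), nn.ReLU(), nn.Linear(nh, c))  # m inputs, nh hidden units, c classes, all defined in the code below

PyTorch's default initialization is implemented in each layer's reset_parameters() method.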
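As an illustration of what that default amounts to: in the PyTorch versions I am aware of, nn.Linear.reset_parameters() initializes the weight with kaiming_uniform_ and a = sqrt(5). Working through the formula in plain Python shows that this reduces to a uniform range of ±1/sqrt(fan_in). The helper below is a hand-written sketch of the bound computation, not a call into PyTorch.

```python
import math

def kaiming_uniform_bound(fan_in, a):
    """Half-width of the uniform range used by Kaiming uniform init with leaky-ReLU slope a."""
    gain = math.sqrt(2.0 / (1 + a ** 2))
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std

fan_in = 784                                  # inputs of the first MNIST layer
bound = kaiming_uniform_bound(fan_in, a=math.sqrt(5))
print(bound, 1 / math.sqrt(fan_in))
```

The init_linear helper defined further down uses a = 0.1 instead, which gives a larger gain and therefore a wider weight distribution than this default.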

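# Classify the MNIST dataset here and work on improving the accuracy
from exp.nb_09c import *
""" 0. Data preparation
        The hand-written DataBunch, ItemList, etc. interfaces are not used here: ImageList's get() has to open image files.
    MNIST still goes through PyTorch's DataLoader interface
"""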

x_train,y_train,x_valid,y_valid = get_data() # get_data is defined in nb_02.py
x_train,x_valid = normalize_to(x_train,x_valid)  # defined in nb_05.py
n,m = x_train.shape
c = y_train.max().item() + 1
bs = 512

# Use Dataset to manage the batched data: nb_03.py
train_ds,valid_ds = Dataset(x_train, y_train),Dataset(x_valid, y_valid)
# DataBunch is in nb_08.py; get_dls is in nb_03.py and uses DataLoader
data = DataBunch(*get_dls(train_ds, valid_ds, bs), c)

loss_func = F.cross_entropy

""" 1. Linear model (50, 10), using PyTorch's nn.Module base class without refactoring it
"""
nh = 50

def init_linear_(m, f):
    if isinstance(m, nn.Linear):
        f(m.weight, a=0.1)
        if getattr(m, 'bias', None) is not None: m.bias.data.zero_()
    for l in m.children(): init_linear_(l, f)

def init_linear(m, uniform=False):
    f = init.kaiming_uniform_ if uniform else init.kaiming_normal_
    init_linear_(m, f)

# ① Model: this is a custom linear model, so no custom initialization is applied (PyTorch defaults are kept)
model = nn.Sequential(nn.Linear(m, nh), nn.ReLU(), nn.Linear(nh, c))

lr = 0.5
# get_runner is in nb_06.py; since this is not a CNN, get_cnn_runner does not apply
# Use get_runner rather than get_learner
# device = torch.device('cuda', 0)
# torch.cuda.set_device(device)
cbfs = [partial(AvgStatsCallback, accuracy), CudaCallback, Recorder, ProgressCallback]

phases = combine_scheds([0.3, 0.7], cos_1cycle_anneal(0.2, 0.6, 0.2))
sched = ParamScheduler('lr', phases)
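The schedule built by the two lines above spends the first 30% of training raising the learning rate from 0.2 to 0.6 along a half cosine, and the remaining 70% bringing it back down to 0.2. The sketch below re-implements that shape in plain Python so the values can be inspected. It is not the code from the course notebooks, and the names `sched_cos`, `combine` and `lr_at` are chosen for this example.

```python
import math

def sched_cos(start, end):
    """Return a function pos -> value that anneals from start to end along a half cosine."""
    def _inner(pos):
        return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2
    return _inner

def combine(pcts, scheds):
    """Chain schedules; pcts gives the share of training that each one covers."""
    bounds = [0.0]
    for p in pcts:
        bounds.append(bounds[-1] + p)
    def _inner(pos):
        # Index of the phase that contains pos (the last phase also owns pos == 1)
        idx = max(i for i in range(len(scheds)) if pos >= bounds[i])
        local = (pos - bounds[idx]) / (bounds[idx + 1] - bounds[idx])
        return scheds[idx](local)
    return _inner

lr_at = combine([0.3, 0.7], [sched_cos(0.2, 0.6), sched_cos(0.6, 0.2)])
values = [round(lr_at(p), 3) for p in (0.0, 0.15, 0.3, 0.65, 1.0)]
print(values)
```

Here `pos` is the fraction of training completed, from 0 to 1. The peak sits at 30% of training, which is the warm-up followed by a long annealing phase that 1cycle training relies on.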

# Learner is in nb_09b.py: linear model, cross-entropy loss, lr, cbfs, opt. Learner.fit contains the optimizer initialization.
# ② Optimizer (nb_09b.py): plain SGD gradient descent; weight_decay is L2 regularization
learn = Learner(model=model, data=data, loss_func=loss_func, lr=lr, cb_funcs=cbfs)

# Extra callbacks (cbs) can be added when calling fit

# sgd: p = p - lr*p.grad 
# weight_decay: p = p * ( 1 - lr*wd)
def append_stats(hook, mod, inp, outp):
    if not hasattr(hook,'stats'): hook.stats = ([],[],[])
    means,stds,hists = hook.stats
    means.append(outp.data.mean().cpu()) # mean of the layer's activations
    stds .append(outp.data.std().cpu())
    hists.append(outp.data.cpu().histc(40,0,10)) #histc isn't implemented on the GPU

def get_hist(h): 
    return torch.stack(h.stats[2]).t().float().log1p()  # h.stats[2] holds the histograms

with Hooks(model, append_stats) as hooks:
    learn.fit(1)    # pytorch_init + sgd
    fig, [ax0, ax1] = plt.subplots(1,2, figsize=(10,4))
    for h in hooks:
        ms, ss, hi = h.stats
        ax0.plot(ms), ax0.set_title("act_means", loc='center'), ax0.set_xlabel('batches')
        ax0.legend(range(3))
        ax1.plot(ss), ax1.set_title("act_stds", loc='center'), ax1.set_xlabel('batches')
        ax1.legend(range(3))

fig,axes = plt.subplots(2,2, figsize=(15,6))
for ax,h in zip(axes.flatten(), hooks[:3]):
    ax.imshow(get_hist(h), origin='lower'), ax.set_title("acts_hist", loc='center'), ax.set_xlabel('activations')
    ax.axis('off')
plt.tight_layout()

def get_min(h): # fraction of activations that fall into the first two histogram bins
    h1 = torch.stack(h.stats[2]).t().float()
    return h1[:2].sum(0)/h1.sum(0)

fig,axes = plt.subplots(2,2, figsize=(15,6))
for ax,h in zip(axes.flatten(), hooks[:3]):
    ax.plot(get_min(h)), ax.set_title("hist[:2] zero ratio", loc='center'), ax.set_xlabel('batches')
    ax.set_ylim(0,1)
plt.tight_layout()
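The "zero ratio" plotted by get_min is the share of activations that land in the first two of the 40 histogram bins over [0, 10], i.e. values below 0.5. The plain-Python sketch below shows the same computation for a single batch. The helpers `histc` and `near_zero_ratio` are hand-written stand-ins for this example, and the activation values are made up.

```python
def histc(values, bins, lo, hi):
    """Count values into equal-width bins over [lo, hi]; values outside the range are ignored."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v <= hi:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts

def near_zero_ratio(values, bins=40, lo=0.0, hi=10.0):
    """Share of the counted values that fall into the first two bins."""
    counts = histc(values, bins, lo, hi)
    total = sum(counts)
    return sum(counts[:2]) / total if total else 0.0

acts = [0.0, 0.0, 0.1, 0.3, 0.6, 1.2, 2.5, 4.0]   # made-up ReLU outputs
ratio = near_zero_ratio(acts)
print(ratio)
```

A ratio that stays close to 1 across batches means most units in that layer are producing nearly nothing, which is the symptom these plots are meant to reveal.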