day1
1. Linear Regression
(1) Basic elements:
Model: y = w * x + b
Dataset: a training set and a test set
Loss function: the squared loss; for a single example, l(w, b) = (1/2) * (y_hat - y)^2
Optimization algorithm: mini-batch stochastic gradient descent (SGD), which updates the parameters over many iterations so that each iteration reduces the value of the loss function (a sketch of one update step follows below).
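A minimal sketch of one such SGD update step (the function name sgd and its signature are assumptions, mirroring the common from-scratch version of this course, not the notes above):
import torch

def sgd(params, lr, batch_size):
    # one update: move each parameter against its gradient,
    # averaging the accumulated gradient over the mini-batch
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size
            param.grad.zero_()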
(2) Implementation with PyTorch
import torch
from torch import nn
import numpy as np
torch.manual_seed(1)
print(torch.__version__)
torch.set_default_tensor_type('torch.FloatTensor')
num_inputs = 2
num_examples = 1000
true_w = [2, -3.4]
true_b = 4.2
features = torch.tensor(np.random.normal(0, 1, (num_examples, num_inputs)), dtype=torch.float)
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += torch.tensor(np.random.normal(0, 0.01, size=labels.size()), dtype=torch.float)
import torch.utils.data as Data
batch_size = 10
# combine features and labels of dataset
dataset = Data.TensorDataset(features, labels)
# put dataset into DataLoader
data_iter = Data.DataLoader(
    dataset=dataset,        # torch TensorDataset format
    batch_size=batch_size,  # mini batch size
    shuffle=True,           # whether to shuffle the data or not
    num_workers=2,          # read data using multiple worker processes
)
for X, y in data_iter:
    print(X, '\n', y)
    break
class LinearNet(nn.Module):
    def __init__(self, n_feature):
        super(LinearNet, self).__init__()  # call the parent class constructor to initialize
        self.linear = nn.Linear(n_feature, 1)  # function prototype: `torch.nn.Linear(in_features, out_features, bias=True)`

    def forward(self, x):
        y = self.linear(x)
        return y
net = LinearNet(num_inputs)
print(net)
# ways to init a multilayer network
# method one
net = nn.Sequential(
    nn.Linear(num_inputs, 1)
    # other layers can be added here
)
# method two
net = nn.Sequential()
net.add_module('linear', nn.Linear(num_inputs, 1))
# net.add_module ......
# method three
from collections import OrderedDict
net = nn.Sequential(OrderedDict([
    ('linear', nn.Linear(num_inputs, 1))
    # ......
]))
print(net)
print(net[0])
from torch.nn import init
init.normal_(net[0].weight, mean=0.0, std=0.01)
init.constant_(net[0].bias, val=0.0) # or you can use `net[0].bias.data.fill_(0)` to modify it directly
for param in net.parameters():
    print(param)
loss = nn.MSELoss() # nn built-in squared loss function
# function prototype: `torch.nn.MSELoss(size_average=None, reduce=None, reduction='mean')`
import torch.optim as optim
optimizer = optim.SGD(net.parameters(), lr=0.03)  # built-in stochastic gradient descent optimizer
print(optimizer) # function prototype: `torch.optim.SGD(params, lr=, momentum=0, dampening=0, weight_decay=0, nesterov=False)`
num_epochs = 3
for epoch in range(1, num_epochs + 1):
    for X, y in data_iter:
        output = net(X)
        l = loss(output, y.view(-1, 1))
        optimizer.zero_grad()  # reset gradients, equivalent to net.zero_grad()
        l.backward()
        optimizer.step()
    print('epoch %d, loss: %f' % (epoch, l.item()))
# result comparison
dense = net[0]
print(true_w, dense.weight.data)
print(true_b, dense.bias.data)
2. Softmax and Classification Models
(1) Cross-entropy loss function: measures the difference between the predicted probability distribution and the true label.
(2) Softmax: maps the raw outputs to values in (0, 1) that sum to 1, so they can be interpreted as probabilities; the class whose output value is largest is taken as the predicted class (a small sketch follows below).
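A minimal sketch of softmax followed by cross-entropy; the logits and the label are illustrative assumptions, not from the notes:
import torch

logits = torch.tensor([[1.0, 2.0, 0.5]])  # illustrative raw outputs for one example
probs = logits.softmax(dim=1)             # values in (0, 1), each row sums to 1
pred = probs.argmax(dim=1)                # class with the largest output is the prediction
print(probs, pred)
label = torch.tensor([1])
ce = -torch.log(probs[0, label])          # cross-entropy: negative log-probability of the true class
print(ce)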
3. Multilayer Perceptron (MLP): a neural network with an input layer, an output layer, and one or more hidden layers. Note that stacking fully connected (affine) layers alone is still equivalent to a single-layer network, which is why a nonlinearity is needed between layers (see the sketch below).
Activation function: a nonlinear transformation applied to a layer's output before it is fed into the next fully connected layer; this nonlinear function is called the activation function.
Common activation functions:
ReLU(x) = max(x, 0), i.e. keep the positive part and set negative values to zero.
Sigmoid function: squashes each element into (0, 1).
Tanh function: squashes each element into (-1, 1).
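A minimal MLP sketch with a ReLU activation between the two affine layers (the layer sizes are illustrative assumptions):
import torch
from torch import nn

mlp = nn.Sequential(
    nn.Linear(2, 16),  # input layer -> hidden layer
    nn.ReLU(),         # nonlinear activation between the affine layers
    nn.Linear(16, 1),  # hidden layer -> output layer
)
print(mlp(torch.randn(4, 2)).shape)  # torch.Size([4, 1])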
day2
1. Text Preprocessing
(1) Read in the text
import collections
import re
def read_time_machine():
    with open('/home/kesci/input/timemachine7163/timemachine.txt', 'r') as f:
        lines = [re.sub('[^a-z]+', ' ', line.strip().lower()) for line in f]
    return lines
lines = read_time_machine()
print('# sentences %d' % len(lines))
(2) Tokenization
def tokenize(sentences, token='word'):
    """Split sentences into word or char tokens"""
    if token == 'word':
        return [sentence.split(' ') for sentence in sentences]
    elif token == 'char':
        return [list(sentence) for sentence in sentences]
    else:
        print('ERROR: unknown token type ' + token)
tokens = tokenize(lines)
tokens[0:2]
(3) Build the vocabulary
class Vocab(object):
    def __init__(self, tokens, min_freq=0, use_special_tokens=False):
        counter = count_corpus(tokens)  # count the frequency of each token
        self.token_freqs = list(counter.items())
        self.idx_to_token = []
        if use_special_tokens:
            # padding, begin of sentence, end of sentence, unknown
            self.pad, self.bos, self.eos, self.unk = (0, 1, 2, 3)
            self.idx_to_token += ['<pad>', '<bos>', '<eos>', '<unk>']
        else:
            self.unk = 0
            self.idx_to_token += ['<unk>']
        self.idx_to_token += [token for token, freq in self.token_freqs
                              if freq >= min_freq and token not in self.idx_to_token]
        self.token_to_idx = dict()
        for idx, token in enumerate(self.idx_to_token):
            self.token_to_idx[token] = idx

    def __len__(self):
        return len(self.idx_to_token)

    def __getitem__(self, tokens):
        if not isinstance(tokens, (list, tuple)):
            return self.token_to_idx.get(tokens, self.unk)
        return [self.__getitem__(token) for token in tokens]

    def to_tokens(self, indices):
        if not isinstance(indices, (list, tuple)):
            return self.idx_to_token[indices]
        return [self.idx_to_token[index] for index in indices]

def count_corpus(sentences):
    tokens = [tk for st in sentences for tk in st]
    return collections.Counter(tokens)  # a Counter mapping each token to its number of occurrences
(4) Convert the text from a sequence of words to a sequence of indices
vocab = Vocab(tokens)  # build the vocabulary first
for i in range(8, 10):
    print('words:', tokens[i])
    print('indices:', vocab[tokens[i]])
2. Language Models
(1) Definition: a language model evaluates whether a sequence is plausible, i.e. it assigns the sequence a probability. The parameters of the model are the word probabilities and the conditional probabilities of each word given the previous few words.
(2) n-gram: under a Markov assumption of order n-1, the probability of a sequence is approximated by conditioning each word only on the previous n-1 words; for example, with a bigram (n=2), P(w1, w2, w3, w4) ≈ P(w1) P(w2|w1) P(w3|w2) P(w4|w3). A small counting sketch follows after this list.
(3) Random sampling: each sample is a sequence cut from an arbitrary position in the original sequence, so two adjacent random mini-batches are not necessarily adjacent in the original sequence (see the sampling sketch below).
Consecutive sampling: two adjacent random mini-batches are adjacent in the original sequence.
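A minimal sketch of the bigram estimate from (2) by counting; it reuses the tokens list from the preprocessing step above, and the flattening, the smoothing-free estimate, and the example words are illustrative assumptions:
import collections

corpus = [tk for line in tokens for tk in line if tk]  # flatten to one token stream
unigram_counts = collections.Counter(corpus)
bigram_counts = collections.Counter(zip(corpus[:-1], corpus[1:]))
# maximum-likelihood estimate: P(w2 | w1) = count(w1, w2) / count(w1)
w1, w2 = 'the', 'time'
print(bigram_counts[(w1, w2)] / unigram_counts[w1])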
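And a minimal sketch of the random sampling scheme in (3): each mini-batch is cut from random positions, with the label sequence shifted one step to the right; the function name data_iter_random and its details follow the common course implementation and are an assumption:
import random
import torch

def data_iter_random(corpus_indices, batch_size, num_steps):
    # each example is a length-num_steps slice; its label is the slice shifted by one
    num_examples = (len(corpus_indices) - 1) // num_steps
    example_indices = [i * num_steps for i in range(num_examples)]
    random.shuffle(example_indices)  # adjacent batches need not be adjacent in the text
    for i in range(0, num_examples - num_examples % batch_size, batch_size):
        batch_starts = example_indices[i: i + batch_size]
        X = [corpus_indices[j: j + num_steps] for j in batch_starts]
        Y = [corpus_indices[j + 1: j + 1 + num_steps] for j in batch_starts]
        yield torch.tensor(X), torch.tensor(Y)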
3. Recurrent Neural Network Basics
(1) Construction: at each time step t the hidden state is updated as H_t = phi(X_t W_xh + H_{t-1} W_hh + b_h), and the output is O_t = H_t W_hq + b_q, where phi is the activation function.
(2) One-hot vector: if the index of a character is i, then the i-th position of the vector is 1 and all other positions are 0 (a sketch follows at the end of this section).
(3) Gradient clipping: recurrent neural networks are prone to vanishing or exploding gradients, which can make the network nearly impossible to train. Gradient clipping (clip gradient) is a technique for dealing with exploding gradients. Suppose we concatenate the gradients of all model parameters into a single vector g and set a clipping threshold theta; the clipped gradient min(theta / ||g||, 1) * g then has an L2 norm no larger than theta (a sketch follows at the end of this section).
(4) Perplexity: used to evaluate how good a language model is. Perplexity is the exponential of the cross-entropy loss, i.e. perplexity = exp(cross-entropy). In particular:
in the best case, the model always predicts the probability of the true label class as 1, and the perplexity is 1;
in the worst case, the model always predicts the probability of the true label class as 0, and the perplexity is positive infinity;
in the baseline case, the model predicts the same probability for every class, and the perplexity equals the number of classes.
Clearly, any useful model must have a perplexity smaller than the number of classes.
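A minimal one-hot sketch for (2), assuming a vocabulary of size 4 and illustrative character indices:
import torch
import torch.nn.functional as F

indices = torch.tensor([0, 2])  # character indices
print(F.one_hot(indices, num_classes=4))
# tensor([[1, 0, 0, 0],
#         [0, 0, 1, 0]])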
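A minimal sketch of the clipping rule in (3), min(theta/||g||, 1) * g, written against a parameter list; the function name grad_clipping follows the common course implementation and is an assumption:
import torch

def grad_clipping(params, theta):
    # L2 norm of all gradients concatenated into one vector g
    norm = 0.0
    for param in params:
        norm += (param.grad.data ** 2).sum().item()
    norm = norm ** 0.5
    if norm > theta:
        for param in params:
            param.grad.data *= (theta / norm)  # scale so that ||g|| <= theta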
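And a small sketch showing perplexity as the exponential of the cross-entropy loss in (4); the random logits and labels are illustrative assumptions:
import math
import torch
from torch import nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(8, 5)       # 8 predictions over 5 classes
labels = torch.randint(0, 5, (8,))
loss = loss_fn(logits, labels)
print('perplexity:', math.exp(loss.item()))  # roughly the number of classes for an untrained model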