pytorch基本操作2-CSDN博客

本文链接：https://blog.csdn.net/xihuanniNI/article/details/147333216

torch.clamp

主要用于对张量中的元素进行截断（clamping），将其限制在一个指定的区间范围内。

函数定义

torch.clamp(input, min=None, max=None) → Tensor

参数说明

input
类型：Tensor
需要进行截断操作的输入张量。

min
类型：float 或 None（默认值）
指定张量中元素的最小值。小于 min 的元素会被截断为 min 值。
如果设置为 None，则表示不限制最小值。

max
类型：float 或 None（默认值）
指定张量中元素的最大值。大于 max 的元素会被截断为 max 值。
如果设置为 None，则表示不限制最大值。

返回值

返回一个新的张量，其中元素已经被限制在 ([min, max]) 的范围内。
原张量不会被修改（函数是非原地操作），除非使用 torch.clamp_ 以进行原地操作。

使用场景

torch.clamp 常见的应用场景包括：

避免数值溢出：限制张量中的数值在合理范围内，防止出现过大或过小的值导致数值不稳定。
归一化操作：将张量值限制在 [0, 1] 或 [-1, 1] 的范围。
梯度截断：在训练神经网络时避免梯度爆炸或梯度消失问题。

demo

仅设置最大值或最小值

x = torch.tensor([0.5, 2.0, -1.0, 3.0, -2.0])

# 仅限制最大值为 1.0
result_max = torch.clamp(x, max=1.0)

# 仅限制最小值为 0.0
result_min = torch.clamp(x, min=0.0)

print(result_max)  # 输出: tensor([0.5000, 1.0000, -1.0000, 1.0000, -2.0000])
print(result_min)  # 输出: tensor([0.5000, 2.0000, 0.0000, 3.0000, 0.0000])

torch.nonzero

返回 input中非零元素的索引下标

无论输入是几维，输出张量都是两维，每行代表输入张量中非零元素的索引位置(在所有维度上面的位置)，行数表示非零元素的个数。

2维度

import torch
import torch.nn.functional as F
import math


X=torch.tensor([[0, 2, 0,  4],
                [5, 0, 7,  8]])
index = torch.nonzero(X)
# index = X.nonzero()
print(index)

id_t = index.t_()
print(id_t)

tensor([[0, 1],
[0, 3],
[1, 0],
[1, 2],
[1, 3]])
tensor([[0, 0, 1, 1, 1],
[1, 3, 0, 2, 3]])

3维度

返回一个二维张量，其中每一行代表一个非零元素的索引。

# 创建一个三维张量
y = torch.tensor([[[0, 1], [0, 2]],
                  [[3, 0], [4, 0]],
                  [[5, 0], [0, 6]]])

# 找到非零元素的索引
indices_3d = torch.nonzero(y)

print(indices_3d)

tensor([[0, 0, 1],
[0, 1, 1],
[1, 0, 0],
[1, 1, 0],
[2, 0, 0],
[2, 1, 1]])

torch.sparse_coo_tensor

torch.sparse_coo_tensor(indices, values, size=None, *, dtype=None, device=None, requires_grad=False, check_invariants=None, is_coalesced=None)

创建一个Coordinate(COO) 格式的稀疏矩阵.

稀疏矩阵指矩阵中的大多数元素的值都为0，由于其中非常多的元素都是0，使用常规方法进行存储非常的浪费空间，所以采用另外的方法存储稀疏矩阵。

》》

使用一个三元组来表示矩阵中的一个非0数，三元组分别表示元素（所在行，所在列，元素值），也就是上图中每一个竖的三元组就是表示了一个非零数，其余值都为0，这样就存储了一个稀疏矩阵.构造这样个矩阵需要知道所有非零数所在的行、所在的列、非零元素的值和矩阵的大小这四个值。

        indices：此参数是指定非零元素所在的位置，也就是行和列，所以此参数应该是一个二维的数组，当然它可以是很多格式（list, tuple, NumPy ndarray, scalar, and other types. ）第一维指定了所有非零数所在的行数，第二维指定了所有非零元素所在的列数。例如indices=[[1, 4, 6], [3, 6, 7]]表示我们稀疏矩阵中(1, 3),(4, 6), (6, 7)几个位置是非零的数所在的位置。
        values：此参数指定了非零元素的值，所以此矩阵长度应该和上面的indices一样长也可以是很多格式（list, tuple, NumPy ndarray, scalar, and other types.）。例如values=[1, 4, 5]表示上面的三个位置非零数分别为1, 4, 5。
        size：指定了稀疏矩阵的大小，例如size=[10, 10]表示矩阵大小为 10 × 10 ，此大小最小应该足以覆盖上面非零元素所在的位置，如果不给定此值，那么默认是生成足以覆盖所有非零值的最小矩阵大小。
        dtype：指定返回tensor中数据的类型，如果不指定，那么采取values中数据的类型。
        device：指定创建的tensor在cpu还是cuda上。
        requires_grad：指定创建的tensor需不需要梯度信息，默认为False

demo1

import torch

indices = torch.tensor([[4, 2, 1], [2, 0, 2]])
values = torch.tensor([3, 4, 5], dtype=torch.float32)
x = torch.sparse_coo_tensor(indices=indices, values=values, size=[5, 5])
print(x)
print(x.shape)
print('-----------------------------')
print(x.to_dense())

tensor(indices=tensor([[4, 2, 1],
[2, 0, 2]]),
values=tensor([3., 4., 5.]),
size=(5, 5), nnz=3, layout=torch.sparse_coo)

torch.Size([5, 5])
-----------------------------
tensor([[0., 0., 0., 0., 0.],
[0., 0., 5., 0., 0.],
[4., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 3., 0., 0.]])

demo2

未指定size，输出的大小是根据索引indices每个维度上最大的数字决定

[4,2,1] 最大下标是4，因为是下标从0开始，所以size=5；同样[2,0,2]，size=3

import torch

indices = torch.tensor([[4, 2, 1], [2, 0, 2]])
values = torch.tensor([3, 4, 5], dtype=torch.float32)
x = torch.sparse_coo_tensor(indices=indices, values=values)
print(x)
print(x.shape)
print('-----------------------------')
print(x.to_dense())

tensor(indices=tensor([[4, 2, 1],
[2, 0, 2]]),
values=tensor([3., 4., 5.]),
size=(5, 3), nnz=3, layout=torch.sparse_coo)
torch.Size([5, 3])
-----------------------------
tensor([[0., 0., 0.],
[0., 0., 5.],
[4., 0., 0.],
[0., 0., 0.],
[0., 0., 3.]])

demo3

indices and values must have same nnz。indices nnz = indices行方向的维度；values nnz=values 行的维度

如果指定size,size每个维度数值大小要 >= indices values的维度。

比如dim=0，indices行下标[0,1,2]，维度至少3；dim=1，indices列下标[1,2，3]，维度至少4；dim=2，values列维度是2。所以Size>=[3, 4, 2]

如果未指定size，按照 [indices行下标维度，indices列下标维度，values列维度]

# indices：shape为[2,n]，有2行，分别是行下标向量和列下标向量

index = torch.Tensor([[0, 1, 2], [1, 2, 3]])  
value = torch.randn((3, 2))
x = torch.sparse_coo_tensor(indices=index, values=value)

print(x)
print(x.shape)
print(x.to_dense())

tensor(indices=tensor([[0, 1, 2],
[1, 2, 3]]),
values=tensor([[-0.9341, -0.5250],
[-0.0472, 0.6725],
[-1.3630, -0.4698]]),
size=(3, 4, 2), nnz=3, layout=torch.sparse_coo)
torch.Size([3, 4, 2])
tensor([[[ 0.0000, 0.0000],
[-0.9341, -0.5250],
[ 0.0000, 0.0000],
[ 0.0000, 0.0000]],

[[ 0.0000, 0.0000],
[ 0.0000, 0.0000],
[-0.0472, 0.6725],
[ 0.0000, 0.0000]],

[[ 0.0000, 0.0000],
[ 0.0000, 0.0000],
[ 0.0000, 0.0000],
[-1.3630, -0.4698]]])

demo4

index = torch.Tensor([[0, 1, 3], [1, 2, 3]])  
value = torch.randn((3, 4))
x = torch.sparse_coo_tensor(indices=index, values=value)

print(x)
print(x.shape)
print(x.to_dense())

tensor(indices=tensor([[0, 1, 3],
[1, 2, 3]]),
values=tensor([[ 2.3780, 0.7919, -1.5508, 0.6087],
[ 1.4725, 1.3381, -0.3287, 0.0247],
[-0.3950, 0.4286, -0.0651, -1.0453]]),
size=(4, 4, 4), nnz=3, layout=torch.sparse_coo)
torch.Size([4, 4, 4])
tensor([[[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 2.3780, 0.7919, -1.5508, 0.6087],
[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000]],

[[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 1.4725, 1.3381, -0.3287, 0.0247],
[ 0.0000, 0.0000, 0.0000, 0.0000]],

[[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000]],

[[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.0000, 0.0000, 0.0000, 0.0000],
[-0.3950, 0.4286, -0.0651, -1.0453]]])

torch.max

1个tensor

import torch
import torch.nn.functional as F
import math


# 创建一个二维张量
x = torch.tensor([[1, 3, 2], [4, 5, 6], [7, 8, 9]])

# 返回最大值
value = torch.max(x)
print("value:", value)

# 返回最大值及其索引
max_value, max_index = torch.max(x, 0)
print("Max value:", max_value)
print("Max index:", max_index)

# 在第一个维度（行）上查找最大值，并保留形状
max_value_row, max_index_row = torch.max(x, 0, keepdim=True)
print("dim=0 keeping shape: ", max_value_row)
print("dim=0 keeping shape: ", max_index_row)

value: tensor(9)
Max value: tensor([7, 8, 9])
Max index: tensor([2, 2, 2])
dim=0 keeping shape: tensor([[7, 8, 9]])
dim=0 keeping shape: tensor([[2, 2, 2]])

2个tensor iou

boxes1 的形状为 (boxes1的数量, 4)，每个边界框由4个值表示（例如 [x_min, y_min, x_max, y_max]）。
boxes2 的形状为 (boxes2的数量, 4)。
areas1 的形状为 (boxes1的数量,)，每个值表示对应边界框的面积。
areas2 的形状为 (boxes2的数量,)。

使用 torch.max 和 torch.min 来分别计算交集区域的左上角和右下角坐标。boxes1[:, None, :2] 会在 boxes1 的第一个维度上增加一个维度，从而使得 boxes1 的形状变为 (boxes1的数量, 1, 2)，使其与 boxes2 进行广播操作。

    area1 = (bboxes1[..., 2] - bboxes1[..., 0]) * (
        bboxes1[..., 3] - bboxes1[..., 1])
    area2 = (bboxes2[..., 2] - bboxes2[..., 0]) * (
        bboxes2[..., 3] - bboxes2[..., 1])

    lt = torch.max(bboxes1[:, None, :2], #bboxes1 [38,4], bboxes2 [80000,4], lt [38,80000,2]
                       bboxes2[None, :, :2])  # [rows, cols, 2]
    rb = torch.min(bboxes1[:, None, 2:],
                       bboxes2[None, :, 2:])  # [rows, cols, 2]

    wh = fp16_clamp(rb - lt, min=0)
    overlap = wh[..., 0] * wh[..., 1]
    union = area1[..., None] + area2[..., None, :] - overlap  #[38,80000]
    eps = union.new_tensor([eps])
    union = torch.max(union, eps)
    ious = overlap / union

torch.Tensor.new_full

根据现有Tensor创建新张量，保持一致的dtype、device, 大小尺寸不一定相同

new_full(size, fill_value, dtype=None, device=None, requires_grad=False) -> Tensor

参数详解

size: 新tensor大小

fill_value (scalar): 填充输出张量的数字。
dtype (:class:torch.dtype, optional): 所需的返回张量类型。默认：None,表示类型相同。
device (:class:torch.device, optional): 所需的返回张量的设备。默认：None,表示设备相同
requires_grad :返回的张量是否记录梯度信息。默认：False

src = torch.ones((2,), dtype=torch.float32)
dst=src.new_full((1, 4), 5)
print(dst.dtype, dst.device, dst.requires_grad)
print(dst)

torch.float32 cpu False
tensor([[5., 5., 5., 5.]])

torch.all

torch.all(input, dim=None, keepdim=False)

torch.all 用于判断张量中是否所有元素都为 True（或非零值）。如果是，则返回 True；否则返回 False。

返回值

如果未指定 dim，返回一个布尔值。
如果指定了 dim，返回一个沿指定维度的结果张量。

import torch
 
x = torch.tensor([[1, 2, 0], [3, 4, 5]])
 
# 检查所有元素是否都为非零（True）
print(torch.all(x))  # False
 
# 沿维度 1 检查
print(torch.all(x, dim=1))  # tensor([False,  True])
 
# 保留维度
print(torch.all(x, dim=1, keepdim=True))  # tensor([[False], [ True]])

torch.any

torch.any(input, dim=None, keepdim=False)

torch.any 用于判断张量中是否至少有一个元素为 True（或非零值）。如果是，则返回 True；否则返回 False。

返回值

如果未指定 dim，返回一个布尔值。
如果指定了 dim，返回一个沿指定维度的结果张量。

x = torch.tensor([[0, 0, 0], [1, 0, 0]])
 
# 检查是否存在非零（True）元素
print(torch.any(x))  # True
 
# 沿维度 1 检查
print(torch.any(x, dim=1))  # tensor([False,  True])
 
# 保留维度
print(torch.any(x, dim=1, keepdim=True))  # tensor([[False], [ True]])

torch.cat

torch.cat(tensors, dim=0) # dim 指定拼接的维度

作用：沿现有维度拼接多个张量，不创建新维度
输入要求：所有张量的形状必须除拼接维度外完全相同。
返回值：拼接后的张量在指定维度上的大小是输入张量该维度大小的总和，其他维度必须完全一致。

a = torch.tensor([[1, 2], [3, 4]])  # shape (2, 2)
b = torch.tensor([[5, 6]])           # shape (1, 2)
 
# 沿 dim=0 拼接（行方向）
c = torch.cat([a, b], dim=0)
print(c)
# tensor([[1, 2],
#         [3, 4],
#         [5, 6]])  # shape (3, 2)

torch.stack

作用：沿新维度堆叠多个张量，创建新维度。
输入要求：所有张量的形状必须完全相同。
返回值：输出张量比输入张量多一个维度。

a = torch.tensor([1, 2])  # shape (2,)
b = torch.tensor([3, 4])  # shape (2,)
 
# 沿新维度 dim=0 堆叠
c = torch.stack([a, b], dim=0)
print(c)
# tensor([[1, 2],
#         [3, 4]])  # shape (2, 2)
 
# 沿新维度 dim=1 堆叠
d = torch.stack([a, b], dim=1)
print(d)
# tensor([[1, 3],
#         [2, 4]])  # shape (2, 2)

None

None的作用主要是在使用None的位置新增一个维度

切片 ...

省略号的作用是省略前面所有的 ':' 索引操作

x = torch.randint(0,10,(2,3,4))
print(x)

print(x[...,:2]==x[:,:,:2])# 当x有3个维度时，这二者相等。

x[…, :2] # 前面的维度都取，最后一个维度只取前两个元素。

loss function

F.binary_cross_entropy

F.binary_cross_entropy(
input: Tensor, # 预测输入
target: Tensor, # 标签
weight: Optional[Tensor] = None, # 权重可选项
size_average: Optional[bool] = None, # 可选项，快被弃用了
reduce: Optional[bool] = None,
reduction: str = "mean", # 默认均值或求和等形式
) -> Tensor:

input: 形状为 (N, *) 的张量，表示模型的预测结果，取值范围可以是任意实数。
target: 形状为 (N, *) 的张量，表示真实标签，取值为 0 或 1。

import torch
import torch.nn.functional as F
import math


row, col = 2, 3
x = torch.randn(row, col)


predict = torch.sigmoid(x)

index = torch.randint(col, size = (row,), dtype=torch.long)
y = torch.zeros(row, col)
y[range(y.shape[0]), index] = 1
print(range(y.shape[0]))

loss1 = F.binary_cross_entropy(predict, y)
print("loss1 ",loss1)

def binary_cross_entropy(predict, y):
    loss = -torch.sum(y * torch.log(predict) + (1 - y) * torch.log(1 - predict)) / torch.numel(y)
    return loss

loss2 = binary_cross_entropy(predict, y)
print("loss2 ", loss2)

range(0, 2)
loss1 tensor(1.1794)
loss2 tensor(1.1794)

F.binary_cross_entropy_with_logits

相比F.binary_cross_entropy函数，F.binary_cross_entropy_with_logits函数在内部使用了sigmoid函数.

F.binary_cross_entropy_with_logits = sigmoid + F.binary_cross_entropy。

`F.cross_entropy`

多分类交叉熵损失函数

F.cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

参数说明：

input：(N, C)形状的张量，其中N为Batch_size，C为类别数。该参数对应于神经网络最后一个全连接层输出的未经Softmax处理的结果。
target：一个大小为(N,)张量，其值是0 <= targets[i] <= C-1的元素，其中C是类别数。该参数包含一组给定的真实标签（ground truth）。
weight：采用类别平衡的加权方式计算损失值。可以传入一个大小为(C,)张量，其中weight[j]是类别j的权重。默认为None。
size_average：弃用，可以忽略。该参数已经被reduce参数取代了。
ignore_index：指定被忽略的目标值的索引。如果目标值等于该索引，则不计算该样本的损失。默认值为-100，即不忽略任何目标值。
reduce：指定返回的损失值的方式。可以是“None”（不返回损失值）、“mean”（返回样本损失值的平均值）和“sum”（返回样本损失值的总和）。默认值为“mean”。
reduction：与reduce参数等价。表示返回的损失值的方式。默认值为“mean”。

F.cross_entropy函数与nn.CrossEntropyLoss类是相似的，但前者更适合于控制更多的细节。

由于内部已经使用SoftMax函数处理，故两者的输入都不需要使用SoftMax处理。

import torch
import torch.nn.functional as F
import math


# redictions：一个2维tensor，表示模型的预测结果。它的形状是(2, 3)，其中2是样本数量，3是类别数量。每一行对应一个样本的预测结果，每个元素表示该类别的概率。
# labels：一个1维tensor，表示样本的真实标签。它的形状是(2,)，其中2是样本数量。
predictions = torch.tensor([[0.2, 0.3, 0.5], [0.8, 0.1, 0.1]])
labels = torch.tensor([2, 0])

# 定义每个类别的权重
weights = torch.tensor([1.0, 2.0, 3.0])

# 使用F.cross_entropy计算带权重的交叉熵损失
loss = F.cross_entropy(predictions, labels, weight=weights)
print(loss) # tensor(0.8773)

# 测试计算过程
pred = F.softmax(predictions, dim=1)
loss2 = -(weights[2] * math.log(pred[0, 2]) + weights[0]*math.log(pred[1, 0]))/4  # 4 = 1+3 对应权重之和
print(loss2) # 0.8773049571540321

# 对于第一个样本，它的预测结果为pred[0]，真实标签为2。根据交叉熵损失的定义，我们可以计算出它的损失为：
# -weights[2] * math.log(pred[0, 2])
# 对于第二个样本，它的预测结果为pred[1]，真实标签为0。根据交叉熵损失的定义，我们可以计算出它的损失为：
# -weights[0] * math.log(pred[1, 0])

torch.nn.BCELoss

torch.nn.BCELoss(
weight=None,
size_average=None,
reduce=None,
reduction='mean' # 默认计算的是批量样本损失的平均值,还可以为'sum'或者'none'
)

torch.nn.BCEWithLogitsLoss

torch.nn.BCEWithLogitsLoss(
weight=None,
size_average=None,
reduce=None,
reduction='mean', # 默认计算的是批量样本损失的平均值,还可以为'sum'或者'none'
pos_weight=None
)

import numpy as np
import torch
from torch import nn
import torch.nn.functional as F

y = torch.tensor([1, 0, 1], dtype=torch.float)
y_hat = torch.tensor([0.8, 0.2, 0.4], dtype=torch.float)


bce_loss = nn.BCELoss()

# nn.BCELoss()需要先对输入数据进行sigmod
print("官方BCELoss = ", bce_loss(torch.sigmoid(y_hat), y))

# nn.BCEWithLogitsLoss()不需要自己sigmod
bcelogits_loss = nn.BCEWithLogitsLoss()
print("官方BCEWithLogitsLoss = ", bcelogits_loss(y_hat, y))

# 我们根据二分类交叉熵损失函数实现：
def loss(y_hat, y):
    y_hat = torch.sigmoid(y_hat)
    l = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    l = sum(l) / len(l)
    return l

print('自己实现Loss = ', loss(y_hat, y))

# 可以看到结果值相同
官方BCELoss = tensor(0.5608)
官方BCEWithLogitsLoss = tensor(0.5608)
自己实现Loss = tensor(0.5608)

BCELoss与BCEWithLogitsLoss的关联：BCEWithLogitsLoss = Sigmoid + BCELoss

import numpy as np
import torch
from torch import nn
import torch.nn.functional as F

y = torch.tensor([1, 0, 1], dtype=torch.float)
y_hat = torch.tensor([0.8, 0.2, 0.4], dtype=torch.float)


bce_loss = nn.BCELoss()

# nn.BCELoss()需要先对输入数据进行sigmod
print("官方BCELoss = ", bce_loss(torch.sigmoid(y_hat), y))

# nn.BCEWithLogitsLoss()不需要自己sigmod
bcelogits_loss = nn.BCEWithLogitsLoss()
print("官方BCEWithLogitsLoss = ", bcelogits_loss(y_hat, y))

# 我们根据二分类交叉熵损失函数实现：
def loss(y_hat, y):
    y_hat = torch.sigmoid(y_hat)
    l = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    l = sum(l) / len(l)
    return l

print('自己实现Loss = ', loss(y_hat, y))

# 可以看到结果值相同
官方BCELoss = tensor(0.5608)
官方BCEWithLogitsLoss = tensor(0.5608)
自己实现Loss = tensor(0.5608)

torch.nn.CrossEntropyLoss()

torch.nn.CrossEntropyLoss(
weight=None,
size_average=None,
ignore_index=-100,
reduce=None,
reduction='mean',
label_smoothing=0.0 # 同样，默认计算的是批量样本损失的平均值,还可以为'sum'或者'none'
)

在关于二分类的问题中，输入交叉熵公式的网络预测值必须经过Sigmoid进行映射
而在多分类问题中，需要将Sigmoid替换成Softmax，这是两者的一个重要区别
CrossEntropyLoss = softmax+log+nll_loss的集成

多分类交叉熵损失函数

cross_loss = nn.CrossEntropyLoss(reduction="none") # 设置为none，这里输入每个样本的loss值，不计算平均值
target = torch.tensor([0, 1, 2])

predict = torch.tensor([[0.9, 0.2, 0.8],
                        [0.5, 0.2, 0.4],
                        [0.4, 0.2, 0.9]]) # 未经过softmax
print('官方实现CrossEntropyLoss: ', cross_loss(predict, target))


# 自己实现方便理解版本的CrossEntropyLoss
def cross_loss(predict, target, reduction=None):
    all_loss = []
    for index, value in enumerate(target):
        # 利用多分类简化后的公式，对每一个样本求loss值
        loss = -predict[index][value] + torch.log(sum(torch.exp(predict[index])))
        all_loss.append(loss)
    all_loss = torch.stack(all_loss)
    if reduction == 'none':
        return all_loss
    else:
        return torch.mean(all_loss)

print('实现方便理解的CrossEntropyLoss: ', cross_loss(predict, target, reduction='none'))

# 利用F.nll_loss实现的CrossEntropyLoss
def cross_loss2(predict, target, reduction=None):
    # Softmax的缺点：
    # 1、如果有得分值特别大的情况，会出现上溢情况；
    # 2、如果很小的负值很多，会出现下溢情况（超出精度范围会向下取0），分母为0，导致计算错误。
    # 引入log_softmax可以解决上溢和下溢问题
    logsoftmax = F.log_softmax(predict)
    print('target = ', target)
    print('logsoftmax:\n', logsoftmax)
    # nll_loss不是将标签值转换为one-hot编码，而是根据target的值，索引到对应元素，然后取相反数。
    loss = F.nll_loss(logsoftmax, target, reduction=reduction)
    return loss

print('F.nll_loss实现的CrossEntropyLoss: ', cross_loss2(predict, target, reduction='none'))

官方实现CrossEntropyLoss: tensor([0.8761, 1.2729, 0.7434])
实现方便理解的CrossEntropyLoss: tensor([0.8761, 1.2729, 0.7434])

target = tensor([0, 1, 2])
logsoftmax:
tensor([[-0.8761, -1.5761, -0.9761],
[-0.9729, -1.2729, -1.0729],
[-1.2434, -1.4434, -0.7434]])
F.nll_loss实现的CrossEntropyLoss: tensor([0.8761, 1.2729, 0.7434])