# torch.nn.LayerNorm(normalized_shape, eps) normalizes each sample's data to zero mean and unit variance
# (normalization does not change the tensor's shape).
# normalized_shape specifies the shape of the data to be normalized; eps is a tiny positive number that
# prevents division by zero.
import torch
import torch.nn as nn

# Create two random tensors and cast them to float
a1 = torch.randint(0, 10, (3, 4)).to(torch.float32)
a2 = torch.randint(0, 10, (2, 3, 4)).to(torch.float32)

a1_1 = nn.LayerNorm(4, eps=1e-5)  # layer norm whose normalized shape is 4, i.e. a 1-D tensor of length 4
normalized_a1 = a1_1(a1)          # apply the layer norm to a1
print(normalized_a1)
"""
tensor([[-0.9733, -0.9733,  0.6489,  1.2978],
        [ 1.3628, -0.7338, -1.1531,  0.5241],
        [ 1.2309, -0.9847, -0.9847,  0.7385]], grad_fn=<NativeLayerNormBackward0>)
"""
print(normalized_a1.shape)  # torch.Size([3, 4]): normalization does not change the shape

a2_1 = nn.LayerNorm([3, 4], eps=1e-5)  # layer norm whose normalized shape is [3, 4], i.e. a 3x4 2-D tensor
normalized_a2 = a2_1(a2)               # apply the layer norm to a2
print(normalized_a2)
"""
tensor([[[ 0.6562, -1.1334, -0.7755,  1.0141],
         [-0.4176, -2.2072, -0.7755,  0.6562],
         [ 0.6562,  1.0141,  0.2983,  1.0141]],

        [[ 0.1690,  0.5747,  0.5747, -0.6423],
         [-1.0480, -1.0480,  0.9804,  1.3861],
         [-0.6423,  0.1690, -1.8593,  1.3861]]], grad_fn=<NativeLayerNormBackward0>)
"""
print(normalized_a2.shape)  # torch.Size([2, 3, 4])

# Helper that returns the mean and standard deviation of a tensor.
# Note: tensor.std() is the unbiased estimator (divides by n - 1), while LayerNorm normalizes with the
# biased variance, so the reported std is sqrt(n / (n - 1)), slightly above 1, rather than exactly 1.
def compute_mean_var(tensor):
    mean = tensor.mean().item()
    std = tensor.std().item()
    return mean, std

a1_chunked = normalized_a1.chunk(3, dim=0)  # split normalized_a1 into 3 chunks along dim=0; each chunk has shape [1, 4]
print(a1_chunked[0].shape)                  # torch.Size([1, 4]); the layer norm's normalized shape is (4,)
compute_mean_var(a1_chunked[0])             # (5.960464477539063e-08, 1.1546999216079712) ~= (0, sqrt(4/3))
torch.sum(a1_chunked[0])                    # torch.sum(tensor) sums all elements of the first chunk: 2.3842e-07 ~= 0

a2_chunked = normalized_a2.chunk(2, dim=0)  # split normalized_a2 into 2 chunks along dim=0, returning a tuple; each chunk has shape [1, 3, 4]
print(a2_chunked[0].shape)                  # torch.Size([1, 3, 4]); the layer norm's normalized shape is [3, 4]
compute_mean_var(a2_chunked[0])             # (1.1920928955078125e-07, 1.0444653034210205) ~= (0, sqrt(12/11))

# Note: 1e-5 is Python scientific notation: 1 is the mantissa and -5 the power of ten,
# so 1e-5 is the float 1 * 10**(-5) == 0.00001.

"""
Summary: for torch.nn.LayerNorm(normalized_shape, eps), the data being normalized has shape normalized_shape.
If a tensor x has shape [N, *normalized_shape] (* unpacks the list, dropping the brackets), LayerNorm
normalizes each of the N sub-tensors of shape normalized_shape independently; e.g. with
normalized_shape [3, 4] and x.shape [2, 3, 4], each of the two [3, 4] 2-D tensors is normalized and the
results are stacked back together.
Note that normalized_shape is a trailing part of the data's shape, aligned to the right (matched from the
last dimension backwards), e.g. normalized_shape [3, 4] against x.shape [2, 3, 4].
"""
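
# A minimal sketch (not from the original notes) verifying the summary above: LayerNorm over
# normalized_shape is equivalent to normalizing each trailing block of that shape with the biased
# (population) variance, and normalized_shape must match the trailing dimensions of the input.
# The variable names x, ln, manual are illustrative.
x = torch.randint(0, 10, (2, 3, 4)).to(torch.float32)

# elementwise_affine=False drops the learnable scale/shift so the output is the pure normalization;
# with the default affine parameters (weight=1, bias=0 at init) the result would be identical anyway.
ln = nn.LayerNorm([3, 4], eps=1e-5, elementwise_affine=False)

# Manual computation over the last two dims (the right-aligned normalized_shape),
# using the biased variance (unbiased=False), as LayerNorm does.
mean = x.mean(dim=(-2, -1), keepdim=True)
var = x.var(dim=(-2, -1), unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + 1e-5)

print(torch.allclose(ln(x), manual, atol=1e-6))  # True

# normalized_shape must be a suffix of the input shape: [3, 4] works for a [2, 3, 4] input,
# but a normalized_shape of 3 does not, because the last dimension is 4.
try:
    nn.LayerNorm(3)(x)
except RuntimeError as e:
    print("shape mismatch:", e)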