Norms
1. Vector norms
Consider a vector $\mathbf{x}=(x_1, x_2, \dots, x_n)$.
$L_p$ norm
The $L_p$ norm is the general form of a family of norms that includes the $L_0$ norm, the $L_1$ norm, the $L_2$ norm, and so on:
$$\|\mathbf{x}\|_p=\sqrt[p]{\sum_i{|x_i|^p}}$$
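1.1 $L_0$ norm
$$\|\mathbf{x}\|_0=\vert\{1\leq i\leq n \mid x_i\neq0\}\vert$$
This is the number of non-zero elements in the vector. Formally substituting $p=0$ into the $L_p$ formula gives $\sqrt[0]{\sum_i{|x_i|^0}}$, whose zeroth root is undefined, so the count above is taken as the definition (with the convention $0^0=0$). Strictly speaking, the $L_0$ "norm" is not a true norm.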
1.2 $L_1$ norm
$$\|\mathbf{x}\|_1=\sqrt[1]{\sum_i{|x_i|^1}}=\sum_i{|x_i|}$$
This is the sum of the absolute values of the vector's elements.
1.3 $L_2$ norm
$$\|\mathbf{x}\|_2=\sqrt[2]{\sum_i{|x_i|^2}}$$
This is the Euclidean distance between the vector $\mathbf{x}$ and the origin.
1.4 $L_{\infty}$ norm
$$\|\mathbf{x}\|_{\infty}=\max_i{|x_i|}$$
The positive-infinity norm is the largest absolute value among the vector's elements.
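1.5 $L_{-\infty}$ norm
$$\|\mathbf{x}\|_{-\infty}=\min_i{|x_i|}$$
The negative-infinity norm is the smallest absolute value among the vector's elements. Like the $L_0$ case, it is not a true norm, since it can be zero for a non-zero vector.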
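The vector definitions above can be checked with a few lines of plain Python. This is only a minimal sketch: the helper names `lp_norm`, `l0_count`, `linf_norm`, and `neg_inf_value` are made up for illustration and do not come from any library.

```python
def lp_norm(x, p):
    """General L_p formula for p >= 1: (sum_i |x_i|^p)^(1/p)."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def l0_count(x):
    """Number of non-zero entries (the L_0 'norm')."""
    return sum(1 for v in x if v != 0)

def linf_norm(x):
    """Largest absolute value (L_inf norm)."""
    return max(abs(v) for v in x)

def neg_inf_value(x):
    """Smallest absolute value (L_{-inf} 'norm')."""
    return min(abs(v) for v in x)

x = [3.0, -4.0, 0.0]
print(l0_count(x))       # 2
print(lp_norm(x, 1))     # 7.0
print(lp_norm(x, 2))     # 5.0
print(linf_norm(x))      # 4.0
print(neg_inf_value(x))  # 0.0
# As p grows, the L_p norm approaches the L_inf norm.
print(lp_norm(x, 50))    # very close to 4.0
```

The last line illustrates why the maximum absolute value is called the $L_{\infty}$ norm: it is the limit of the $L_p$ formula as $p$ grows.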
2. Matrix norms
Consider a matrix $\mathbf{A}\in \mathbb{R}^{m\times n}$.
2.1 1-norm
$$\|\mathbf{A}\|_1=\max_j{\sum_{i=1}^m{|a_{ij}|}}$$
This is the maximum absolute column sum. A matrix element may also be written as $a_{i,j}$.
2.2 2-norm
$$\Vert\mathbf{A}\Vert_2=\sqrt{\lambda_1}$$
where $\lambda_1$ is the largest eigenvalue of $\mathbf{A}^{\mathrm{T}}\mathbf{A}$. This norm is called the spectral norm.
2.3 $\infty$-norm
$$\|\mathbf{A}\|_{\infty}=\max_i{\sum_{j=1}^n{|a_{ij}|}}$$
This is the maximum absolute row sum.
2.4 Frobenius (Fro) norm
On whether the subscript F should be set in bold: following an existing paper [1], bold may be used.
$$\|\mathbf{A}\|_{\mathbf{F}}=\left(\sum_{i=1}^m{\sum_{j=1}^n{a_{ij}^2}}\right)^{\frac{1}{2}}$$
Its square is often used instead:
$$\|\mathbf{A}\|_{\mathbf{F}}^2=\sum_{i=1}^m{\sum_{j=1}^n{a_{ij}^2}}$$
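The 1-norm, $\infty$-norm, and Frobenius norm need only sums and maxima, so they are easy to verify by hand. The following plain-Python sketch uses illustrative function names (`matrix_1_norm`, `matrix_inf_norm`, `frobenius_norm`) that are assumptions for this example, with a matrix stored as a list of rows.

```python
import math

def matrix_1_norm(A):
    """Maximum absolute column sum."""
    n_cols = len(A[0])
    return max(sum(abs(row[j]) for row in A) for j in range(n_cols))

def matrix_inf_norm(A):
    """Maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)

def frobenius_norm(A):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in A for v in row))

A = [[1.0, 2.0, 3.0],
     [-1.0, 1.0, 4.0]]

print(matrix_1_norm(A))    # 7.0  (column sums are 2, 3, 7)
print(matrix_inf_norm(A))  # 6.0  (row sums are 6, 6)
print(frobenius_norm(A))   # sqrt(32), about 5.657
```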
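2.5 Nuclear norm
The nuclear norm is the sum of a matrix's singular values, $\|\mathbf{A}\|_*=\sum_i{\sigma_i}$, and is used to encourage low rank. For data with sparse structure, the corresponding matrix is low-rank and contains a large amount of redundant information, which can be exploited to recover the data and extract features. [2]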
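The spectral norm (section 2.2) and the nuclear norm both depend on an eigenvalue or singular value decomposition, so a library is the practical way to check them. The sketch below assumes a reasonably recent PyTorch where `torch.linalg.eigvalsh`, `torch.linalg.svdvals`, and `torch.linalg.matrix_norm` are available, and compares each definition with the built-in result; the example matrix and variable names are arbitrary.

```python
import torch

A = torch.tensor([[1.0, 2.0, 3.0],
                  [-1.0, 1.0, 4.0]])

# Spectral norm from the definition: square root of the largest
# eigenvalue of A^T A (eigvalsh returns eigenvalues in ascending order).
eigvals = torch.linalg.eigvalsh(A.T @ A)
spectral_from_eig = eigvals[-1].sqrt()
spectral = torch.linalg.matrix_norm(A, ord=2)

# Nuclear norm from the definition: sum of the singular values.
nuclear_from_svd = torch.linalg.svdvals(A).sum()
nuclear = torch.linalg.matrix_norm(A, ord='nuc')

print(spectral_from_eig, spectral)
print(nuclear_from_svd, nuclear)
```

Each pair should agree up to floating-point error.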
2.6 $l_{2,1}$ norm [3]
Take the $l_2$ norm of each row vector, then take the $l_1$ norm of the resulting column vector of row norms.
$$\Vert\mathbf{A}\Vert_{2,1}=\sum_{i=1}^m\sqrt{\sum_{j=1}^n\vert a_{ij}\vert^2}$$
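3. Computing norms with PyTorch [4]
For setting up the PyTorch environment, see the previous blog post. The signature and docstring of torch.norm from the PyTorch source are reproduced below.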
def norm(input, p="fro", dim=None, keepdim=False, out=None, dtype=None):  # noqa: F811
    r"""Returns the matrix norm or vector norm of a given tensor.

    .. warning::

        torch.norm is deprecated and may be removed in a future PyTorch release.

        Use :func:`torch.linalg.norm`, instead, or :func:`torch.linalg.vector_norm`
        when computing vector norms and :func:`torch.linalg.matrix_norm` when
        computing matrix norms. Note, however, the signature for these functions
        is slightly different than the signature for torch.norm.

    Args:
        input (Tensor): The input tensor. Its data type must be either a floating
            point or complex type. For complex inputs, the norm is calculated using the
            absolute value of each element. If the input is complex and neither
            :attr:`dtype` nor :attr:`out` is specified, the result's data type will
            be the corresponding floating point type (e.g. float if :attr:`input` is
            complexfloat).

        p (int, float, inf, -inf, 'fro', 'nuc', optional): the order of norm. Default: ``'fro'``
            The following norms can be calculated:

            ======  ==============  ==========================
            ord     matrix norm     vector norm
            ======  ==============  ==========================
            'fro'   Frobenius norm  --
            'nuc'   nuclear norm    --
            Number  --              sum(abs(x)**ord)**(1./ord)
            ======  ==============  ==========================

            The vector norm can be calculated across any number of dimensions.
            The corresponding dimensions of :attr:`input` are flattened into
            one dimension, and the norm is calculated on the flattened
            dimension.

            Frobenius norm produces the same result as ``p=2`` in all cases
            except when :attr:`dim` is a list of three or more dims, in which
            case Frobenius norm throws an error.

            Nuclear norm can only be calculated across exactly two dimensions.

        dim (int, tuple of ints, list of ints, optional):
            Specifies which dimension or dimensions of :attr:`input` to
            calculate the norm across. If :attr:`dim` is ``None``, the norm will
            be calculated across all dimensions of :attr:`input`. If the norm
            type indicated by :attr:`p` does not support the specified number of
            dimensions, an error will occur.
        keepdim (bool, optional): whether the output tensors have :attr:`dim`
            retained or not. Ignored if :attr:`dim` = ``None`` and
            :attr:`out` = ``None``. Default: ``False``
        out (Tensor, optional): the output tensor. Ignored if
            :attr:`dim` = ``None`` and :attr:`out` = ``None``.
        dtype (:class:`torch.dtype`, optional): the desired data type of
            returned tensor. If specified, the input tensor is casted to
            :attr:'dtype' while performing the operation. Default: None.

    .. note::
        Even though ``p='fro'`` supports any number of dimensions, the true
        mathematical definition of Frobenius norm only applies to tensors with
        exactly two dimensions. :func:`torch.linalg.norm` with ``ord='fro'`` aligns
        with the mathematical definition, since it can only be applied across
        exactly two dimensions.

    Example::

        >>> import torch
        >>> a = torch.arange(9, dtype= torch.float) - 4
        >>> b = a.reshape((3, 3))
        >>> torch.norm(a)
        tensor(7.7460)
        >>> torch.norm(b)
        tensor(7.7460)
        >>> torch.norm(a, float('inf'))
        tensor(4.)
        >>> torch.norm(b, float('inf'))
        tensor(4.)
        >>> c = torch.tensor([[ 1, 2, 3],[-1, 1, 4]] , dtype= torch.float)
        >>> torch.norm(c, dim=0)
        tensor([1.4142, 2.2361, 5.0000])
        >>> torch.norm(c, dim=1)
        tensor([3.7417, 4.2426])
        >>> torch.norm(c, p=1, dim=1)
        tensor([6., 6.])
        >>> d = torch.arange(8, dtype= torch.float).reshape(2,2,2)
        >>> torch.norm(d, dim=(1,2))
        tensor([ 3.7417, 11.2250])
        >>> torch.norm(d[0, :, :]), torch.norm(d[1, :, :])
        (tensor(3.7417), tensor(11.2250))
    """
import math
import torch

x = torch.arange(9, dtype=torch.float) - 4
y = x.reshape((3, 3))
# The default is the Frobenius norm
print("torch.norm(x) = {}".format(torch.norm(x)))
print("torch.norm(y) = {}".format(torch.norm(y)))
# Manual check: square root of the sum of squares
sum_sq = 0.
for v in x.tolist():
    sum_sq += v**2
print(math.sqrt(sum_sq))
# Infinity norms
print("torch.norm(y, float('inf')) = {}".format(torch.norm(y, float('inf'))))
print("torch.norm(y, float('-inf')) = {}".format(torch.norm(y, float('-inf'))))
c = torch.tensor([[1, 2, 3], [-1, 1, 4]], dtype=torch.float)
print("torch.norm(c, dim=0) = {}".format(torch.norm(c, dim=0)))
print("torch.norm(c, dim=1) = {}".format(torch.norm(c, dim=1)))
print("torch.norm(c, p=1, dim=1) = {}".format(torch.norm(c, p=1, dim=1)))
d = torch.arange(8, dtype=torch.float).reshape(2, 2, 2)
print("torch.norm(d, dim=(1,2)) = {}".format(torch.norm(d, dim=(1, 2))))
Output:
torch.norm(x) = 7.745966911315918
torch.norm(y) = 7.745966911315918
7.745966692414834
torch.norm(y, float('inf')) = 4.0
torch.norm(y, float('-inf')) = 0.0
torch.norm(c, dim=0) = tensor([1.4142, 2.2361, 5.0000])
torch.norm(c, dim=1) = tensor([3.7417, 4.2426])
torch.norm(c, p=1, dim=1) = tensor([6., 6.])
torch.norm(d, dim=(1,2)) = tensor([ 3.7417, 11.2250])
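The docstring of PyTorch's norm function already explains its behavior clearly. Note that it marks torch.norm as deprecated: the function may be removed in a future PyTorch release, and the docstring recommends torch.linalg.norm, torch.linalg.vector_norm, or torch.linalg.matrix_norm instead. The $l_{2,1}$ norm can be computed in two steps by controlling the dim parameter.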
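As a rough sketch of the two-step $l_{2,1}$ computation, the example below uses the `torch.linalg` functions that the deprecation warning recommends rather than `torch.norm` itself. It assumes a PyTorch version that provides `torch.linalg.vector_norm` and `torch.linalg.matrix_norm`; the variable names are illustrative only.

```python
import torch

c = torch.tensor([[1.0, 2.0, 3.0],
                  [-1.0, 1.0, 4.0]])

# Step 1: l2 norm of every row (reduce over dim=1).
row_l2 = torch.linalg.vector_norm(c, ord=2, dim=1)
# Step 2: l1 norm of the vector of row norms.
l21 = torch.linalg.vector_norm(row_l2, ord=1)
print(row_l2)  # approximately tensor([3.7417, 4.2426])
print(l21)     # approximately tensor(7.9843)

# Matrix norms from section 2 via torch.linalg.matrix_norm.
fro = torch.linalg.matrix_norm(c, ord='fro')               # Frobenius norm
one = torch.linalg.matrix_norm(c, ord=1)                   # max column sum
inf_norm = torch.linalg.matrix_norm(c, ord=float('inf'))   # max row sum
print(fro, one, inf_norm)
```

Because the row norms are non-negative, the second step is equivalent to `row_l2.sum()`.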