evaluation metrics for DL tasks
*** CS230 Section 7 (Week 7): Advanced Evaluation Metrics
- Warmup: Classification and the F1 Score
- Accuracy
- Confusion Matrix
- Precision, Recall, and the F1 Score
- Object Detection: IoU, AP, and mAP
- Intersection over Union (IoU)
- Average Precision (AP): the Area Under Curve (AUC)
- Mean Average Precision (mAP)
- Evaluation Metrics for NLP Tasks
- Evaluation Metrics for GANs
Deep Neural Networks for Regression Problems
Deep Neural Networks for Regression Problems | by Mohammed AL-Ma’amari | Towards Data Science 20180929
Neural Networks for Regression (Part 1)—Overkill or Opportunity? - MissingLink.ai
Computing anchor sizes
If the image is downsampled from 800 px to 50 px, sub_sample equals 16. The sub_sample values for {C1, C2, C3, C4, C5} are {2, 4, 8, 16, 32}, and those for {P2, P3, P4, P5, P6} are {4, 8, 16, 32, 64}.
From
$$h \times w = (base\_size \times scale)^2, \quad \frac{h}{w} = ratio$$
we obtain:
$$h = base\_size \times scale \times \sqrt{ratio}$$
$$w = base\_size \times scale \times \frac{1}{\sqrt{ratio}}$$
- strides=[4, 8, 16, 32, 64]: The strides of anchors in multiple feature levels. This is consistent with the FPN feature strides. The strides are used as base_sizes if base_sizes is not set.
- ratios=[0.5, 1.0, 2.0]: The ratio between height and width.
- scales=[8]: Basic scale of the anchor in a single level. The anchor at each feature-map position has side length scale * base_size before the ratio is applied, so its area is (scale * base_size)^2.
- base_sizes (list[int] | None): The basic sizes of anchors in multiple levels. If None is given, strides are used as base_sizes. (If strides are non-square, the shortest stride is taken.)
anchor_generator=dict(
    type='AnchorGenerator',
    scales=[8],
    ratios=[0.5, 1.0, 2.0],
    strides=[4, 8, 16, 32, 64]),
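The two formulas for h and w can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, not mmdet's AnchorGenerator; the helper name `anchor_sizes` is made up for illustration, and it assumes base_size equals the stride, as in the config above.

```python
import math


def anchor_sizes(base_size, scales, ratios):
    """Return (h, w) for every (ratio, scale) pair at one feature level.

    Hypothetical helper: it applies h = base_size * scale * sqrt(ratio)
    and w = base_size * scale / sqrt(ratio), where ratio = h / w.
    """
    sizes = []
    for ratio in ratios:
        for scale in scales:
            side = base_size * scale  # side length of the ratio-1 anchor
            h = side * math.sqrt(ratio)
            w = side / math.sqrt(ratio)
            sizes.append((h, w))
    return sizes


# Assumption: base_sizes is not set, so each stride is used as base_size.
for stride in [4, 8, 16, 32, 64]:
    for h, w in anchor_sizes(stride, scales=[8], ratios=[0.5, 1.0, 2.0]):
        print(f"stride={stride:2d}  h={h:7.2f}  w={w:7.2f}  area={h * w:9.1f}")
```

For a given stride, all three ratios give the same area, (stride * 8)^2; only the aspect ratio changes.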
Loss functions
Focal Loss
RetinaNet uses a binary cross-entropy function to handle the multi-class case.
Footnote 1 on page 3 of the RetinaNet paper notes: "Extending the focal loss to the multi-class case is straightforward and works well; for simplicity we focus on the binary loss in this work."
mmdet/models/losses/focal_loss.py
import torch.nn.functional as F

# weight_reduce_loss lives in mmdet/models/losses/utils.py
from .utils import weight_reduce_loss


# This method is only for debugging
def py_sigmoid_focal_loss(pred,
                          target,
                          weight=None,
                          gamma=2.0,
                          alpha=0.25,
                          reduction='mean',
                          avg_factor=None):
    """PyTorch version of `Focal Loss <https://arxiv.org/abs/1708.02002>`_.

    Args:
        pred (torch.Tensor): The prediction with shape (N, C), C is the
            number of classes
        target (torch.Tensor): The learning label of the prediction.
        weight (torch.Tensor, optional): Sample-wise loss weight.
        gamma (float, optional): The gamma for calculating the modulating
            factor. Defaults to 2.0.
        alpha (float, optional): A balanced form for Focal Loss.
            Defaults to 0.25.
        reduction (str, optional): The method used to reduce the loss into
            a scalar. Defaults to 'mean'.
        avg_factor (int, optional): Average factor that is used to average
            the loss. Defaults to None.
    """
    pred_sigmoid = pred.sigmoid()
    target = target.type_as(pred)
    # pt is the probability assigned to the wrong class, so easy samples
    # (small pt) are down-weighted by pt ** gamma
    pt = (1 - pred_sigmoid) * target + pred_sigmoid * (1 - target)
    focal_weight = (alpha * target + (1 - alpha) *
                    (1 - target)) * pt.pow(gamma)
    # Element-wise binary cross entropy, scaled by the focal weight
    loss = F.binary_cross_entropy_with_logits(
        pred, target, reduction='none') * focal_weight
    loss = weight_reduce_loss(loss, weight, reduction, avg_factor)
    return loss
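To see what the focal weight does to a single prediction, the same arithmetic can be written for one scalar logit in plain Python. This is a sketch for intuition only: the function name `sigmoid_focal_loss_scalar` is made up here, it is not part of mmdet, and it skips the weighting and reduction steps.

```python
import math


def sigmoid_focal_loss_scalar(logit, target, gamma=2.0, alpha=0.25):
    """Focal loss for one logit and one binary target (0 or 1).

    Hypothetical scalar helper mirroring the tensor code above.
    """
    p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
    # Probability assigned to the wrong class
    pt = (1 - p) * target + p * (1 - target)
    focal_weight = (alpha * target + (1 - alpha) * (1 - target)) * pt ** gamma
    # Plain binary cross entropy on the sigmoid output
    bce = -(target * math.log(p) + (1 - target) * math.log(1 - p))
    return bce * focal_weight


easy = sigmoid_focal_loss_scalar(4.0, 1)   # confident and correct
hard = sigmoid_focal_loss_scalar(-4.0, 1)  # confident and wrong
print(f"easy positive: {easy:.6f}, hard positive: {hard:.6f}")
```

The easy positive contributes far less loss than the hard one, which is the point of the modulating factor.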
Computing input and output sizes of convolution operations
Size computation for nn.Conv2d
$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1) - 1}{stride[0]} + 1 \right\rfloor$$
where dilation[0] is the spacing between kernel elements (default 1).
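The output size along one dimension can be computed in plain Python with no need for PyTorch. This is a sketch: the helper name `conv2d_out_size` is made up for illustration, and it follows the formula documented for torch.nn.Conv2d, including the -1 term in the numerator.

```python
def conv2d_out_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a 2D convolution.

    Hypothetical helper; integer floor division implements the floor.
    """
    numerator = size_in + 2 * padding - dilation * (kernel_size - 1) - 1
    return numerator // stride + 1


# A 7x7 kernel with stride 2 and padding 3 on a 224-pixel side
print(conv2d_out_size(224, kernel_size=7, stride=2, padding=3))
# A 3x3 kernel with padding 1 keeps the size unchanged
print(conv2d_out_size(32, kernel_size=3, padding=1))
```

Apply the function once for the height and once for the width when kernel, stride, or padding differ between the two dimensions.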
Derivative of IoU w.r.t. [t,b,l,r] of pred
$$IoU=\frac{G \cap P}{G \cup P}=\frac{G \cap P}{G + P - G \cap P}=\frac{I}{U}$$
$$\frac{\partial IoU}{\partial p}=\frac{\partial I/\partial p \cdot U - \partial U/\partial p \cdot I}{U^2}, \quad p\in[t,b,l,r]$$
$$\frac{\partial U}{\partial p}=\frac{\partial (G + P - I)}{\partial p}=\partial P/\partial p - \partial I/\partial p, \quad p\in[t,b,l,r]$$
$Ibox$ denotes the intersection box of G and P.
$$\frac{\partial I}{\partial t} =\frac{\partial ((Ibox.r-Ibox.l) \cdot (Ibox.b-Ibox.t))}{\partial t}= \begin{cases} -(Ibox.r-Ibox.l), & \text{if } pred.t > gt.t \\ 0, & \text{else} \end{cases}$$
$$\frac{\partial I}{\partial b} =\frac{\partial ((Ibox.r-Ibox.l) \cdot (Ibox.b-Ibox.t))}{\partial b}= \begin{cases} (Ibox.r-Ibox.l), & \text{if } pred.b < gt.b \\ 0, & \text{else} \end{cases}$$
$$\frac{\partial I}{\partial l} =\frac{\partial ((Ibox.r-Ibox.l) \cdot (Ibox.b-Ibox.t))}{\partial l}= \begin{cases} -(Ibox.b-Ibox.t), & \text{if } pred.l > gt.l \\ 0, & \text{else} \end{cases}$$
$$\frac{\partial I}{\partial r} =\frac{\partial ((Ibox.r-Ibox.l) \cdot (Ibox.b-Ibox.t))}{\partial r}= \begin{cases} (Ibox.b-Ibox.t), & \text{if } pred.r < gt.r \\ 0, & \text{else} \end{cases}$$
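The quotient rule and the piecewise derivatives of I can be checked numerically. The sketch below assumes boxes are given as corner coordinates (t, b, l, r) in image convention, with y growing downward so that t < b; the function name `iou_and_grad` is made up for illustration. The signs of the partial derivatives come from differentiating I = (Ibox.r - Ibox.l) * (Ibox.b - Ibox.t) and P = (r - l) * (b - t) directly.

```python
def iou_and_grad(pred, gt):
    """Return IoU and its gradient w.r.t. pred = (t, b, l, r).

    Hypothetical helper; assumes corner coordinates with t < b and l < r.
    """
    p_t, p_b, p_l, p_r = pred
    g_t, g_b, g_l, g_r = gt
    # Intersection box (Ibox)
    i_t, i_b = max(p_t, g_t), min(p_b, g_b)
    i_l, i_r = max(p_l, g_l), min(p_r, g_r)
    i_w, i_h = i_r - i_l, i_b - i_t
    if i_w <= 0 or i_h <= 0:
        return 0.0, (0.0, 0.0, 0.0, 0.0)  # no overlap, zero gradient
    inter = i_w * i_h
    area_p = (p_r - p_l) * (p_b - p_t)
    area_g = (g_r - g_l) * (g_b - g_t)
    union = area_g + area_p - inter
    # dI/dp is nonzero only when the pred edge is also the Ibox edge
    d_inter = (-i_w if p_t > g_t else 0.0,
               i_w if p_b < g_b else 0.0,
               -i_h if p_l > g_l else 0.0,
               i_h if p_r < g_r else 0.0)
    d_area_p = (-(p_r - p_l), p_r - p_l, -(p_b - p_t), p_b - p_t)
    # dU/dp = dP/dp - dI/dp, then the quotient rule
    grad = tuple((di * union - (dp - di) * inter) / union ** 2
                 for di, dp in zip(d_inter, d_area_p))
    return inter / union, grad


iou, grad = iou_and_grad((1.0, 5.0, 2.0, 6.0), (0.0, 4.0, 0.0, 5.0))
print(iou, grad)
```

Comparing `grad` against central finite differences of the IoU is a quick way to catch a sign mistake in any of the four cases.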
Derivative of DIoU w.r.t. [x,y,w,h] of pred
$$DIoU=IoU - \frac{\rho ^2(G, P)}{c^2}$$
$$\frac{\partial DIoU}{\partial x}=\frac{\partial IoU}{\partial x} - \frac{2\rho}{c}\cdot \frac{(\partial \rho/\partial x \cdot c - \partial c/\partial x \cdot \rho)}{c^2}, \quad x\in[ctr_x,ctr_y,w,h]$$
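A small numerical sketch of DIoU and its derivative follows. It assumes boxes are given as (ctr_x, ctr_y, w, h), that ρ is the distance between the two box centres, and that c is the diagonal of the smallest box enclosing both. The names `diou` and `numeric_grad` are made up for illustration, and the gradient is taken by central finite differences rather than the closed form.

```python
def diou(pred, gt):
    """DIoU for two boxes given as (ctr_x, ctr_y, w, h). Hypothetical helper."""
    def corners(box):
        x, y, w, h = box
        return x - w / 2, y - h / 2, x + w / 2, y + h / 2  # l, t, r, b

    p_l, p_t, p_r, p_b = corners(pred)
    g_l, g_t, g_r, g_b = corners(gt)
    inter_w = max(0.0, min(p_r, g_r) - max(p_l, g_l))
    inter_h = max(0.0, min(p_b, g_b) - max(p_t, g_t))
    inter = inter_w * inter_h
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    # Squared centre distance rho^2 and squared enclosing diagonal c^2
    rho2 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    c2 = (max(p_r, g_r) - min(p_l, g_l)) ** 2 + (max(p_b, g_b) - min(p_t, g_t)) ** 2
    return inter / union - rho2 / c2


def numeric_grad(pred, gt, eps=1e-6):
    """Central-difference gradient of DIoU w.r.t. (ctr_x, ctr_y, w, h)."""
    grad = []
    for k in range(4):
        hi, lo = list(pred), list(pred)
        hi[k] += eps
        lo[k] -= eps
        grad.append((diou(hi, gt) - diou(lo, gt)) / (2 * eps))
    return grad


pred, gt = (1.0, 1.0, 2.0, 2.0), (5.0, 1.0, 2.0, 2.0)
print(diou(pred, gt), numeric_grad(pred, gt))
```

The two boxes in the usage example do not overlap, so IoU and its gradient are zero, yet the DIoU gradient w.r.t. ctr_x is still nonzero because of the centre-distance term.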
$$l \leftarrow l - \eta \cdot \nabla l = l - \eta \cdot \frac{\partial IoU}{\partial l}$$
$$r \leftarrow r - \eta \cdot \nabla r = r - \eta \cdot \frac{\partial IoU}{\partial r}$$
Since $x=\frac{(l + r)}{2}$, it follows that
$$x \leftarrow x - \eta \cdot \nabla x = x - \eta \cdot \frac{(\nabla l + \nabla r)}{2}$$
Example: given $z=f(l,r,t,b)=g(x,y,w,h)$, $\frac{\partial z}{\partial l}=L$, $\frac{\partial z}{\partial r}=R$, and $x=\frac{l+r}{2}$, find $\frac{\partial z}{\partial x}=?$
Here the relation between z and (l, r) and the relation between z and x are given by two different functions, $f(\cdot)$ and $g(\cdot)$, each with its own set of independent variables. The simple single-path chain rule through x alone therefore does not apply, and the following relations do not hold:
$$\frac{\partial z}{\partial l}=\frac{\partial z}{\partial x}\cdot \frac{\partial x}{\partial l}, \quad \frac{\partial z}{\partial r}=\frac{\partial z}{\partial x}\cdot \frac{\partial x}{\partial r}$$
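A numerical check makes the example concrete. The sketch below assumes the usual conversion w = r - l, so that l = x - w/2 and r = x + w/2; the text does not state this, and the test function `f` is made up. Under that assumption, holding w fixed gives ∂l/∂x = ∂r/∂x = 1, so the multivariable chain rule gives ∂z/∂x = L + R, while the single-path relation through x alone fails.

```python
def f(l, r):
    """Hypothetical smooth test function of the two box edges."""
    return l * l + 3.0 * l * r + 2.0 * r


def g(x, w):
    """The same quantity z, expressed through centre x and width w.

    Assumes l = x - w / 2 and r = x + w / 2.
    """
    return f(x - w / 2.0, x + w / 2.0)


def central_diff(fn, args, k, eps=1e-6):
    """Central-difference partial derivative of fn w.r.t. argument k."""
    hi, lo = list(args), list(args)
    hi[k] += eps
    lo[k] -= eps
    return (fn(*hi) - fn(*lo)) / (2 * eps)


l, r = 1.0, 4.0
x, w = (l + r) / 2.0, r - l
L = central_diff(f, (l, r), 0)      # dz/dl
R = central_diff(f, (l, r), 1)      # dz/dr
dz_dx = central_diff(g, (x, w), 0)  # dz/dx with w held fixed

print(f"L = {L:.4f}, R = {R:.4f}, dz/dx = {dz_dx:.4f}, L + R = {L + R:.4f}")
print(f"dz/dx * dx/dl = {dz_dx * 0.5:.4f} (compare with L)")
```

Which variables are held fixed matters here: ∂z/∂x only has a meaning once the other arguments of g (w in this sketch) are fixed.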
How to find the overlapping region of multiple rectangles
python - Efficient way to find overlapping of N rectangles - Stack Overflow
Finding the area of intersection of multiple overlapping rectangles in Python - Stack Overflow
Optimization: what is an efficient algorithm for finding overlapping rectangle regions - ITranslater
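For the simplest reading of the problem, the region shared by all N axis-aligned rectangles, one pass over the boxes is enough. The sketch below is not taken from the linked posts; the function name `common_overlap` and the (l, t, r, b) box format are assumptions. Finding all pairwise overlaps or the area of the union is a different problem that needs a sweep-line or similar approach.

```python
def common_overlap(rects):
    """Return the box (l, t, r, b) shared by all rectangles, or None.

    Hypothetical helper; each rectangle is (l, t, r, b) with l < r and t < b.
    """
    rects = list(rects)
    if not rects:
        return None
    # The common region starts at the largest left/top edge
    # and ends at the smallest right/bottom edge.
    left = max(rect[0] for rect in rects)
    top = max(rect[1] for rect in rects)
    right = min(rect[2] for rect in rects)
    bottom = min(rect[3] for rect in rects)
    if left >= right or top >= bottom:
        return None  # at least one pair does not overlap
    return (left, top, right, bottom)


print(common_overlap([(0, 0, 4, 4), (1, 1, 5, 5), (2, 0, 6, 3)]))
```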
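Topics to study
- Random walk
ScienceNet: [Repost] Random Walk - Zhang Wei's blog post 20200828
Random walk algorithms on graphs II: frustrated random walks - Zhihu 20210410 - xxx
Topics to think about and understand
- How weight initialization and regularization methods (dropout, BN) affect model performance;
- How to measure a network model's inference speed and memory consumption;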
Bold in math formulas: \textbf{} or {\bf memory}
Bold italic in math formulas: \bm{}