SMALE Summer Boot Camp, Day 5
Morning:
Write out $\mathbf{U}$, $\mathbf{C}$, $\mathbf{D}$, and $\mathbf{V}$ from Definition 4 above. Note: the last two attributes are decision attributes.
$\mathbf{U} = \{x_1, x_2, \dots, x_7\}$
$\mathbf{C}$ = {Headache, Temperature, Lymphocyte, Leukocyte, Eosinophil}
$\mathbf{D}$ = {Heartbeat, Flu}
$\mathbf{V}$ = {Normal, Abnormal, Yes, No}
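The four sets above can be written down directly in code, which makes it easy to check that a table really fits them. The following is only a minimal sketch: the function name `is_consistent` and the dictionary layout of `table` are assumptions made for illustration, and the table is filled with placeholder values, not the data from the original example.

```python
# Sets taken from the answer above.
U = [f"x{i}" for i in range(1, 8)]  # x1 .. x7
C = ["Headache", "Temperature", "Lymphocyte", "Leukocyte", "Eosinophil"]
D = ["Heartbeat", "Flu"]  # the last two attributes are decision attributes
V = {"Normal", "Abnormal", "Yes", "No"}


def is_consistent(U, C, D, V, table):
    """Check a table against (U, C, D, V).

    `table` maps each instance to a dict {attribute: value}.
    This layout is a hypothetical choice, not part of the definition.
    """
    if set(C) & set(D):
        return False  # condition and decision attributes must not overlap
    attributes = set(C) | set(D)
    for x in U:
        row = table.get(x)
        if row is None or set(row) != attributes:
            return False  # every instance needs a value for every attribute
        if any(value not in V for value in row.values()):
            return False  # every value must come from V
    return True


# Placeholder table: NOT the real data, just one value repeated everywhere.
table = {x: {a: "Yes" for a in C + D} for x in U}
print(is_consistent(U, C, D, V, table))
```

Replacing any entry with a value outside $\mathbf{V}$ makes the check fail, which is the point of listing $\mathbf{V}$ explicitly.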
Define a label distribution system:
A label distribution system is a tuple $S = (\mathbf{X}, \mathbf{Y})$, where $\mathbf{X} = [x_{ij}]_{n \times m} \in \mathbb{R}^{n \times m}$ is the data matrix, $\mathbf{Y} = [y_{ik}]_{n \times l} \in [0, 1]^{n \times l}$ is the label matrix, and $\sum_{k = 1}^l y_{ik} = 1$, where $n$ is the number of instances, $m$ is the number of features, and $l$ is the number of labels.
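The definition imposes three constraints: the two matrices have the same number of rows, every label entry lies in $[0, 1]$, and each row of the label matrix sums to 1. Here is a minimal sketch of a checker in plain Python. The function name `is_label_distribution_system`, the tolerance `tol`, and the sample numbers are all assumptions made for illustration.

```python
import math


def is_label_distribution_system(X, Y, tol=1e-9):
    """Return True if (X, Y) satisfies the definition above.

    X is an n x m list of rows (data matrix).
    Y is an n x l list of rows (label matrix).
    `tol` absorbs floating-point rounding in the row sums (an assumption).
    """
    n = len(X)
    if n == 0 or len(Y) != n:
        return False  # both matrices must describe the same n instances
    m, l = len(X[0]), len(Y[0])
    if any(len(row) != m for row in X):
        return False  # X must be rectangular: n x m
    for row in Y:
        if len(row) != l:
            return False  # Y must be rectangular: n x l
        if any(y < 0.0 or y > 1.0 for y in row):
            return False  # every y_ik must lie in [0, 1]
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False  # each row of Y must sum to 1
    return True


# Made-up example with n = 3 instances, m = 2 features, l = 3 labels.
X = [[5.1, 3.5], [6.2, 2.9], [4.7, 3.2]]
Y = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.5, 0.5, 0.0]]
print(is_label_distribution_system(X, Y))
```

The row sums are compared with a tolerance rather than with `==`, because sums such as 0.7 + 0.2 + 0.1 are not exactly 1 in floating-point arithmetic.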