1. Following the definition of a decision table, write out $\mathbf{U}$, $\mathbf{C}$, $\mathbf{D}$, $\mathbf{V}$ and $I$ for the table below. Note: the last two attributes are decision attributes.
Patient | Headache | Temperature | Lymphocyte | Leukocyte | Eosinophil | Heartbeat | Flu |
---|---|---|---|---|---|---|---|
$x_1$ | Yes | High | High | High | High | Normal | Yes |
$x_2$ | Yes | High | Normal | High | High | Abnormal | Yes |
$x_3$ | Yes | High | High | High | Normal | Abnormal | Yes |
$x_4$ | No | High | Normal | Normal | Normal | Normal | No |
$x_5$ | Yes | Normal | Normal | Low | High | Abnormal | No |
$x_6$ | Yes | Normal | Low | High | Normal | Abnormal | No |
$x_7$ | Yes | Low | Low | High | Normal | Normal | Yes |
Answer: the instance set is $\mathbf{U} = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7\}$; the condition attribute set is $\mathbf{C} = \{\textrm{Headache, Temperature, Lymphocyte, Leukocyte, Eosinophil}\}$; the decision attribute set is $\mathbf{D} = \{\textrm{Heartbeat, Flu}\}$; the value domain is $\mathbf{V} = \{\textrm{No, Yes, High, Normal, Low, Abnormal}\}$; and the information function $I: \mathbf{U} \times (\mathbf{C} \cup \mathbf{D}) \to \mathbf{V}$ is given by the two-dimensional table above.
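As a quick sanity check on the answer, the five components can be read off the table programmatically. The following is only a minimal sketch: the names `ATTRIBUTES` and `ROWS` are illustrative, the value is spelled `Abnormal` throughout, and the last two columns are taken as decision attributes as stated in the question.

```python
# Columns of the table in order; the last two are the decision attributes.
ATTRIBUTES = ["Headache", "Temperature", "Lymphocyte", "Leukocyte",
              "Eosinophil", "Heartbeat", "Flu"]

# One entry per patient, values listed in the same order as ATTRIBUTES.
ROWS = {
    "x1": ["Yes", "High", "High", "High", "High", "Normal", "Yes"],
    "x2": ["Yes", "High", "Normal", "High", "High", "Abnormal", "Yes"],
    "x3": ["Yes", "High", "High", "High", "Normal", "Abnormal", "Yes"],
    "x4": ["No", "High", "Normal", "Normal", "Normal", "Normal", "No"],
    "x5": ["Yes", "Normal", "Normal", "Low", "High", "Abnormal", "No"],
    "x6": ["Yes", "Normal", "Low", "High", "Normal", "Abnormal", "No"],
    "x7": ["Yes", "Low", "Low", "High", "Normal", "Normal", "Yes"],
}

U = list(ROWS)          # instance set
C = ATTRIBUTES[:-2]     # condition attributes
D = ATTRIBUTES[-2:]     # decision attributes


def I(x, a):
    """Information function: the value of attribute a on instance x."""
    return ROWS[x][ATTRIBUTES.index(a)]


# The value domain is the set of all values that I takes on U x (C ∪ D).
V = {I(x, a) for x in U for a in C + D}

print(sorted(V))  # ['Abnormal', 'High', 'Low', 'No', 'Normal', 'Yes']
```

Storing the table once and deriving $\mathbf{V}$ from `I` keeps the components consistent with each other: a misspelled cell would show up as an unexpected extra element of `V`.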
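2. Define a label distribution system, i.e., one in which each label value is not 0/1 but a real number in the interval [0, 1], and the labels of each object sum to 1.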
Answer: A label distribution system is a tuple $S = (\mathbf{X}, \mathbf{Y})$, where $\mathbf{X} = [x_{ij}]_{n \times m} \in \mathbb{R}^{n \times m}$ is the data matrix and $\mathbf{Y} = [y_{ik}]_{n \times l} \in [0, 1]^{n \times l}$ is the label matrix satisfying $\sum_{k=1}^{l} y_{ik} = 1$ for every $i \in \{1, \dots, n\}$. Here $n$ is the number of instances, $m$ is the number of features, and $l$ is the number of labels.
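The two constraints on the label matrix (entries in [0, 1], each row summing to 1) can be checked directly. The sketch below is illustrative only: the function names `is_label_distribution` and `normalize_rows` and the tolerance `tol` are assumptions introduced here, and the matrix is represented as a plain list of rows.

```python
import math


def is_label_distribution(Y, tol=1e-9):
    """Return True if every entry of Y is in [0, 1] and every row sums to 1."""
    for row in Y:
        if any(y < 0.0 or y > 1.0 for y in row):
            return False
        # Compare with a small tolerance to absorb floating-point rounding.
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False
    return True


def normalize_rows(scores):
    """Turn rows of non-negative scores into distributions by dividing by the row sum."""
    result = []
    for row in scores:
        total = sum(row)
        if total <= 0 or any(s < 0 for s in row):
            raise ValueError("each row needs non-negative scores with a positive sum")
        result.append([s / total for s in row])
    return result


Y = [[0.5, 0.3, 0.2],
     [0.0, 1.0, 0.0]]
print(is_label_distribution(Y))              # True
print(is_label_distribution([[0.5, 0.6]]))   # False: the row sums to 1.1
print(normalize_rows([[2, 1, 1]]))           # [[0.5, 0.25, 0.25]]
```

Note that a single-label 0/1 matrix with exactly one 1 per row is a special case that also passes the check, which matches the definition.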