1. MLE
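Assume a simple Bayesian network model. Given observed values of its variables (for example, A and B as the parents of a node C), the goal is to estimate each node's CPT (conditional probability table).
With maximum likelihood estimation (MLE), this takes two steps, both of which appear in the following excerpt: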
    state_counts = self.state_counts(node)

    # if a column contains only `0`s (no states observed for some configuration
    # of parents' states) fill that column uniformly instead
    state_counts.loc[:, (state_counts == 0).all()] = 1

    parents = sorted(self.model.get_parents(node))
    parents_cardinalities = [len(self.state_names[parent]) for parent in parents]
    node_cardinality = len(self.state_names[node])

    # Get the state names for the CPD
    state_names = {node: list(state_counts.index)}
    if parents:
        state_names.update(
            {
                state_counts.columns.names[i]: list(state_counts.columns.levels[i])
                for i in range(len(parents))
            }
        )

    cpd = TabularCPD(
        node,
        node_cardinality,
        np.array(state_counts),
        evidence=parents,
        evidence_card=parents_cardinalities,
        state_names={var: self.state_names[var] for var in chain([node], parents)},
    )
    cpd.normalize()
    return cpd
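(1) Count how often each combination of values occurs.
For example, for every combination of values of A and B, list how many times each value of C was observed.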
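A minimal sketch of this counting step, using only the Python standard library. The sample data, the binary state space, and the helper name `count_states` are all assumptions made for illustration; they are not part of the excerpt above.

```python
from collections import Counter
from itertools import product

# Hypothetical observations of (A, B, C); the values are made up for illustration.
samples = [
    (0, 0, 0), (0, 0, 0), (0, 0, 1),
    (0, 1, 1), (0, 1, 1),
    (1, 0, 0), (1, 0, 1),
]

def count_states(samples, states=(0, 1)):
    """Return {(a, b): [count of C == c for c in states]} for every (A, B) combination."""
    raw = Counter(samples)  # a Counter returns 0 for triples that never occur
    return {
        (a, b): [raw[(a, b, c)] for c in states]
        for a, b in product(states, repeat=2)
    }

counts = count_states(samples)
for parents, row in counts.items():
    print(parents, row)
```

In this made-up data the combination A=1, B=1 never occurs, so its counts are all zero. That is the case the excerpt's comment about columns containing only `0`s is concerned with.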
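(2) Normalize the counts.
Counts are not probabilities, so for each combination of parent values the counts are divided by their total, which puts every entry in the range 0 to 1 and makes each conditional distribution sum to 1.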
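A minimal sketch of the normalization step, again in plain Python. The counts and the helper name `normalize_counts` are hypothetical. Following the comment in the excerpt above, a parent combination with no observations is filled uniformly before dividing.

```python
# Hypothetical counts of C == 0 and C == 1 for each (A, B) combination.
counts = {
    (0, 0): [2, 1],
    (0, 1): [0, 2],
    (1, 0): [1, 1],
    (1, 1): [0, 0],  # this combination was never observed
}

def normalize_counts(counts):
    """Turn each list of counts into a conditional distribution P(C | A, B)."""
    cpt = {}
    for parents, row in counts.items():
        if sum(row) == 0:
            # No observations: set every count to 1, which yields a uniform distribution.
            row = [1] * len(row)
        total = sum(row)
        cpt[parents] = [n / total for n in row]
    return cpt

cpt = normalize_counts(counts)
for parents, probs in cpt.items():
    print(parents, probs)
```

Each resulting list is the maximum likelihood estimate of P(C | A, B) for one parent combination, that is, the count of each value of C divided by the total count for that combination.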