Official site
GitHub
https://github.com/pgmpy/pgmpy#installation
Three ways to install
Using conda:
$ conda install -c ankurankan pgmpy
Using pip:
$ pip install -r requirements.txt # or requirements-dev.txt if you want to run unittests
$ pip install pgmpy
Or for installing the latest codebase:
$ git clone https://github.com/pgmpy/pgmpy
$ cd pgmpy/
$ pip install -r requirements.txt
$ python setup.py install
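Whichever method is used, it helps to confirm that the package is visible to the interpreter before moving on. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is made up for this note and is not part of pgmpy.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# pgmpy plus a few of the dependencies it pulls in.
for name in ("pgmpy", "networkx", "numpy", "pandas"):
    print(name, "found" if is_installed(name) else "missing")
```

A "missing" line usually means the install went into a different environment than the one running the script.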
Reading the documentation
Example 01
### Code for creating the model
# Start by defining the network structure.
from pgmpy.models import BayesianModel

cancer_model = BayesianModel([('Pollution', 'Cancer'),
                              ('Smoker', 'Cancer'),
                              ('Cancer', 'Xray'),
                              ('Cancer', 'Dyspnoea')])

# Now define the parameters (one CPD per node; each column must sum to 1).
from pgmpy.factors.discrete import TabularCPD

cpd_poll = TabularCPD(variable='Pollution', variable_card=2,
                      values=[[0.9], [0.1]])
cpd_smoke = TabularCPD(variable='Smoker', variable_card=2,
                       values=[[0.3], [0.7]])
cpd_cancer = TabularCPD(variable='Cancer', variable_card=2,
                        values=[[0.03, 0.05, 0.001, 0.02],
                                [0.97, 0.95, 0.999, 0.98]],
                        evidence=['Smoker', 'Pollution'],
                        evidence_card=[2, 2])
cpd_xray = TabularCPD(variable='Xray', variable_card=2,
                      values=[[0.9, 0.2], [0.1, 0.8]],
                      evidence=['Cancer'], evidence_card=[2])
cpd_dysp = TabularCPD(variable='Dyspnoea', variable_card=2,
                      values=[[0.65, 0.3], [0.35, 0.7]],
                      evidence=['Cancer'], evidence_card=[2])

# Associate the parameters with the model structure.
cancer_model.add_cpds(cpd_poll, cpd_smoke, cpd_cancer, cpd_xray, cpd_dysp)

# Check that the CPDs are valid for the model.
print(cancer_model.check_model())

# List the independencies implied by the graph structure.
print(cancer_model.get_independencies())
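To see what these CPDs encode without relying on pgmpy, the joint distribution can be enumerated by hand. The sketch below is plain Python and makes one assumption that should be checked against the pgmpy documentation: that the columns of `cpd_cancer` follow the `evidence` list with the last variable changing fastest. The function name `joint` is introduced here only for illustration.

```python
from itertools import product

# CPD values copied from the example; index 0 / 1 is the state of each variable.
p_poll = [0.9, 0.1]
p_smoke = [0.3, 0.7]
# Assumed column order for evidence=['Smoker', 'Pollution']:
# (S0,P0), (S0,P1), (S1,P0), (S1,P1), i.e. the last variable changes fastest.
p_cancer = [[0.03, 0.05, 0.001, 0.02],
            [0.97, 0.95, 0.999, 0.98]]
p_xray = [[0.9, 0.2], [0.1, 0.8]]    # rows: Xray state, columns: Cancer state
p_dysp = [[0.65, 0.3], [0.35, 0.7]]  # rows: Dyspnoea state, columns: Cancer state


def joint(p, s, c, x, d):
    """Probability of one full assignment, using the network's factorization."""
    return (p_poll[p] * p_smoke[s] * p_cancer[c][2 * s + p]
            * p_xray[x][c] * p_dysp[d][c])


states = [0, 1]
total = sum(joint(*a) for a in product(states, repeat=5))

# Marginal P(Cancer = c): sum the joint over every other variable.
p_c = [sum(joint(p, s, c, x, d) for p, s, x, d in product(states, repeat=4))
       for c in states]

print("sum of joint:", round(total, 10))
print("P(Cancer):", [round(v, 5) for v in p_c])

# Xray and Dyspnoea should be independent given Cancer:
# P(x, d | c) must equal P(x | c) * P(d | c) for every combination.
for c, x, d in product(states, repeat=3):
    p_xd_c = sum(joint(p, s, c, x, d)
                 for p, s in product(states, repeat=2)) / p_c[c]
    assert abs(p_xd_c - p_xray[x][c] * p_dysp[d][c]) < 1e-12
```

The final loop checks numerically one of the statements that `get_independencies()` derives from the graph alone: once Cancer is known, Xray carries no extra information about Dyspnoea.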
Example 02
# Bayesian Estimator
In [2]:
>>> import pandas as pd
>>> from pgmpy.models import BayesianModel
>>> from pgmpy.estimators import BayesianEstimator
>>> data = pd.DataFrame(data={'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]})
>>> data.head()
Out[2]:
   A  B  C
0  0  0  1
1  0  1  1
2  1  0  0
In [3]:
>>> model = BayesianModel([('A', 'C'), ('B', 'C')])
>>> estimator = BayesianEstimator(model, data)
>>> cpd_C = estimator.estimate_cpd('C', prior_type="dirichlet", pseudo_counts=[1, 2])
>>> print(cpd_C)
╒══════╤══════╤══════╤══════╤════════════════════╕
│ A    │ A(0) │ A(0) │ A(1) │ A(1)               │
├──────┼──────┼──────┼──────┼────────────────────┤
│ B    │ B(0) │ B(1) │ B(0) │ B(1)               │
├──────┼──────┼──────┼──────┼────────────────────┤
│ C(0) │ 0.25 │ 0.25 │ 0.5  │ 0.3333333333333333 │
├──────┼──────┼──────┼──────┼────────────────────┤
│ C(1) │ 0.75 │ 0.75 │ 0.5  │ 0.6666666666666666 │
╘══════╧══════╧══════╧══════╧════════════════════╛
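The numbers in this table can be reproduced by hand: for each parent configuration, add the pseudo counts to the observed counts of C and normalize. The sketch below does this in plain Python, assuming the same pseudo counts `[1, 2]` are applied to every (A, B) column, which is consistent with the printed output. The helper `estimate_c` is a name made up for this note.

```python
from collections import Counter
from itertools import product

# The three samples from the example, as (A, B, C) tuples.
samples = [(0, 0, 1), (0, 1, 1), (1, 0, 0)]
pseudo_counts = [1, 2]  # prior counts for C=0 and C=1, as passed to estimate_cpd

counts = Counter(samples)


def estimate_c(a, b):
    """Estimate P(C | A=a, B=b) as (observed count + pseudo count), normalized."""
    smoothed = [counts[(a, b, c)] + pseudo_counts[c] for c in (0, 1)]
    total = sum(smoothed)
    return [n / total for n in smoothed]


cpd = {(a, b): estimate_c(a, b) for a, b in product((0, 1), repeat=2)}
for parents, probs in cpd.items():
    print(parents, probs)
```

The column for A(1), B(1) has no observations at all, so its estimate is just the normalized prior, 1/3 and 2/3.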
In [4]:
>>> print(estimator.get_parameters(prior_type='BDeu', equivalent_sample_size=5))
[<TabularCPD representing P(A:2) at 0x7f86a42021d0>, <TabularCPD representing P(C:2 | A:2, B:2) at 0x7f86a4202940>, <TabularCPD representing P(B:2) at 0x7f86a42026d8>]
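The output above only shows the CPD objects, not their values. As a rough illustration of what a BDeu prior does, the sketch below computes P(A) by hand. It assumes the standard BDeu definition, in which the equivalent sample size is spread evenly over all cells of a CPD (pseudo count = equivalent sample size / (node cardinality × number of parent configurations)); whether the installed pgmpy version matches this exactly should be verified against its documentation.

```python
from collections import Counter

a_values = [0, 0, 1]         # column A of the example data
equivalent_sample_size = 5
cardinality = 2              # A has two states
parent_configs = 1           # A has no parents in the model

# Assumed BDeu rule: every cell of the CPD gets the same pseudo count.
alpha = equivalent_sample_size / (cardinality * parent_configs)

counts = Counter(a_values)
total = len(a_values) + alpha * cardinality
p_a = [(counts[state] + alpha) / total for state in range(cardinality)]
print(p_a)
```

With only three samples, the prior (2.5 pseudo counts per state) pulls the estimate noticeably toward uniform compared with the raw frequencies 2/3 and 1/3.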
Error log
When using BeliefPropagation, calling query raised an error. The cause was a networkx version mismatch; the versions pgmpy expects are listed in the
requirements.txt on GitHub:
networkx==1.11
numpy==1.11.3
scipy==0.18.1
pandas==0.19.2
pyparsing==2.2
wrapt==1.10.8
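A quick way to spot this kind of mismatch is to compare the installed versions with the pins. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the pins are copied from the listing above, and `check_pins` is a helper name introduced for this note, not something pgmpy provides.

```python
from importlib import metadata

# Pins copied from the requirements.txt listing above.
PINNED = {
    "networkx": "1.11",
    "numpy": "1.11.3",
    "scipy": "0.18.1",
    "pandas": "0.19.2",
    "pyparsing": "2.2",
    "wrapt": "1.10.8",
}


def check_pins(pinned, get_version=metadata.version):
    """Return {package: (expected, installed)} for every mismatch.

    A package that is not installed is reported with installed=None.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


for name, (expected, installed) in check_pins(PINNED).items():
    print(f"{name}: expected {expected}, found {installed}")
```

The comparison is an exact string match, so it flags any difference from the pinned version, including newer releases that may still work.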