IEEE Transactions on Knowledge and Data Engineering 2017/09
Graph
- node classification
- node clustering
- node retrieval & recommendation
- link prediction
problems:
- computation
- space
Graph Embedding
graph classifications:
- homogeneous graph
a homogeneous graph $G=(V,E)$ is a graph in which $|T^v|=|T^e|=1$, where $T^v$ and $T^e$ are the sets of node types and edge types. All nodes in G belong to a single type and all edges belong to a single type
- heterogeneous graph
a heterogeneous graph $G=(V,E)$ is a graph in which $|T^v|>1$ and/or $|T^e|>1$
- attribute graph
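The type-count definitions above can be checked mechanically. The sketch below assumes a hypothetical representation in which every node and every edge carries a type label; the function name and the toy graphs are illustrative only.

```python
def graph_kind(node_types, edge_types):
    """Classify a graph by its number of distinct node and edge types.

    node_types: dict mapping node -> type label (assumed representation)
    edge_types: dict mapping (u, v) -> type label (assumed representation)
    """
    n_node_types = len(set(node_types.values()))  # |T^v|
    n_edge_types = len(set(edge_types.values()))  # |T^e|
    if n_node_types == 1 and n_edge_types == 1:
        return "homogeneous"
    if n_node_types > 1 or n_edge_types > 1:
        return "heterogeneous"
    return "undefined"  # no typed nodes or no typed edges to classify


# Toy social graph: one node type, one edge type.
friends = graph_kind({"a": "user", "b": "user"}, {("a", "b"): "follows"})
# Toy shopping graph: two node types.
shop = graph_kind({"a": "user", "i": "item"}, {("a", "i"): "buys"})
```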
graph embedding
- parts of the graph
- whole graph
input graph
- homogeneous graph
- heterogeneous graph
- graph with auxiliary information
- graph constructed from non-relational data
embedding output
- node embedding
- edge embedding
- hybrid embedding
- whole-graph embedding
IEEE Transactions on Pattern Analysis and Machine Intelligence 2017/05
python library: GEM (Graph Embedding Methods)
https://github.com/palash1992/GEM
challenges
- choice of property
- scalability
- dimensionality of the embedding
methods
- factorization
- random walk
- deep learning
applications
- network compression
- visualization
- clustering
- link prediction
- node classification
future work
- exploring non-linear models
- studying evolution of networks
- generate synthetic networks with real-world characteristics
Advances in Neural Information Processing Systems 26 (NIPS 2013)
TransE
Multi Relational Data
Proceedings of The Twenty-Eighth AAAI Conference on Artificial Intelligence
TransH
interprets a relation as a translating operation on a hyperplane. Each relation is characterized by two vectors:
the normal vector ( $w_r$ ) of the hyperplane and the translation vector ( $d_r$ ) on the hyperplane
Reducing false negative labels
give a higher chance of replacing the head entity if the relation is 1-N
give a higher chance of replacing the tail entity if the relation is N-1
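A minimal sketch of this sampling rule. The TransH paper sets the head-replacement probability to tph / (tph + hpt), where tph is the average number of tails per head and hpt the average number of heads per tail for the relation; that formula comes from the paper, not from these notes. The function names and the toy triples are hypothetical.

```python
import random
from collections import defaultdict


def head_replace_prob(triples, relation):
    """Probability of corrupting the head for `relation`: tph / (tph + hpt).

    Assumes at least one triple uses `relation`.
    """
    tails_of = defaultdict(set)  # head -> tails seen with this relation
    heads_of = defaultdict(set)  # tail -> heads seen with this relation
    for h, r, t in triples:
        if r == relation:
            tails_of[h].add(t)
            heads_of[t].add(h)
    tph = sum(len(ts) for ts in tails_of.values()) / len(tails_of)  # tails per head
    hpt = sum(len(hs) for hs in heads_of.values()) / len(heads_of)  # heads per tail
    return tph / (tph + hpt)


def corrupt(triple, entities, p_head, rng=random):
    """Replace the head with probability p_head, otherwise replace the tail."""
    h, r, t = triple
    if rng.random() < p_head:
        return (rng.choice([e for e in entities if e != h]), r, t)
    return (h, r, rng.choice([e for e in entities if e != t]))


# Toy 1-N relation: one head, three tails.
triples = [("france", "has_city", c) for c in ("paris", "lyon", "nice")]
p = head_replace_prob(triples, "has_city")
```

For a 1-N relation, a randomly chosen tail is likely to be another true tail, so corrupting the head more often lowers the chance of generating a false negative.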
Relation Fact Extraction
text side extraction model
knowledge graph embedding
Proceedings of The Twenty-Ninth AAAI Conference on Artificial Intelligence
TransR CTransR
TransE: 1-1
$f_r(h,t)=\|h+r-t\|_2^2$
TransH: 1-N, N-1, and N-N relations
project head entities and tail entities to a hyper-plane
$f_r(h,t)=\|h_\perp+r-t_\perp\|_2^2$
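The two scores can be compared on a toy example. This sketch assumes the standard TransH projection $v_\perp = v - (w_r^\top v)\,w_r$ with a unit normal vector $w_r$; the vectors and function names are made up for illustration.

```python
def sq_norm(v):
    """Squared L2 norm."""
    return sum(x * x for x in v)


def transe_score(h, r, t):
    """TransE distance ||h + r - t||^2 (lower means more plausible)."""
    return sq_norm([hi + ri - ti for hi, ri, ti in zip(h, r, t)])


def project(v, w):
    """Project v onto the hyperplane with unit normal w: v - (w . v) w."""
    dot = sum(wi * vi for wi, vi in zip(w, v))
    return [vi - dot * wi for vi, wi in zip(v, w)]


def transh_score(h, d_r, t, w_r):
    """TransH distance: the TransE distance between the projected entities."""
    return transe_score(project(h, w_r), d_r, project(t, w_r))


h, t = [1.0, 2.0, 0.0], [1.0, 2.0, 5.0]
w_r = [0.0, 0.0, 1.0]  # unit normal: the hyperplane is the x-y plane
d_r = [0.0, 0.0, 0.0]  # translation vector lying in the hyperplane

plain = transe_score(h, d_r, t)           # entities differ along the normal
on_plane = transh_score(h, d_r, t, w_r)   # the difference is projected away
```

Here the two entities differ only along the normal direction, so they are far apart for TransE but coincide after projection. This is how one relation can relate several distinct entities to the same counterpart.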
Unstructured Model
Structured Embedding
Single Layer Model
Latent Factor Model
Neural Tensor Network
TransR:
entities and relations in distinct spaces
projection matrix $M_r\in\mathbb{R}^{k\times d}$
$h_r=hM_r$
$t_r=tM_r$
$f_r(h,t)=\|h_r+r-t_r\|_2^2$
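A small sketch of the TransR score, treating entities as row vectors of length k and the projection matrix as k x d, as in the formulas above. The matrix and vectors are toy values chosen so the arithmetic is easy to follow.

```python
def vec_mat(v, m):
    """Row vector (length k) times matrix (k x d) -> vector of length d."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def transr_score(h, r, t, m_r):
    """||h M_r + r - t M_r||^2, computed in the relation space."""
    h_r, t_r = vec_mat(h, m_r), vec_mat(t, m_r)
    return sum((a + b - c) ** 2 for a, b, c in zip(h_r, r, t_r))


# k = 3 entity dimensions, d = 2 relation dimensions.
m_r = [[1.0, 0.0],
       [0.0, 1.0],
       [0.0, 0.0]]  # this toy projection drops the third entity coordinate
h = [1.0, 1.0, 9.0]
t = [2.0, 3.0, -4.0]
r = [1.0, 2.0]

score = transr_score(h, r, t, m_r)
```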
CTransR:
piecewise linear regression
$h_{r,c}=hM_r$
$t_{r,c}=tM_r$
$f_r(h,t)=\|h_{r,c}+r_c-t_{r,c}\|_2^2+\alpha\|r_c-r\|_2^2$
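The extra term in the CTransR score keeps each cluster-specific vector $r_c$ close to the shared relation vector $r$. A toy sketch, with made-up values and an identity projection so that only the regularizer contributes:

```python
def vec_mat(v, m):
    """Row vector (length k) times matrix (k x d) -> vector of length d."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def ctransr_score(h, t, m_r, r_c, r, alpha):
    """||h M_r + r_c - t M_r||^2 + alpha * ||r_c - r||^2."""
    h_rc, t_rc = vec_mat(h, m_r), vec_mat(t, m_r)
    translation = sum((a + b - c) ** 2 for a, b, c in zip(h_rc, r_c, t_rc))
    drift = sum((a - b) ** 2 for a, b in zip(r_c, r))  # distance of r_c from r
    return translation + alpha * drift


identity = [[1.0, 0.0], [0.0, 1.0]]
score = ctransr_score(
    h=[0.0, 0.0], t=[1.0, 1.0], m_r=identity,
    r_c=[1.0, 1.0],  # cluster vector fits this (h, t) pair exactly
    r=[1.0, 0.0],    # shared relation vector
    alpha=0.5,
)
```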
Objective function (margin-based ranking loss)
$L=\sum_{(h,r,t)\in S}\sum_{(h',r,t')\in S'}\max\bigl(0,\ f_r(h,t)+\gamma-f_r(h',t')\bigr)$
$\gamma$ is the margin
$S$ is the set of correct triples and $S'$ is the set of incorrect triples
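A minimal sketch of the margin loss. It assumes each correct triple is paired with one sampled corrupted triple, a common simplification of the double sum, and it takes precomputed scores rather than embeddings. The numbers are illustrative.

```python
def margin_ranking_loss(pos_scores, neg_scores, gamma):
    """Sum of max(0, f(pos) + gamma - f(neg)) over paired correct/corrupted triples."""
    return sum(max(0.0, p + gamma - n) for p, n in zip(pos_scores, neg_scores))


# Pair 1: the corrupted triple is already further away than the margin -> no loss.
# Pair 2: the corrupted triple is only 0.5 further away -> loss of 0.5.
loss = margin_ranking_loss(pos_scores=[0.5, 2.0], neg_scores=[3.0, 2.5], gamma=1.0)
```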
learning process
SGD: stochastic gradient descent
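One SGD update can be sketched for the simplest case, the TransE distance with a single correct/corrupted pair. The gradient of $\|h+r-t\|_2^2$ is $2(h+r-t)$ with respect to $h$ and $r$, and its negation with respect to $t$. The embedding table, names, and learning rate are hypothetical, and the norm constraints used in the papers are omitted.

```python
def residual(emb, triple):
    """h + r - t for a (head, relation, tail) triple of embedding keys."""
    h, r, t = triple
    return [a + b - c for a, b, c in zip(emb[h], emb[r], emb[t])]


def score(emb, triple):
    return sum(x * x for x in residual(emb, triple))


def sgd_step(emb, pos, neg, gamma=1.0, lr=0.1):
    """One update on max(0, f(pos) + gamma - f(neg)); returns the loss before it."""
    loss = max(0.0, score(emb, pos) + gamma - score(emb, neg))
    if loss == 0.0:
        return loss  # margin already satisfied: zero gradient
    grads = {}
    for triple, sign in ((pos, 1.0), (neg, -1.0)):
        h, r, t = triple
        res = residual(emb, triple)
        for name, direction in ((h, 1.0), (r, 1.0), (t, -1.0)):
            g = grads.setdefault(name, [0.0] * len(res))
            for i, x in enumerate(res):
                g[i] += sign * direction * 2.0 * x
    for name, g in grads.items():
        emb[name] = [w - lr * gi for w, gi in zip(emb[name], g)]
    return loss


emb = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0], "rel": [0.0, 0.0]}
pos, neg = ("a", "rel", "b"), ("a", "rel", "c")
before = sgd_step(emb, pos, neg)
after = max(0.0, score(emb, pos) + 1.0 - score(emb, neg))
```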
Link Prediction
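replace the head or tail entity of each test triple with every entity in turn and rank the candidates in descending order of plausibility (for the distance scores of TransE, TransH, and TransR, ascending $f_r$)
1> mean rank of the correct entities (lower is better)
2> proportion of correct entities ranked in the top 10 (Hits@10, higher is better)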
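Both metrics are easy to compute once each test triple has a rank. The sketch assumes a distance-style score where a lower value means a more plausible candidate; the candidate scores and ranks are made-up values.

```python
def rank_of(correct, scores):
    """1-based rank of `correct` when candidates are sorted by ascending score."""
    ordered = sorted(scores, key=scores.get)
    return ordered.index(correct) + 1


def mean_rank_and_hits(ranks, k=10):
    """Return (mean rank, fraction of ranks that fall within the top k)."""
    mean_rank = sum(ranks) / len(ranks)
    hits = sum(1 for r in ranks if r <= k) / len(ranks)
    return mean_rank, hits


# Scores for one test triple with the tail replaced by each candidate.
candidate_scores = {"paris": 0.1, "lyon": 0.4, "nice": 0.9}
lyon_rank = rank_of("lyon", candidate_scores)

# Ranks of the correct entity over four hypothetical test triples.
mean_rank, hits_at_10 = mean_rank_and_hits([1, 2, 30, 7])
```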
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing
TransD
each named symbol object (entity or relation) is represented by two vectors. The first captures the meaning of the entity (relation); the second is used to construct the mapping matrices
Mapping matrices:
$M_{rh}=r_p h_p^\top+I^{m\times n}$
$M_{rt}=r_p t_p^\top+I^{m\times n}$
Projected vectors:
$h_\perp=M_{rh}h$
$t_\perp=M_{rt}t$
Score function:
$f_r(h,t)=-\|h_\perp+r-t_\perp\|_2^2$
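Because each mapping matrix is a rank-one term plus an identity, the projection can be computed without materializing the matrix: $(r_p e_p^\top + I^{m\times n})\,e = r_p\,(e_p^\top e) + I^{m\times n}e$. The sketch below assumes entities live in $\mathbb{R}^n$ and relations in $\mathbb{R}^m$, with each entity projected by its own mapping matrix. All values are toy numbers.

```python
def transd_project(e, e_p, r_p):
    """Compute (r_p e_p^T + I^{m x n}) e without building the m x n matrix."""
    m, n = len(r_p), len(e)
    dot = sum(a * b for a, b in zip(e_p, e))            # e_p^T e
    ident = [e[i] if i < n else 0.0 for i in range(m)]  # I^{m x n} e
    return [r_p[i] * dot + ident[i] for i in range(m)]


def transd_score(h, h_p, r, r_p, t, t_p):
    """-||h_perp + r - t_perp||^2 (higher means more plausible)."""
    h_perp = transd_project(h, h_p, r_p)
    t_perp = transd_project(t, t_p, r_p)
    return -sum((a + b - c) ** 2 for a, b, c in zip(h_perp, r, t_perp))


# n = 3 entity dimensions, m = 2 relation dimensions.
h, h_p = [1.0, 0.0, 2.0], [1.0, 0.0, 0.0]
t, t_p = [2.0, 3.0, 0.0], [0.0, 0.0, 0.0]
r, r_p = [1.0, 2.0], [0.0, 1.0]

h_perp = transd_project(h, h_p, r_p)
score = transd_score(h, h_p, r, r_p, t, t_p)
```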
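Bayesian methods
Parameter learning (network structure known)
- maximum likelihood estimation
- Bayesian estimation
Structure learning (network structure unknown)
Bayesian networks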
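Bayes' theorem
Let H=h be a set of hypotheses and E=e a set of evidence. $P(H=h)$ is the prior probability and $P(H=h \mid E=e)$ is the posterior probability
$P(H=h \mid E=e)=\frac{P(H=h)\,P(E=e \mid H=h)}{P(E=e)}$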
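A worked instance of the theorem, where the evidence probability $P(E=e)$ is obtained by summing over all hypotheses. The diagnostic-test numbers are invented for illustration.

```python
def posterior(prior, likelihood):
    """P(H=h | E=e) for every hypothesis h.

    prior: dict h -> P(H=h)
    likelihood: dict h -> P(E=e | H=h) for the observed evidence e
    """
    evidence = sum(prior[h] * likelihood[h] for h in prior)  # P(E=e)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}


# Hypothetical test: 1% prevalence, 90% sensitivity, 5% false-positive rate.
post = posterior(
    prior={"disease": 0.01, "healthy": 0.99},
    likelihood={"disease": 0.9, "healthy": 0.05},
)
```

Even with a positive result the posterior probability of disease stays near 15%, because the prior is so small.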
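Bayesian network
Nodes represent random variables, and edges between nodes represent direct dependencies between variables. Each node carries a probability distribution: a root node X carries its marginal distribution $P(X)$, and a non-root node X carries its conditional distribution $P(X \mid \pi(X))$, where $\pi(X)$ denotes the parents of X
A Bayesian network is a representation of a factorization of the joint probability distribution.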
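The factorization can be made concrete: the joint probability of a full assignment is the product of each node's distribution given its parents. The three-node chain and its probabilities below are hypothetical (S: smoking, L: lung cancer, X: positive X-ray), and all variables are assumed binary.

```python
from itertools import product

# Parents of each node, listed in topological order.
parents = {"S": (), "L": ("S",), "X": ("L",)}

# cpt[var][parent values] = P(var=True | parents); invented numbers.
cpt = {
    "S": {(): 0.3},
    "L": {(True,): 0.1, (False,): 0.01},
    "X": {(True,): 0.9, (False,): 0.2},
}


def joint(assignment):
    """P(assignment) as the product of P(node | parents) over all nodes."""
    p = 1.0
    for var, pa in parents.items():
        p_true = cpt[var][tuple(assignment[q] for q in pa)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


p_all_true = joint({"S": True, "L": True, "X": True})  # 0.3 * 0.1 * 0.9

# Sanity check: the factorized joint sums to 1 over all 2^3 assignments.
total = sum(
    joint(dict(zip(parents, values)))
    for values in product([True, False], repeat=len(parents))
)
```

The full joint table over three binary variables has 7 free parameters; this factorization needs only 5 (1 + 2 + 2), and the saving grows quickly with more variables.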
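Different variable orderings lead to different Bayesian network structures, and the resulting networks differ in complexity. How should a suitable ordering be chosen?
- use causal relations (causal Markov assumption): smoking (S) → lung cancer (L)
- use model complexity as the criterion
- use the ease of assessing the conditional probabilities as the criterion
Ways to reduce the number of parameters:
- causal independence mechanisms
- context-specific independence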
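Applications of Bayesian networks
- medical diagnosis
- industrial fault diagnosis
- financial analysis
- computing: spam filtering, fault diagnosis, machine learning, coding theory…
- military applications: target recognition, training simulation…
- ecology
- agriculture and animal husbandry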
Naive Bayes model
Local independence: the attribute variables $A_i$ are mutually independent given the class variable C
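Under this independence assumption the class posterior is proportional to $P(C)\prod_i P(A_i \mid C)$. A minimal sketch with a toy spam filter; the two binary attributes and all probabilities are invented.

```python
def naive_bayes_posterior(prior, cond, observed):
    """P(C | A_1..A_n) under the naive Bayes assumption.

    prior: dict c -> P(C=c)
    cond: dict c -> list of dicts, cond[c][i][a] = P(A_i=a | C=c)
    observed: tuple of observed attribute values (a_1, ..., a_n)
    """
    unnorm = {}
    for c, p_c in prior.items():
        p = p_c
        for i, a in enumerate(observed):
            p *= cond[c][i][a]  # attributes multiply independently given C
        unnorm[c] = p
    z = sum(unnorm.values())
    return {c: p / z for c, p in unnorm.items()}


# Attribute 0: message contains "offer"; attribute 1: message contains "meeting".
prior = {"spam": 0.4, "ham": 0.6}
cond = {
    "spam": [{True: 0.8, False: 0.2}, {True: 0.1, False: 0.9}],
    "ham": [{True: 0.1, False: 0.9}, {True: 0.5, False: 0.5}],
}
post = naive_bayes_posterior(prior, cond, observed=(True, False))
```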
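Tree-augmented naive Bayes model (TAN model)
The parent set $\pi(A_i)$ of an attribute variable $A_i$ contains the class variable C and may also contain other attribute variables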
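Dynamic Bayesian network
A dynamic Bayesian network is a pair $(\beta_0,\beta_\rightarrow)$. $\beta_0$ is a standard Bayesian network.
$\beta_\rightarrow$ defines the conditional distribution between the variables of two adjacent time slices, namely
$P(Z_t \mid Z_{t-1})=\prod_{i=1}^{n}P(Z_t^i \mid \pi(Z_t^i))$
where $Z_t^i$ is node i at time t and $\pi(Z_t^i)$ denotes the parents of $Z_t^i$.
In $\beta_\rightarrow$, the nodes of the first time slice need not carry parameters; every node of the second time slice has a conditional probability distribution $P(Z_t^i \mid \pi(Z_t^i))$, $t>0$.
Hidden Markov model
Kalman filter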
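A hidden Markov model is the simplest instance of the two-slice structure: one hidden node per slice whose parent is the previous hidden node, plus one observed child. The sketch shows a single filtering step (predict with the transition model, then weight by the observation likelihood and normalize). The rain/umbrella states and probabilities are illustrative.

```python
def forward_step(belief, transition, emission, obs):
    """Update P(state_t | observations so far) with one new observation.

    belief: dict z -> P(z_{t-1} | earlier observations)
    transition: dict z_prev -> dict z -> P(z | z_prev)
    emission: dict z -> dict obs -> P(obs | z)
    """
    # Predict: push the belief through the slice-to-slice distribution.
    predicted = {
        z: sum(belief[zp] * transition[zp][z] for zp in belief) for z in belief
    }
    # Correct: weight by the observation likelihood, then normalize.
    unnorm = {z: predicted[z] * emission[z][obs] for z in predicted}
    total = sum(unnorm.values())
    return {z: p / total for z, p in unnorm.items()}


transition = {"rain": {"rain": 0.7, "dry": 0.3}, "dry": {"rain": 0.3, "dry": 0.7}}
emission = {
    "rain": {"umbrella": 0.9, "none": 0.1},
    "dry": {"umbrella": 0.2, "none": 0.8},
}
belief = forward_step({"rain": 0.5, "dry": 0.5}, transition, emission, "umbrella")
```

A Kalman filter follows the same predict/correct pattern with linear-Gaussian variables instead of discrete tables.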
Graph separation and variable independence
serial, diverging, and converging connections
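The three connection types follow the standard d-separation rules for whether information can flow through the middle node of X - Z - Y. The function below is a small illustrative encoding of those rules, not a full d-separation algorithm for arbitrary graphs.

```python
def path_active(connection, middle_observed, descendant_observed=False):
    """Can information flow between X and Y through the middle node Z?

    serial:     X -> Z -> Y   (blocked when Z is observed)
    diverging:  X <- Z -> Y   (blocked when Z is observed)
    converging: X -> Z <- Y   (blocked unless Z or a descendant of Z is observed)
    """
    if connection in ("serial", "diverging"):
        return not middle_observed
    if connection == "converging":
        return middle_observed or descendant_observed
    raise ValueError(f"unknown connection type: {connection}")


# Two independent causes become dependent once their common effect is observed.
explaining_away = path_active("converging", middle_observed=True)
# A chain is cut by observing its middle node.
chain_blocked = not path_active("serial", middle_observed=True)
```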