Graph Definition
Data graph
A data graph is a directed graph G = (V, E, fA, fC), where (1) V is a finite set of nodes; (2) E ⊆ V × V is a finite set of edges, in which (v, v') denotes an edge from node v to node v'; (3) fA is a function defined on V such that for each node v in V, fA(v) is a tuple (A1 = a1, ..., An = an), where each Ai = ai (i ∈ [1..n]) states that node v has the constant value ai for the attribute Ai, written v.Ai = ai; and (4) fC is a function defined on E such that for each edge e in E, fC(e) is a color symbol from a finite alphabet Σ.
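As an illustration of the four components, the following is a minimal sketch of a data graph in Python. The class name `DataGraph`, its field and method names, and the sample nodes, attributes, and color are all hypothetical and chosen only for this example; the definition itself does not prescribe any representation.

```python
from dataclasses import dataclass, field


@dataclass
class DataGraph:
    """Sketch of G = (V, E, fA, fC); all names here are illustrative."""
    nodes: set = field(default_factory=set)   # V
    edges: set = field(default_factory=set)   # E, a subset of V x V
    f_a: dict = field(default_factory=dict)   # fA: node -> {attribute: constant}
    f_c: dict = field(default_factory=dict)   # fC: edge -> color symbol

    def add_node(self, v, **attrs):
        """Add node v with its attribute tuple (A1 = a1, ..., An = an)."""
        self.nodes.add(v)
        self.f_a[v] = dict(attrs)

    def add_edge(self, v, w, color):
        """Add the edge (v, w) and record its color fC((v, w))."""
        if v not in self.nodes or w not in self.nodes:
            raise ValueError("both endpoints must be nodes in V")
        self.edges.add((v, w))
        self.f_c[(v, w)] = color

    def attr(self, v, name):
        """Return v.Ai, the constant stored for attribute `name` on node v."""
        return self.f_a[v][name]


# Hypothetical sample data.
g = DataGraph()
g.add_node("v1", job="doctor", city="Paris")
g.add_node("v2", job="biologist", city="Lyon")
g.add_edge("v1", "v2", color="fa")
```

Storing fA and fC as dictionaries keyed by node and by edge keeps the sketch close to the functional notation of the definition.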
RDF data graph
Definition 1: A data graph G is a tuple (V, L, E) where
• V is a finite set of vertices. Thereby, V is conceived
as the disjoint union VE ⊎ VC ⊎ VV with E-vertices VE
(representing entities), C-vertices VC (classes), and
V-vertices VV (data values).
• L is a finite set of edge labels, subdivided by
L = LR ⊎ LA ⊎ {type, subclass}, where LR represents inter-entity
edges and LA stands for entity-attribute assignments.
• E is a finite set of edges of the form e(v1, v2) with
v1, v2 ∈ V and e ∈ L. Moreover, the following restrictions
apply:
– e ∈ LR if and only if v1, v2 ∈ VE,
– e ∈ LA if and only if v1 ∈ VE and v2 ∈ VV,
– e = type if and only if v1 ∈ VE and v2 ∈ VC, and
– e = subclass if and only if v1, v2 ∈ VC.
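The four restrictions can be read as a typing rule for edges. The sketch below checks a single edge against them; the function name `edge_is_valid` and the sample vertices and labels are hypothetical. It assumes that VE, VC, and VV are pairwise disjoint and that LR, LA, and {type, subclass} are pairwise disjoint, so that testing the endpoint kinds for the given label covers both directions of each "if and only if".

```python
def edge_is_valid(label, v1, v2, VE, VC, VV, LR, LA):
    """Check one edge label(v1, v2) against the four restrictions."""
    if label in LR:                       # inter-entity edge
        return v1 in VE and v2 in VE
    if label in LA:                       # entity-attribute assignment
        return v1 in VE and v2 in VV
    if label == "type":                   # entity is an instance of a class
        return v1 in VE and v2 in VC
    if label == "subclass":               # class hierarchy
        return v1 in VC and v2 in VC
    return False                          # label does not belong to L


# Hypothetical sample vocabulary.
VE = {"alice", "bob"}
VC = {"Person", "Agent"}
VV = {"30"}
LR = {"knows"}
LA = {"age"}

ok_edges = [
    ("knows", "alice", "bob"),
    ("age", "alice", "30"),
    ("type", "alice", "Person"),
    ("subclass", "Person", "Agent"),
]
bad_edges = [
    ("knows", "alice", "30"),     # LR edge must end in an entity
    ("age", "alice", "bob"),      # LA edge must end in a data value
    ("type", "Person", "Agent"),  # type must start at an entity
]
```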
Probabilistic RDF data graph
A probabilistic RDF data graph G ∈ g is represented by a triple (V(G), E(G), S(G)), where
• V(G) is a finite set of vertices vi with possible labels l(vi);
• E(G) is a finite set of directed edges eij with labels l(eij);
• S(G) is a finite set of conditional probability labels (CPLs) T(vi | pa(vi)), associated with the vertices vi ∈ V(G), which give the probability that vi takes a label l(vi), given that the vertices vj ∈ pa(vi) take some labels l(vj). Here pa(vi) is the set of parent vertices that point to vi via directed edges.
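To make the role of pa(vi) and T(vi | pa(vi)) concrete, here is a small sketch that stores one table per vertex, keyed by the labels of its parents. The vertex names, labels, probabilities, and the helper names `pa`, `T`, and `rows_are_normalized` are all invented for illustration. The normalization check assumes that each table row lists every possible label of the vertex, which the definition above does not state explicitly.

```python
import math

# E(G): directed edges (source, target) -> edge label (hypothetical data).
edges = {("v1", "v2"): "worksFor"}

# Possible labels l(vi) for each vertex (hypothetical data).
labels = {"v1": ["Alice", "Alicia"], "v2": ["ACME", "Acme Corp"]}


def pa(v):
    """Parent vertices of v: sources of edges pointing to v, in sorted order."""
    return sorted(src for (src, dst) in edges if dst == v)


# S(G): for each vertex, a table keyed by the tuple of parent labels
# (in pa(v) order), giving a probability for each label of the vertex.
cpl = {
    "v1": {(): {"Alice": 0.7, "Alicia": 0.3}},
    "v2": {
        ("Alice",): {"ACME": 0.9, "Acme Corp": 0.1},
        ("Alicia",): {"ACME": 0.4, "Acme Corp": 0.6},
    },
}


def T(v, label, parent_labels):
    """Look up T(v | pa(v)) for `label`, given a dict of the parents' labels."""
    key = tuple(parent_labels[p] for p in pa(v))
    return cpl[v][key][label]


def rows_are_normalized():
    """Assumption: each row is a full distribution, so it should sum to 1."""
    return all(
        math.isclose(sum(row.values()), 1.0)
        for table in cpl.values()
        for row in table.values()
    )
```

For example, `T("v2", "ACME", {"v1": "Alicia"})` looks up the probability that v2 takes the label ACME when its only parent v1 takes the label Alicia.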
Keyword-query-based RDF query
Definition 2: A conjunctive query is an expression of
the form (x1, ..., xk).∃xk+1, ..., xm.A1 ∧ ... ∧ Ar, where
x1, ..., xk are called distinguished variables (those which will
be bound to yield an answer), xk+1, ..., xm are undistinguished
variables (existentially quantified), and A1, ..., Ar are
query atoms. These atoms are of the form P(v1, v2), where
P is called a predicate and v1, v2 are variables or constants.
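A conjunctive query of this form can be held in a small data structure. The sketch below is one possible encoding, not part of the definition: the class names `Var`, `Atom`, and `ConjunctiveQuery`, and the sample predicates and constants, are hypothetical. The example query is (x).∃y.type(x, Person) ∧ knows(x, y) ∧ name(y, Bob).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A query variable; anything that is not a Var is treated as a constant."""
    name: str


@dataclass(frozen=True)
class Atom:
    """A query atom P(v1, v2)."""
    predicate: str
    arg1: object
    arg2: object


@dataclass(frozen=True)
class ConjunctiveQuery:
    distinguished: tuple   # x1, ..., xk
    atoms: tuple           # A1, ..., Ar

    def undistinguished(self):
        """Variables occurring in the atoms that are not distinguished,
        listed in order of first occurrence."""
        found = []
        for atom in self.atoms:
            for term in (atom.arg1, atom.arg2):
                if (isinstance(term, Var)
                        and term not in self.distinguished
                        and term not in found):
                    found.append(term)
        return tuple(found)


x, y = Var("x"), Var("y")
q = ConjunctiveQuery(
    distinguished=(x,),
    atoms=(
        Atom("type", x, "Person"),
        Atom("knows", x, y),
        Atom("name", y, "Bob"),
    ),
)
```

Here the undistinguished variables are derived from the atoms rather than stored, which reflects that every variable not listed as distinguished is existentially quantified.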
RDF query mapping in Data Graph
Definition 3: Given a data graph G = (V, L, E) and a
conjunctive query q, let Var_d (resp. Var_u) denote the set of
distinguished (resp. undistinguished) variables occurring in q.
Then a mapping μ : Var_d → V from the query's distinguished
variables to the vertices of G is called an answer to q
if there is a mapping ν : Var_u → V from q's undistinguished
variables to the vertices of G such that the function
μ' : Var_d ∪ Var_u ∪ V → V
v ↦ μ(v) if v ∈ Var_d
v ↦ ν(v) if v ∈ Var_u
v ↦ v if v ∈ V
satisfies P(μ'(v1), μ'(v2)) ∈ E for every atom P(v1, v2) in q.
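The definition can be checked directly by enumeration on a small graph. The following brute-force sketch tries every assignment of the distinguished variables and accepts it if some assignment of the undistinguished variables makes all atoms edges of G. It is meant only to illustrate the definition, not as an efficient evaluation strategy. The function name `answers`, the encoding of edges as (label, v1, v2) triples, the "?"-prefixed variable names, and the sample data are assumptions of this example; variable names are assumed to be distinct from vertex names.

```python
from itertools import product


def answers(V, E, distinguished, undistinguished, atoms):
    """Return each mapping mu (a dict over the distinguished variables)
    for which a suitable mapping nu of the undistinguished variables exists."""
    vertices = sorted(V)
    results = []
    for mu_vals in product(vertices, repeat=len(distinguished)):
        mu = dict(zip(distinguished, mu_vals))
        for nu_vals in product(vertices, repeat=len(undistinguished)):
            nu = dict(zip(undistinguished, nu_vals))

            def mu_prime(term):
                # Combined mapping: variables via mu or nu, vertices to themselves.
                if term in mu:
                    return mu[term]
                if term in nu:
                    return nu[term]
                return term

            if all((p, mu_prime(a), mu_prime(b)) in E for (p, a, b) in atoms):
                results.append(mu)
                break  # one witness nu is enough for this mu
    return results


# Hypothetical sample graph.
V = {"alice", "bob", "carol", "Person"}
E = {
    ("type", "alice", "Person"),
    ("type", "bob", "Person"),
    ("knows", "alice", "bob"),
    ("knows", "carol", "bob"),
}

# Query: (?x).∃?y.type(?x, Person) ∧ knows(?x, ?y)
result = answers(
    V, E,
    distinguished=["?x"],
    undistinguished=["?y"],
    atoms=[("type", "?x", "Person"), ("knows", "?x", "?y")],
)
```

On this sample data, only alice is both typed as Person and the source of a knows edge, so `result` contains the single mapping of ?x to alice.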
Graph pattern query
Using RQs as building blocks, we next define graph pattern queries.
A graph pattern query (PQ) is a directed graph Qp = (Vp, Ep, fv, fe), where (1) Vp is a finite set of nodes; (2) Ep ⊆ Vp × Vp is a finite set of edges, where (u, u') denotes an edge from node u to node u'; and (3) the functions fv and fe are defined on Vp and Ep, respectively, such that for each edge e = (u, u') ∈ Ep, Qr = (u, u', fv(u), fv(u'), fe(e)) is an RQ.