RDF data

Latest recommended article published on 2017-12-18 15:56:07

diamondzdd

Tags: graph variables processing search algorithm query

Article link: https://blog.csdn.net/diamond_zengdadan/article/details/7357976

Keyword Search in RDF Graphs

Keywords as elements of structured queries

Instead of presenting the top-k answers, which might actually belong to many distinct queries, we let the user select one of the top-k queries to retrieve all its answers.

Thus, the keyword search process contains an additional step, namely the presentation of structured queries.

Algorithms for subgraph exploration

A novel algorithm for the computation of the top-k subgraphs.

In current approaches, keywords are exclusively mapped to vertices. To connect the vertices corresponding to keywords, current algorithms compute tree-shaped candidate networks or answer trees. Since keywords do not necessarily correspond to answers in our approach, they might also be mapped to edges. As a consequence, the substructures connecting keyword elements are not restricted to trees, but can be graphs in general. Thus, algorithms designed for tree search are not sufficient.
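To make the idea of mapping keywords to edges as well as vertices concrete, here is a minimal sketch. It assumes the data is a list of (subject, predicate, object) triples and that a keyword matches an element when it occurs as a substring of the element's name. The function name `map_keywords` and the substring matching are illustrative assumptions, not the matching procedure of the paper.

```python
def map_keywords(keywords, triples):
    """Map each keyword to the vertices and edge labels whose name contains it."""
    matches = {}
    for kw in keywords:
        needle = kw.lower()
        hits = set()
        for s, p, o in triples:
            # A keyword may match an edge label ...
            if needle in p.lower():
                hits.add(("edge", p))
            # ... or a vertex at either end of the edge.
            for v in (s, o):
                if needle in v.lower():
                    hits.add(("vertex", v))
        matches[kw] = hits
    return matches


# Hypothetical example data.
triples = [
    ("ex:pub1", "author", "ex:alice"),
    ("ex:pub1", "year", "2006"),
]
keyword_elements = map_keywords(["author", "2006"], triples)
```

Here the keyword "author" is mapped to an edge and "2006" to a vertex, so the structure connecting the two keyword elements is not simply a tree over vertices.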

Efficient and complete Top-k through Graph summarization

Algorithms for top-k retrieval assume that the computed substructures connecting keyword elements are trees with distinct roots. Because bookkeeping the information required for top-k processing is difficult and expensive, existing top-k algorithms cannot guarantee that the results are indeed the top-k subgraphs. To address this, more complete data structures are introduced to keep track of the scores of all explored paths and of all remaining candidates.

A strategy for graph summarization is employed that can substantially reduce the search space. This means that the exploration of subgraphs does not operate on the entire data graph but on a summary containing only the elements that are necessary to compute the queries.
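One plausible way to build such a summary is sketched below. It assumes that every entity is collapsed into its class (via its type edges), that each inter-entity edge is lifted to an edge between the corresponding classes, and that untyped entities fall back to a generic class named "Thing". These choices, and the function name `summarize`, are assumptions made for illustration; the notes above do not spell out the exact construction.

```python
from collections import defaultdict


def summarize(triples, relation_labels):
    """Collapse entities into their classes and keep class-level edges only."""
    classes_of = defaultdict(set)
    for s, p, o in triples:
        if p == "type":
            classes_of[s].add(o)

    summary = set()
    for s, p, o in triples:
        if p == "subclass":
            # The class hierarchy is kept as-is.
            summary.add((s, p, o))
        elif p in relation_labels:
            # Lift an entity-to-entity edge to the classes of its endpoints.
            for cs in classes_of.get(s, {"Thing"}):
                for co in classes_of.get(o, {"Thing"}):
                    summary.add((cs, p, co))
    return summary


# Hypothetical example data: two publications, two authors.
triples = [
    ("ex:pub1", "type", "Publication"),
    ("ex:pub2", "type", "Publication"),
    ("ex:alice", "type", "Person"),
    ("ex:bob", "type", "Person"),
    ("ex:pub1", "author", "ex:alice"),
    ("ex:pub2", "author", "ex:bob"),
]
summary = summarize(triples, relation_labels={"author"})
```

In this example the two entity-level author edges collapse into a single class-level edge, which is the sense in which the summary shrinks the space that has to be explored.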

Problem definition / Graph definition (data structure and storage)

For both the translation of Qu to Qs and the actual processing of Qs, we make use of the data graph G, an RDF data model containing triples.

Definition: A data graph G is a tuple (V, L, E) where

• V is a finite set of vertices. V is conceived as the disjoint union Ve ⊎ Vc ⊎ Vv of E-vertices Ve (representing entities), C-vertices Vc (classes), and V-vertices Vv (data values).

• L is a finite set of edge labels, subdivided as L = Lr ∪ La ∪ {type, subclass}, where Lr represents inter-entity edges and La stands for entity-attribute assignments.

• E is a finite set of edges of the form e(v1, v2) with v1, v2 ∈ V and e ∈ L. Moreover, the following restrictions apply:

- e ∈ Lr if and only if v1, v2 ∈ Ve,

- e ∈ La if and only if v1 ∈ Ve and v2 ∈ Vv,

- e = type if and only if v1 ∈ Ve and v2 ∈ Vc, and

- e = subclass if and only if v1, v2 ∈ Vc.

The two predefined types of edges, i.e. type and subclass, have a special interpretation. The former captures the class membership of an entity and the latter is used to define the class hierarchy. Vertices corresponding to entities are identified by specific IDs, which in the case of RDF data are so-called uniform resource identifiers (URIs). As identifiers for the other elements we use class names, property names, attribute names and values, respectively.
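The definition can be sketched as a small data structure that rejects edges violating the restrictions. The class name `DataGraph`, its field names, and the example identifiers (such as ex:pub1) are hypothetical; the sketch also assumes the three vertex sets are kept disjoint by the caller.

```python
from dataclasses import dataclass, field

TYPE, SUBCLASS = "type", "subclass"


@dataclass
class DataGraph:
    entities: set = field(default_factory=set)          # Ve
    classes: set = field(default_factory=set)           # Vc
    values: set = field(default_factory=set)            # Vv
    relation_labels: set = field(default_factory=set)   # Lr
    attribute_labels: set = field(default_factory=set)  # La
    edges: set = field(default_factory=set)             # E, stored as (label, v1, v2)

    def add_edge(self, label, v1, v2):
        """Add e(v1, v2) only if it satisfies the restrictions of the definition."""
        if label == TYPE:
            ok = v1 in self.entities and v2 in self.classes
        elif label == SUBCLASS:
            ok = v1 in self.classes and v2 in self.classes
        elif label in self.relation_labels:
            ok = v1 in self.entities and v2 in self.entities
        elif label in self.attribute_labels:
            ok = v1 in self.entities and v2 in self.values
        else:
            raise ValueError(f"unknown edge label: {label!r}")
        if not ok:
            raise ValueError(f"edge {label}({v1}, {v2}) violates the restrictions")
        self.edges.add((label, v1, v2))


# Hypothetical example graph.
g = DataGraph(
    entities={"ex:pub1", "ex:alice"},
    classes={"Publication", "Person", "Agent"},
    values={"2006", "Alice"},
    relation_labels={"author"},
    attribute_labels={"year", "name"},
)
g.add_edge("type", "ex:pub1", "Publication")
g.add_edge("author", "ex:pub1", "ex:alice")
g.add_edge("year", "ex:pub1", "2006")
g.add_edge("subclass", "Person", "Agent")
```

An edge such as author(ex:pub1, 2006) would be rejected, because author is in Lr and therefore requires both endpoints to be entities.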

Technically, RDF data is often stored in a relational database. For instance, exactly one relational table with three columns can be used to store entities' properties and attributes (as in Jena, Sesame or Oracle). In this paper, the RDF graph is translated into rows of such a table.
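As an illustration of the three-column layout, the following sketch stores a few triples in an in-memory SQLite table using Python's standard `sqlite3` module. The table name `triples`, the column names, and the example data are assumptions; actual stores such as Jena or Sesame use their own schemas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")

# Hypothetical example data: one row per edge of the data graph.
data = [
    ("ex:pub1", "type", "Publication"),
    ("ex:pub1", "author", "ex:alice"),
    ("ex:pub1", "year", "2006"),
    ("ex:alice", "name", "Alice"),
]
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", data)

# Look up all author edges.
rows = conn.execute(
    "SELECT subject, object FROM triples WHERE predicate = ? ORDER BY subject",
    ("author",),
).fetchall()
```

Every property and attribute becomes a row, so a query over the graph turns into selections and self-joins on this single table.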

Query:

The user query Qu is a set of keywords (k1, k2, …, ki). The system queries Qs are conjunctive queries, defined as follows.

Definition: A conjunctive query is an expression of the form (x1, …, xk).∃xk+1, …, xm.A1 ∧ … ∧ Ar

where x1, …, xk are called distinguished variables (those which will be bound to yield an answer), xk+1, …, xm are undistinguished variables, and A1, …, Ar are query atoms. These atoms are of the form P(v1, v2), where P is called the predicate and v1, v2 are variables or constants.
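To show how such a query yields answers, here is a minimal sketch of a naive evaluator over a list of (subject, predicate, object) triples. It assumes that variables are written as strings starting with "?" and that an atom P(v1, v2) is written as the tuple (P, v1, v2). The function names and the nested-loop strategy are illustrative assumptions, not the query processing used in the paper.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def bind(binding, term, value):
    """Unify one atom argument with a value, extending the binding in place."""
    if is_var(term):
        if term in binding:
            return binding[term] == value
        binding[term] = value
        return True
    return term == value  # constants must match exactly


def evaluate(distinguished, atoms, triples):
    """Return the set of answer tuples for the distinguished variables."""
    bindings = [{}]
    for pred, a, b in atoms:
        extended = []
        for binding in bindings:
            for s, p, o in triples:
                if p != pred:
                    continue
                candidate = dict(binding)
                if bind(candidate, a, s) and bind(candidate, b, o):
                    extended.append(candidate)
        bindings = extended
    return {tuple(b[x] for x in distinguished) for b in bindings}


# Hypothetical example data.
triples = [
    ("ex:pub1", "type", "Publication"),
    ("ex:pub1", "author", "ex:alice"),
    ("ex:alice", "name", "Alice"),
    ("ex:pub2", "type", "Publication"),
    ("ex:pub2", "author", "ex:bob"),
    ("ex:bob", "name", "Bob"),
]
# (x).∃y. type(x, Publication) ∧ author(x, y) ∧ name(y, Alice)
atoms = [
    ("type", "?x", "Publication"),
    ("author", "?x", "?y"),
    ("name", "?y", "Alice"),
]
answers = evaluate(["?x"], atoms, triples)
```

Only the distinguished variable ?x appears in the answers; ?y is bound during evaluation but projected away, which is what makes it undistinguished.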

(Additional note: the difference between a class and an entity type is that classes have both data and behavior, whereas entity types just have data.)

Data Generator

UBA 1.7

Execute the following command in the directory containing the unzipped classes: java edu.lehigh.swat.bench.uba –“folder in which to store the data”

This produced fifteen .owl files.