from:http://blog.csdn.net/silentwolfyh
The GraphComputer
BulkLoaderVertexProgram
———————————————————————————-
The GraphComputer
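TinkerPop3 provides two primary ways of interacting with a graph: online transaction processing (OLTP) and online analytical processing (OLAP).
OLTP (stream processing): OLTP-based graph systems let users query the graph in real time. However, real-time performance is typically possible only for local traversals. A local traversal starts at a particular vertex (or a small set of vertices) and touches a set of connected vertices (via arbitrary paths of arbitrary length). In short, OLTP queries interact with a limited set of data and respond on the order of milliseconds or seconds.
OLAP (batch processing): With OLAP graph processing, on the other hand, the entire graph is processed, so every vertex and edge is analyzed (some more than once, for iterative, recursive algorithms). Because of the amount of data being processed, results are typically not returned in real time, and for massive graphs (for example, graphs represented across a cluster of machines) results can take on the order of minutes or hours.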
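The difference between the two access patterns can be illustrated with a small plain-Java sketch. It deliberately does not use the TinkerPop API: the class `OltpVsOlapSketch` and its methods are hypothetical names chosen for illustration, and the adjacency map only mimics the edges of the "modern" toy graph used later in this post. The first method touches only the neighborhood of one start vertex (the OLTP pattern), while the second visits every vertex and edge (the OLAP pattern).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Plain-Java illustration (not TinkerPop API) of local vs. whole-graph processing. */
final class OltpVsOlapSketch {

    /** Outgoing edges of the "modern" toy graph, keyed by vertex name (assumed layout). */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> out = new HashMap<>();
        out.put("marko", Arrays.asList("vadas", "josh", "lop"));
        out.put("josh", Arrays.asList("ripple", "lop"));
        out.put("peter", Collections.singletonList("lop"));
        out.put("vadas", Collections.<String>emptyList());
        out.put("lop", Collections.<String>emptyList());
        out.put("ripple", Collections.<String>emptyList());
        return out;
    }

    /** OLTP-style: start at one vertex and touch only vertices within maxHops of it. */
    static Set<String> localTraversal(Map<String, List<String>> out, String start, int maxHops) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        for (int hop = 0; hop < maxHops && !frontier.isEmpty(); hop++) {
            Deque<String> next = new ArrayDeque<>();
            for (String vertex : frontier) {
                for (String neighbor : out.getOrDefault(vertex, Collections.<String>emptyList())) {
                    if (visited.add(neighbor)) {
                        next.add(neighbor); // first time we reach this vertex
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }

    /** OLAP-style: every vertex and every edge is visited to compute a global result. */
    static Map<String, Integer> inDegreeOfEveryVertex(Map<String, List<String>> out) {
        Map<String, Integer> inDegree = new HashMap<>();
        for (String vertex : out.keySet()) {
            inDegree.put(vertex, 0);
        }
        for (List<String> targets : out.values()) {
            for (String target : targets) {
                inDegree.merge(target, 1, Integer::sum);
            }
        }
        return inDegree;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = sampleGraph();
        System.out.println(localTraversal(graph, "marko", 1));
        System.out.println(inDegreeOfEveryVertex(graph));
    }
}
```

One hop from marko reaches only vadas, josh and lop, and peter is never touched; the in-degree computation, by contrast, has to scan all six vertices and all six edges. A real OLAP job typically repeats such a full scan over several iterations, which is why its cost grows with the size of the whole graph rather than with the size of the neighborhood.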
BulkLoaderVertexProgram
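BulkLoaderVertexProgram provides a generalized way of loading graphs of any size into a persistent graph. It is especially useful for large graphs, because it can take advantage of the parallel processing offered by GraphComputer instances. The input can be any existing graph database that supports TinkerPop3, or any Hadoop GraphInputFormat. The following example demonstrates how to load data from one TinkerGraph into another TinkerGraph.
Example from the official documentation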
gremlin> writeGraphConf = new BaseConfiguration()
==>org.apache.commons.configuration.BaseConfiguration@18c880ea
gremlin> writeGraphConf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph")
gremlin> writeGraphConf.setProperty("gremlin.tinkergraph.graphFormat", "gryo")
gremlin> writeGraphConf.setProperty("gremlin.tinkergraph.graphLocation", "/tmp/tinkergraph.kryo")
gremlin> modern = TinkerFactory.createModern()
==>tinkergraph[vertices:6 edges:6]
gremlin> blvp = BulkLoaderVertexProgram.build().
                    bulkLoader(OneTimeBulkLoader).
                    writeGraph(writeGraphConf).create(modern)
==>BulkLoaderVertexProgram[bulkLoader=OneTimeBulkLoader, vertexIdProperty=null, userSuppliedIds=false, keepOriginalIds=false, batchSize=0]
gremlin> modern.compute().workers(1).program(blvp).submit().get()
==>result[tinkergraph[vertices:6 edges:6],memory[size:0]]
gremlin> graph = GraphFactory.open(writeGraphConf)
==>tinkergraph[vertices:6 edges:6]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().valueMap()
==>[name:[marko],age:[29]]
==>[name:[vadas],age:[27]]
==>[name:[lop],lang:[java]]
==>[name:[josh],age:[32]]
==>[name:[ripple],lang:[java]]
==>[name:[peter],age:[35]]
gremlin> graph.close()
package com.xiaohui.thegraphcomputer;
import java.util.Map;
import org.apache.commons.configuration.BaseConfiguration;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph;
import org.apache.tinkerpop.gremlin.process.computer.bulkloading.BulkLoaderVertexProgram;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;
/**
 * This example has three steps:
 * 1. Build the graph with TinkerFactory.createModern()
 *    (code path: org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory.createModern()).
 * 2. Save the graph to local disk.
 * 3. Read the saved graph back in from the D:\\tinkergraph.kryo file and walk it with a traversal.
 *
 * @author yuhui
 */
public class BulkLoaderVertexProgramExample {
    public static void main(String[] args) {
        try {
            // Configuration of the target graph: a TinkerGraph persisted as a Gryo file.
            BaseConfiguration writeGraphConf = new BaseConfiguration();
            writeGraphConf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph");
            writeGraphConf.setProperty("gremlin.tinkergraph.graphFormat", "gryo");
            writeGraphConf.setProperty("gremlin.tinkergraph.graphLocation", "D:\\tinkergraph.kryo");

            // 1. Build the graph with TinkerFactory.createModern()
            //    (code path: org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory.createModern()).
            TinkerGraph modern = TinkerFactory.createModern();

            // 2. Save the graph to local disk.
            BulkLoaderVertexProgram blvp = BulkLoaderVertexProgram.build()
                    .bulkLoader("org.apache.tinkerpop.gremlin.process.computer.bulkloading.OneTimeBulkLoader")
                    .writeGraph(writeGraphConf).create(modern);
            modern.compute().workers(1).program(blvp).submit().get();

            // 3. Read the saved graph back in from the D:\\tinkergraph.kryo file and walk it with a traversal.
            Graph graph = GraphFactory.open(writeGraphConf);
            GraphTraversalSource g = graph.traversal();
            for (Map<String, Object> properties : g.V().valueMap().toList()) {
                System.out.println(properties.toString());
            }
            graph.close();
        } catch (Exception e) {
            // Report the failure instead of silently swallowing it.
            e.printStackTrace();
        }
    }
}
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>cn.yh.janusgraph</groupId>
<artifactId>JanusGraph_Project</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>JanusGraph_Project</name>
<dependencies>
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>gremlin-core</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>gremlin-driver</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>tinkergraph-gremlin</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>neo4j-gremlin</artifactId>
<version>3.3.0</version>
</dependency>
<!-- neo4j-tinkerpop-api-impl is NOT Apache 2 licensed - more information below -->
<dependency>
<groupId>org.neo4j</groupId>
<artifactId>neo4j-tinkerpop-api-impl</artifactId>
<version>0.3-2.3.3</version>
</dependency>
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>hadoop-gremlin</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>spark-gremlin</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>giraph-gremlin</artifactId>
<version>3.3.0</version>
</dependency>
</dependencies>
</project>