geomesa-hbase实践
索引概念:geomesa会根据的schema字段属性来创建不同的时间与空间的索引
- Z2[z2]]索引使用的二维Z阶曲线来索引点数据的维度和经度,用于查询空间上的分布没有时间概念
- Z3[z3]索引使用的是三维Z阶曲线来索引点数据的维度经度和时间,具有空间和时间的概念
- XZ2[xz2]索引使用的是XZ-ordering的二维实现来索引非点数据的维度和经度
- XZ3[xz3]索引使用的是XZ-ordering的三维实现来索引非点数据的纬度经度和时间
- Record/ID[id]记录索引使用功能的ID主键,它用于ID的任何查询
- Attribute[attr]属性索引使用的属性值作为主索引键,注意当需要按某个属性查询时该属性需要提前声明(index=true)
schema字段属性参考:https://www.geomesa.org/documentation/user/datastores/attributes.html?highlight=multilinestring
实现
创建schema
- 实现代码
public class CommandDataStore {
public Options createOptions(Param[] parameters) {
Options options = new Options();
for (Param p: parameters) {
if (!p.isDeprecated()) {
Option opt = Option.builder(null)
.longOpt(p.getName())
.argName(p.getName())
.hasArg()
.desc(p.getDescription().toString())
.required(p.isRequired())
.build();
options.addOption(opt);
}
}
return options;
}
public CommandLine parseArgs(Class<?> caller, Options options, String[] args) throws ParseException {
try {
return new DefaultParser().parse(options, args);
} catch (ParseException e) {
System.err.println(e.getMessage());
HelpFormatter formatter = new HelpFormatter();
formatter.printHelp(caller.getName(), options);
throw e;
}
}
public Map<String, String> getDataStoreParams(String catalog) throws ParseException {
String[] args ={"--hbase.catalog", catalog, "--hbase.zookeepers", "localhost:2181"};
Options options = this.createOptions(new HBaseDataStoreFactory().getParametersInfo());
CommandLine command = this.parseArgs(getClass(), options, args);
Map<String, String> params = new HashMap<>();
// noinspection unchecked
for (Option opt: options.getOptions()) {
String value = command.getOptionValue(opt.getLongOpt());
if (value != null) {
params.put(opt.getArgName(), value);
}
}
return params;
}
public DataStore createHbaseDataStore(String catalog) throws IOException, ParseException {
Map<String, String> params = this.getDataStoreParams(catalog);
return DataStoreFinder.getDataStore(params);
}
public static void main(String[] args){
CommandDataStore comm = new CommandDataStore();
try {
DataStore dataStore = comm.createHbaseDataStore("demo");
String spec = "fid:String:index=true,name:String:index=true,type:Integer:index=true," +
"telephone:String,adminCode:Integer:index=true,address:String,*geom:Point:srid=4326";
SimpleFeatureType sft = SimpleFeatureTypes.createType("demo", spec);
dataStore.createSchema(sft);
} catch (IOException e) {
e.printStackTrace();
} catch (ParseException e) {
e.printStackTrace();
}
}
}
- 可以在hbase看到创建结果:z2_v2就是只有空间没有时间的索引
geomesa写入数据
这里用csv格式的文件来作为插入数据的例子,geomesa还支持很多种导入数据格式tsv,geojson,shp等
/**
解析文件
**/
public List<SimpleFeature> getTestData() {
List<SimpleFeature> features = new ArrayList<>();
InputStreamReader isr = null;
String spec = "fid:String:index=true,name:String:index=true,type:Integer:index=true,telephone:String,adminCode:Integer:index=true,address:String,*geom:Point:srid=4326";
SimpleFeatureType sft = SimpleFeatureTypes.createType("demo", spec);
SimpleFeatureBuilder builder = new SimpleFeatureBuilder(sft );
// use apache commons-csv to parse the GDELT file
try {
isr = new InputStreamReader(new FileInputStream(getFilePath()), "GBK");
CSVParser parser = CSVParser.parse(isr, CSVFormat.RFC4180);
for (CSVRecord record : parser) {
System.out.println(record.toString());
try {
builder.set("fid", getFileCode());
builder.set("name", record.get(1));
builder.set("type", record.get(3));
builder.set("telephone", record.get(4).replace("|",";"));
builder.set("adminCode", record.get(5));
builder.set("address", record.get(8));
double latitude = Double.parseDouble(record.get(10));
double longitude = Double.parseDouble(record.get(9));
builder.set("geom", "POINT (" + longitude + " " + latitude + ")");
builder.featureUserData(Hints.USE_PROVIDED_FID, java.lang.Boolean.TRUE);
SimpleFeature feature = builder.buildFeature(RandomHelper.uuid());
features.add(feature);
} catch (Exception e) {
logger.debug("Invalid GDELT record: " + e.toString() + " " + record.toString());
e.printStackTrace();
}
}
} catch (IOException e) {
throw new RuntimeException("Error reading GDELT data:", e);
}finally {
if (isr != null) {
try {
isr.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return features;
}
/**
插入数据
**/
public void writeFeatures(DataStore datastore, SimpleFeatureType sft, List<SimpleFeature> features) throws IOException {
if (features.size() > 0) {
System.out.println("Writing test data");
try (FeatureWriter<SimpleFeatureType, SimpleFeature> writer =
datastore.getFeatureWriterAppend(sft.getTypeName(), Transaction.AUTO_COMMIT)) {
for (SimpleFeature feature : features) {
try {
SimpleFeature toWrite = writer.next();
toWrite.setAttributes(feature.getAttributes());
((FeatureIdImpl) toWrite.getIdentifier()).setID(feature.getID());
toWrite.getUserData().put(Hints.USE_PROVIDED_FID, Boolean.TRUE);
toWrite.getUserData().putAll(feature.getUserData());
writer.write();
}catch (Exception e){
logger.debug("Invalid GDELT record: " + e.toString() + " " + feature.getAttributes());
}
}
}
}
}
值得注意的是geojson和shp文件geotools都提供通用解析方式可以很方便的导入
geomesa空间相关的查询
- CONTAINS(ATTR1, POINT(1 2))
- BBOX(ATTR1, 10,20,30,40)
- DWITHIN(ATTR1, POINT(1 2), 10, kilometers)
- CROSS(ATTR1, LINESTRING(1 2, 10 15))
- INTERSECT(ATTR1, GEOMETRYCOLLECTION (POINT (10 10),POINT (30 30),LINESTRING (15 15, 20 20)) )
- CROSSES(ATTR1, LINESTRING(1 2, 10 15))
- INTERSECTS(ATTR1, GEOMETRYCOLLECTION (POINT (10 10),POINT (30 30),LINESTRING (15 15, 20 20)) )
- 参考:http://docs.geotools.org/latest/userguide/library/cql/cql.html
查询所有数据
public static void main(String[] args){
CommandDataStore comm = new CommandDataStore();
try {
DataStore shapes = comm.createHbaseDataStore("demo");
SimpleFeatureCollection outerFeatures = shapes.getFeatureSource("demo").getFeatures();
SimpleFeatureIterator iterator = outerFeatures.features();
while (iterator.hasNext()) {
SimpleFeature feature = iterator.next();
feature.getProperties().forEach( a ->
System.out.println(a.getName().toString() + " " + a.getValue().toString())
);
}
} catch (IOException e) {
e.printStackTrace();
} catch (ParseException e) {
e.printStackTrace();
}
}
- 使用WITHIN空间查询
String typeName = schema.getTypeName();
//获取geomesa默认的schema空间字段
String geom = schema.getGeometryDescriptor().getLocalName();
//查询点是否包含在POLYGON面里面
//geom为schema字段
Query query = new Query("demo",
ECQL.toFilter(String.format("WITHIN(geom, POLYGON ((%s)))", "1 1 2 2 3 3 1 1")));
FeatureReader<SimpleFeatureType, SimpleFeature> reader =
datastore.getFeatureReader(query, Transaction.AUTO_COMMIT);
while (reader .hasNext()) {
SimpleFeature feature = reader .next();
feature.getProperties().forEach( a ->
System.out.println(a.getName().toString() + " " + a.getValue().toString())
);
}
这里需要注意的是两种读取的api不一样,geotools提供了很多读取方式大家可以任意选择