Building a Secondary Index with HBase (Coprocessor) and Elasticsearch (Complete Edition)

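This article describes how to combine HBase's RegionObserver coprocessor with Elasticsearch (ES) to build a secondary index. It covers the role of RegionObserver, the characteristics of Endpoint coprocessors, and how coprocessors changed in HBase 0.96 and later. It walks through the RegionObserver code and the ways to load a coprocessor: through the configuration file, through shell commands, or through the API. It also discusses how to handle load failures, for example by adjusting the configuration or by unloading and retrying, and mentions the steps for creating the ES index from within the coprocessor.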

Integrating HBase (coprocessor) with ES to build a secondary index

I. Introduction

HBase provides two kinds of coprocessors: Observers and Endpoints.

1.RegionObserver:

Example: when a client issues a get, preGet can be used to enforce access control.

Main methods:
preOpen, postOpen: Called before and after the region is reported as online to the master.

preFlush, postFlush: Called before and after the memstore is flushed into a new store file.

preGet, postGet: Called before and after a client makes a Get request.

preExists, postExists: Called before and after the client tests for existence using a Get.

prePut and postPut: Called before and after the client stores a value.

preDelete and postDelete: Called before and after the client deletes a value.
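
To make the pre/post hook idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the HBase API: `MiniRegion`, `GetObserver`, `AccessCheckObserver`, and the user and row names are all hypothetical stand-ins, chosen only to show how a pre-hook can veto a read and a post-hook can run after it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ObserverHookSketch {

    /** Hypothetical stand-in for the pre/post hooks around a Get. */
    interface GetObserver {
        void preGet(String user, String rowKey);
        void postGet(String user, String rowKey, String value);
    }

    /** Hypothetical stand-in for a region that calls its observer around each read. */
    static class MiniRegion {
        private final Map<String, String> rows = new HashMap<>();
        private final GetObserver observer;

        MiniRegion(GetObserver observer) {
            this.observer = observer;
        }

        void put(String rowKey, String value) {
            rows.put(rowKey, value);
        }

        String get(String user, String rowKey) {
            observer.preGet(user, rowKey);          // may veto the read by throwing
            String value = rows.get(rowKey);
            observer.postGet(user, rowKey, value);  // runs only if the read went through
            return value;
        }
    }

    /** Rejects reads from users outside an allow-list and counts completed reads. */
    static class AccessCheckObserver implements GetObserver {
        private final Set<String> allowedUsers;
        int completedReads = 0;

        AccessCheckObserver(Set<String> allowedUsers) {
            this.allowedUsers = allowedUsers;
        }

        @Override
        public void preGet(String user, String rowKey) {
            if (!allowedUsers.contains(user)) {
                throw new SecurityException(user + " may not read " + rowKey);
            }
        }

        @Override
        public void postGet(String user, String rowKey, String value) {
            completedReads++;
        }
    }

    public static void main(String[] args) {
        AccessCheckObserver observer = new AccessCheckObserver(Collections.singleton("alice"));
        MiniRegion region = new MiniRegion(observer);
        region.put("row1", "v1");

        System.out.println(region.get("alice", "row1"));
        try {
            region.get("bob", "row1");
        } catch (SecurityException e) {
            System.out.println("denied: " + e.getMessage());
        }
    }
}
```

In a real coprocessor the region server invokes the hooks for you; the observer class only overrides the hooks it cares about.
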
2.WALObserver

Provides hooks around writing to and flushing the WAL (write-ahead log). There is only one WAL context per RegionServer.

preWALWrite/postWALWrite: called before and after a WALEdit written to WAL.
3.MasterObserver:

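Provides hooks for checking DDL-style operations such as create, delete, and modify table. For example, when a client deletes a table, the observer can check whether that client has permission to do so. It runs inside the Master process.

preCreateTable/postCreateTable: Called before and after a table is created.

preDeleteTable/postDeleteTable: Called before and after a table is deleted.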
4.Endpoint Coprocessor:

Endpoint processors allow you to perform computation at the location of the data. An example is the need to calculate a running average or summation for an entire table which spans hundreds of regions.

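In contrast to observer coprocessors, where your code is run transparently, endpoint coprocessors must be explicitly invoked using the coprocessorService() method available in Table or HTable.

An endpoint coprocessor needs client-side code that talks to it over RPC in order to collect and merge the per-region results. An observer coprocessor runs only on the server side, and its code is triggered only before or after specific operations.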
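
The collect-and-merge split can be sketched in plain Java. This is not the HBase endpoint API: `aggregateRegion` stands in for the code that would run next to each region's data, `mergeAverage` for the client-side merge, and `Partial` is a hypothetical result type. The point it illustrates is that each region should return a (sum, count) pair, because averaging the per-region averages gives the wrong answer when regions hold different numbers of rows.

```java
import java.util.Arrays;
import java.util.List;

class EndpointAverageSketch {

    /** Hypothetical partial result that one region would send back to the client. */
    static final class Partial {
        final long sum;
        final long count;

        Partial(long sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    /** "Server side": runs once per region, next to that region's data. */
    static Partial aggregateRegion(long[] regionValues) {
        long sum = 0;
        for (long v : regionValues) {
            sum += v;
        }
        return new Partial(sum, regionValues.length);
    }

    /** "Client side": merges the per-region partials into one table-wide average. */
    static double mergeAverage(List<Partial> partials) {
        long sum = 0;
        long count = 0;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        if (count == 0) {
            throw new IllegalArgumentException("no rows to average");
        }
        return (double) sum / count;
    }

    public static void main(String[] args) {
        List<Partial> partials = Arrays.asList(
                aggregateRegion(new long[] {1, 2, 3}),  // region A: sum 6, count 3
                aggregateRegion(new long[] {10}));      // region B: sum 10, count 1
        System.out.println(mergeAverage(partials));     // prints 4.0
    }
}
```

Averaging the two regional averages (2.0 and 10.0) would give 6.0 instead of 4.0, which is why the partial carries the count as well as the sum.
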

Starting with HBase 0.96, endpoint coprocessors are implemented using Google Protocol Buffers (protobuf). For more details on protobuf, see Google's Protocol Buffer Guide. Endpoint coprocessors written for version 0.94 are not compatible with version 0.96 or later (see HBASE-5448). To upgrade your HBase cluster from 0.94 or earlier to 0.96 or later, you need to reimplement your coprocessor.

In short, the coprocessor API changed when moving from HBase 0.94 to 0.96 and later, because 0.96 adopted protobuf.

Exercise: how would you find the top 10,000 out of 1 billion records?
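
One possible way to approach this exercise, sketched under assumptions rather than taken from the article: let each region compute its own top k with a bounded min-heap (the part an endpoint coprocessor could run next to the data), then have the client merge the per-region lists and take the top k again. The class and method names below are hypothetical, and the values are plain `long`s instead of HBase cells.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class TopNSketch {

    /** Returns the k largest values in descending order, using a min-heap of at most k entries. */
    static List<Long> topK(Iterable<Long> values, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        PriorityQueue<Long> heap = new PriorityQueue<>(k); // smallest kept value sits on top
        for (long v : values) {
            if (heap.size() < k) {
                heap.add(v);
            } else if (v > heap.peek()) {
                heap.poll();   // evict the current smallest
                heap.add(v);
            }
        }
        List<Long> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    /** "Client side": merges the per-region top-k lists into the global top k. */
    static List<Long> mergeTopK(List<List<Long>> perRegion, int k) {
        List<Long> candidates = new ArrayList<>();
        for (List<Long> regionTop : perRegion) {
            candidates.addAll(regionTop);
        }
        return topK(candidates, k);
    }

    public static void main(String[] args) {
        int k = 3;
        List<Long> regionA = topK(Arrays.asList(5L, 1L, 9L, 3L), k); // "server side", region A
        List<Long> regionB = topK(Arrays.asList(7L, 8L, 2L), k);     // "server side", region B
        System.out.println(mergeTopK(Arrays.asList(regionA, regionB), k)); // prints [9, 8, 7]
    }
}
```

With this split each region sends back at most k values, so the client merges regions × k candidates instead of scanning every row.
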

II. RegionObserver implementation:

package myAPI3;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.CoprocessorEnvironment;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;
import org.elasticsearch.client.Client;


import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;


public class DataSyncObserver extends BaseRegionObserver {
   

   private static Client client = null;
   private static final Log LOG = LogFactory.getLog(DataSyncObserver.class);


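   /**
    * Reads the arguments that were passed in through the HBase shell command.
    *
    * @param env the coprocessor environment whose Configuration carries those arguments
    */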
   private void readConfiguration(CoprocessorEnvironment env) {
   
       Configuration conf = env.getConfiguration();
       Config.clusterName = conf.get("es_cluster");
       Config.nodeHost = conf.get("es_host");
       Config.nodePort = conf.getInt("es_port", -1);
       Config.indexName = conf.get("es_index");
       Config.typeName = conf.get("es_type");

       //LOG.info("observer -- started with config: " + Config.getInfo());
   }


   @Override
   public void start(CoprocessorEnvironment env) throws IOException {
   
       LOG.info("-----------------------------------starting-------------------------------------------------------------------------------------"
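
The `readConfiguration` method in the listing reads `es_cluster`, `es_host`, `es_port`, `es_index`, and `es_type` from the coprocessor's Configuration. When a coprocessor is attached to a table from the HBase shell, the table attribute generally has the form `jarPath|className|priority|k1=v1,k2=v2`, and the trailing key=value pairs are what show up in that Configuration. The plain-Java sketch below only illustrates that mapping: HBase does the parsing itself, and `parseArgs`, `getInt`, the jar path, and every value in the sample string are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class CoprocessorArgsSketch {

    /** Hypothetical helper: extracts the k=v pairs from "jarPath|className|priority|k1=v1,k2=v2". */
    static Map<String, String> parseArgs(String attribute) {
        String[] fields = attribute.split("\\|", -1);
        Map<String, String> args = new HashMap<>();
        if (fields.length < 4 || fields[3].trim().isEmpty()) {
            return args; // no arguments were supplied
        }
        for (String pair : fields[3].split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                args.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return args;
    }

    /** Returns the value as an int, or defaultValue when the key is absent. */
    static int getInt(Map<String, String> args, String key, int defaultValue) {
        String raw = args.get(key);
        return raw == null ? defaultValue : Integer.parseInt(raw);
    }

    public static void main(String[] argv) {
        // Hypothetical attribute value; only the key names come from the listing above.
        String attribute = "hdfs:///hypothetical/observer.jar|myAPI3.DataSyncObserver|1001|"
                + "es_cluster=demo-cluster,es_host=localhost,es_port=9300,"
                + "es_index=demo_index,es_type=demo_type";
        Map<String, String> args = parseArgs(attribute);
        System.out.println(args.get("es_host"));           // prints localhost
        System.out.println(getInt(args, "es_port", -1));   // prints 9300
        System.out.println(getInt(args, "es_missing", -1)); // prints -1
    }
}
```

The `-1` default matches the `conf.getInt("es_port", -1)` call in the listing, which makes a missing port easy to detect before connecting to ES.
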