Contents:
I. Phoenix installation
II. Phoenix installation and connecting to HBase
III. Phoenix configuration
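I. Phoenix installation
1. Download the Phoenix tarball from the official site
Download URL: http://mirror.bit.edu.cn/apache/phoenix/
Our cluster runs HBase 1.1, so we download Phoenix 4.7 (the build for HBase 1.1).
2. Extract the Phoenix tarball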
tar -zxvf phoenix-4.7.0-HBase-1.1-bin.tar.gz
3. Copy the phoenix-*.jar files from the phoenix-4.7.0-HBase-1.1-bin/ directory into HBase's lib directory
cp phoenix-*.jar $HBASE_HOME/lib
4. Restart HBase
$HBASE_HOME/bin/stop-hbase.sh
$HBASE_HOME/bin/start-hbase.sh
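After the restart, it can help to confirm that the Phoenix jar actually landed in HBase's lib directory. The following is a minimal shell sketch, not part of the original procedure: the function name list_phoenix_jars is hypothetical, and it assumes HBASE_HOME is set as in the steps above.

```shell
# List the Phoenix jars found in an HBase lib directory.
# $1 (optional) = directory to inspect; defaults to $HBASE_HOME/lib.
list_phoenix_jars() {
  lib_dir="${1:-$HBASE_HOME/lib}"
  for jar in "$lib_dir"/phoenix-*.jar; do
    # When nothing matches, the glob stays literal, so test for existence.
    [ -e "$jar" ] && basename "$jar"
  done
  return 0
}
```

If the function prints nothing, step 3 did not take effect. Once the jar is listed, connecting with the sqlline.py script shipped in the Phoenix bin directory (for example bin/sqlline.py <zookeeper-host>:2181, where the ZooKeeper address is a placeholder) is a reasonable smoke test.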
II. Phoenix installation and connecting to HBase
1. Dependency to add to the project's pom.xml in the IDE
<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-spark</artifactId>
    <version>4.7.0-HBase-1.1</version>
    <scope>provided</scope>
</dependency>
2. On every Spark machine in the cluster, add the following to spark-defaults.conf (Phoenix 4.7 and later use phoenix-4.7.0-HBase-1.1-client-spark.jar; earlier versions use the plain client jar, named like phoenix-4.7.0-HBase-1.1-client.jar)
spark.driver.extraClassPath /spark/phoenix-client/lib/phoenix-4.7.0-HBase-1.1-client-spark.jar:/spark/phoenix-client/lib/libthrift-0.9.0.jar
spark.executor.extraClassPath /spark/phoenix-client/lib/phoenix-4.7.0-HBase-1.1-client-spark.jar:/spark/phoenix-client/lib/libthrift-0.9.0.jar
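Note: the official documentation only says to add phoenix-4.7.0-HBase-1.1-client-spark.jar, but in our working environment libthrift-0.9.0.jar also had to be added; otherwise the job fails with a class-not-found error.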
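Because the same classpath has to be valid on every Spark machine, a quick existence check on each node can catch a missing jar before a job fails. The following is a minimal shell sketch under that assumption; the function name missing_classpath_entries is hypothetical and is not part of Spark or Phoenix.

```shell
# Print every entry of a colon-separated classpath that does not exist on this machine.
# $1 = classpath string, e.g. the value of spark.driver.extraClassPath.
# The body runs in a subshell so the IFS and globbing changes do not leak out.
missing_classpath_entries() (
  IFS=:
  set -f   # disable globbing so entries are taken literally
  for entry in $1; do
    [ -e "$entry" ] || printf '%s\n' "$entry"
  done
)
```

Running it on each node with the value used above, for example missing_classpath_entries "/spark/phoenix-client/lib/phoenix-4.7.0-HBase-1.1-client-spark.jar:/spark/phoenix-client/lib/libthrift-0.9.0.jar", should print nothing when both jars are in place.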
III. Phoenix configuration
http://archive-primary.cloudera.com/cloudera-labs/
1. Integrating Phoenix with Spark:
Configure the following in spark-conf/spark-env.sh
SPARK_DIST_CLASSPATH="$SPARK_DIST_CLASSPATH:/opt/cloudera/parcels/CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000/lib/phoenix/phoenix-4.7.0-clabs-phoenix1.3.0-client.jar"
export SPARK_YARN_USER_ENV="CLASSPATH=$HADOOP_CONF_DIR"
2. Configuration required for Phoenix secondary indexes (in hbase-site.xml)
<property><name>hbase.regionserver.wal.codec</name><value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value></property>
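To confirm that a property such as the WAL codec above is really present in a given hbase-site.xml, a small lookup helper can be used. This is a minimal shell sketch, not an official tool: the function name hbase_property_value is hypothetical, it relies on grep -o (available in GNU, BSD, and BusyBox grep), and it treats the property name as a regular expression, so the dots match any character.

```shell
# Print the <value> of a named property from a Hadoop-style XML config file.
# $1 = path to the XML file, $2 = property name. Prints nothing if not found.
hbase_property_value() {
  # Join the file into one line so <name> and <value> may sit on separate lines.
  tr -d '\n' < "$1" \
    | grep -o "<name>$2</name>[[:space:]]*<value>[^<]*</value>" \
    | sed 's/.*<value>\(.*\)<\/value>/\1/'
}
```

For example, hbase_property_value "$HBASE_HOME/conf/hbase-site.xml" hbase.regionserver.wal.codec should print the IndexedWALEditCodec class name once the property is in place; the conf path is an assumption about where the file lives in a given installation.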
3. Configuration for mapping schemas to namespaces in Phoenix 4.8
<property><name>phoenix.schema.isNamespaceMappingEnabled</name><value>true</value></property>
What is namespace and benefits of mapping table to namespace?
A namespace is a logical grouping of tables analogous to a database in relational database systems. This abstraction lays the groundwork for upcoming multi-tenancy related features:
Quota Management - Restrict the amount of resources (i.e. regions, tables) a namespace can consume.
Namespace Security Administration - Provide another level of security administration for tenants.
Region server groups - A namespace/table can be pinned onto a subset of RegionServers, thus guaranteeing a coarse level of isolation.
Note: the latest Phoenix version currently available through Cloudera Manager is 4.7, which does not support this feature.
http://blog.cloudera.com/blog/2015/11/new-apache-phoenix-4-5-2-package-from-cloudera-labs/