1. Compiling Phoenix with Maven
Following the approach in 《Phoenix4.6适配CDH5.4》 and 《整合phoenix4.6.0-HBase-1.0到cdh5.4.7 编译phoenix4.6源码 RegionServer 宕机》, I was able to rebuild Phoenix successfully against 5.4.x, but when I changed 5.4.x to 5.5.2 in pom.xml the build failed, and I never figured out why.
《Phoenix安装配置》 targets cdh5.5.1, which is already very close to cdh5.5.2, but you still have to replace cdh5.5.1 with cdh5.5.2 in every pom.xml under the phoenix-for-cloudera-4.6-HBase-1.0-cdh5.5 directory and all of its subdirectories. Then open a cmd prompt, cd into phoenix-for-cloudera-4.6-HBase-1.0-cdh5.5, and run mvn package -DskipTests. (The build output is omitted here.)
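If you'd rather not edit each pom.xml by hand, the version swap can be scripted. A minimal Python sketch (the version strings and the example directory are just the ones from this walkthrough; point it at your own source tree):

```python
import os

def replace_in_poms(root, old="cdh5.5.1", new="cdh5.5.2"):
    """Rewrite every pom.xml under root, swapping the CDH version string."""
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        if "pom.xml" not in files:
            continue
        path = os.path.join(dirpath, "pom.xml")
        with open(path, encoding="utf-8") as f:
            text = f.read()
        if old in text:
            with open(path, "w", encoding="utf-8") as f:
                f.write(text.replace(old, new))
            changed.append(path)
    return changed

# e.g. replace_in_poms(r"phoenix-for-cloudera-4.6-HBase-1.0-cdh5.5")
```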
Phoenix download page: https://archive.apache.org/dist/phoenix/
After the build succeeds, do not install Phoenix the way the reference article describes; at least, I ran into problems when I followed it.
2. My own installation steps
Take the tarball produced by the Maven rebuild for cdh5.5.2 — in my case C:\Users\Administrator\Desktop\Maven\Phoenix4.6适配CDH5.5.2\phoenix-for-cloudera-4.6-HBase-1.0-cdh5.5\phoenix-assembly\target\phoenix-4.6.0-cdh5.5.2-all.tar.gz (use your own build directory) — and extract it into the hadoop user's home directory. Before doing this, install zookeeper-3.4.5-cdh5.5.2 (you cannot use the ZooKeeper bundled with HBase).
[hadoop@h40 ~]$ tar -zxvf phoenix-4.6.0-cdh5.5.2-all.tar.gz
[hadoop@h40 ~]$ rm phoenix-4.6.0-cdh5.5.2-all.tar.gz
# Set the environment variables:
[hadoop@h40 ~]$ su - root
[root@h40 ~]# vi /etc/profile
export PHOENIX_HOME=/home/hadoop/phoenix-4.6.0-cdh5.5.2
export PHOENIX_CLASSPATH=$PHOENIX_HOME
export PATH=$PATH:$PHOENIX_HOME/bin
Go into the Phoenix install directory, find phoenix-4.6.0-cdh5.5.2-server.jar, and copy this jar into HBase's lib directory on every node in the cluster (the master node included):
[hadoop@h40 phoenix-4.6.0-cdh5.5.2]$ cp phoenix-4.6.0-cdh5.5.2-server.jar /home/hadoop/hbase-1.0.0-cdh5.5.2/lib/
[hadoop@h40 phoenix-4.6.0-cdh5.5.2]$ scp phoenix-4.6.0-cdh5.5.2-server.jar h72:/home/hadoop/hbase-1.0.0-cdh5.5.2/lib/
[hadoop@h40 phoenix-4.6.0-cdh5.5.2]$ scp phoenix-4.6.0-cdh5.5.2-server.jar h73:/home/hadoop/hbase-1.0.0-cdh5.5.2/lib/
(Some writeups say you must replace the hbase-site.xml under the Phoenix install directory with the one from HBase's conf directory, and also copy hdfs-site.xml across (http://www.aboutyun.com/forum.php?mod=viewthread&tid=17363&page=1&extra=#pid151683), or edit the hbase-site.xml under the Phoenix install directory. I did none of this and the installation still worked.)
Now restart HBase and go into the bin directory under the Phoenix install directory:
[hadoop@h40 ~]$ cd phoenix-4.6.0-cdh5.5.2/bin/
[hadoop@h40 bin]$ ./sqlline.py h71,h72,h73:2181
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:h71,h72,h73:2181 none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:h71,h72,h73:2181
17/04/17 17:21:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connected to: Phoenix (version 4.6)
Driver: PhoenixEmbeddedDriver (version 4.6)
Autocommit status: true
Transaction isolation: TRANSACTION_READ_COMMITTED
Building list of tables and columns for tab-completion (set fastconnect to true to skip)...
85/85 (100%) Done
Done
sqlline version 1.1.8
0: jdbc:phoenix:h71,h72,h73:2181> !tables
+------------------------------------------+------------------------------------------+------------------------------------------+-------------------------------------+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE |
+------------------------------------------+------------------------------------------+------------------------------------------+-------------------------------------+
| | SYSTEM | CATALOG | SYSTEM TABLE |
| | SYSTEM | FUNCTION | SYSTEM TABLE |
| | SYSTEM | SEQUENCE | SYSTEM TABLE |
| | SYSTEM | STATS | SYSTEM TABLE |
+------------------------------------------+------------------------------------------+------------------------------------------+-------------------------------------+
Later I discovered a problem when creating an index:
References:
《apach hadoop2.6 集群利用Phoenix 4.6-hbase 批量导入并自动创建索引》 (the index-creation command in this post is wrong: create index index_pupulation on population(city,state);
should be create index index_uspo on uspo(city,state);
)
《整合phoenix4.6.0-HBase-1.0到cdh5.4.7 编译phoenix4.6源码 RegionServer 宕机》
《Phoenix(sql on hbase)简介》 (I think this last article is the better one)
You have to edit hbase-site.xml on every node (the one under HBase's conf directory, not the one under the Phoenix install directory) and add the following:
[hadoop@h40 phoenix-4.6.0-HBase-1.0-bin]$ vi /home/hadoop/hbase-1.0.0/conf/hbase-site.xml
<property>
<name>hbase.regionserver.wal.codec</name>
<value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
(In my later testing, this first property alone was enough; I don't know what the remaining ones do, but to be safe you may as well add them all:)
<property>
<name>hbase.region.server.rpc.scheduler.factory.class</name>
<value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
<description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
<property>
<name>hbase.rpc.controllerfactory.class</name>
<value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
<description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
<property>
<name>hbase.coprocessor.regionserver.classes</name>
<value>org.apache.hadoop.hbase.regionserver.LocalIndexMerger</value>
</property>
<property>
<name>hbase.master.loadbalancer.class</name>
<value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
</property>
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
</property>
Otherwise creating an index fails with the following error:
Error: ERROR 1029 (42Y88): Mutable secondary indexes must have the hbase.regionserver.wal.codec property set to org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec in the hbase-sites.xml of every region server tableName=INDEX_USPO (state=42Y88,code=1029)
java.sql.SQLException: ERROR 1029 (42Y88): Mutable secondary indexes must have the hbase.regionserver.wal.codec property set to org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec in the hbase-sites.xml of every region server tableName=INDEX_USPO
at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:396)
at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
at org.apache.phoenix.schema.MetaDataClient.createIndex(MetaDataClient.java:1162)
at org.apache.phoenix.compile.CreateIndexCompiler$1.execute(CreateIndexCompiler.java:95)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:322)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:314)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:312)
at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1435)
at sqlline.Commands.execute(Commands.java:822)
at sqlline.Commands.sql(Commands.java:732)
at sqlline.SqlLine.dispatch(SqlLine.java:808)
at sqlline.SqlLine.begin(SqlLine.java:681)
at sqlline.SqlLine.start(SqlLine.java:398)
at sqlline.SqlLine.main(SqlLine.java:292)
3. Testing Phoenix after the steps above
Open the Phoenix command line:
0: jdbc:phoenix:h40,h41,h42:2181> CREATE TABLE IF NOT EXISTS USPO (
. . . . . . . . . . . . . . . . > state CHAR(2) NOT NULL,
. . . . . . . . . . . . . . . . > city VARCHAR NOT NULL,
. . . . . . . . . . . . . . . . > population BIGINT CONSTRAINT my_pk PRIMARY KEY (state,city));
No rows affected (0.594 seconds)
0: jdbc:phoenix:h40,h41,h42:2181> create index index_uspo on uspo(city,state);
No rows affected (1.235 seconds)
0: jdbc:phoenix:h40,h41,h42:2181> !tables
+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+--------------------+
| TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE | REMARKS | TYP |
+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+--------------------+
| | | INDEX_USPO | INDEX | | |
| | SYSTEM | CATALOG | SYSTEM TABLE | | |
| | SYSTEM | FUNCTION | SYSTEM TABLE | | |
| | SYSTEM | SEQUENCE | SYSTEM TABLE | | |
| | SYSTEM | STATS | SYSTEM TABLE | | |
| | | USPO | TABLE | | |
+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+--------------------+
Upload the test data to HDFS:
[hadoop@h40 ~]$ vi uopu.csv
NY,New York,8143197
CA,Los Angeles,3844829
IL,Chicago,2842518
TX,Houston,2016582
PA,Philadelphia,1463281
AZ,Phoenix,1461575
TX,San Antonio,1256509
CA,San Diego,1255540
TX,Dallas,1213825
CA,San Jose,912332
[hadoop@h40 ~]$ hadoop fs -put uopu.csv /
Run the bulk-load command:
[hadoop@h40 ~]$ hadoop jar /home/hadoop/phoenix-4.6.0-HBase-1.0-bin/phoenix-4.6.0-HBase-1.0-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -t uspo -i /uopu.csv -z h40,h41,h42:2181
(This command works fine with the official Phoenix binary install, but on my Maven rebuild for cdh5.5.2 it fails with this error:
Error: java.lang.ClassNotFoundException: org.apache.commons.csv.CSVFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$CsvLineParser.<init>(CsvToKeyValueMapper.java:282)
at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.setup(CsvToKeyValueMapper.java:142)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
On a hunch I then copied phoenix-4.6.0-cdh5.5.2-client.jar into /home/hadoop/hbase-1.0.0-cdh5.5.2/lib on the master node, reran the command, and it worked.)
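A jar is just a zip archive, so before deciding which jar to copy where, you can check directly whether a given jar actually bundles the missing org.apache.commons.csv.CSVFormat class. A small diagnostic sketch (the jar path in the comment is only an example):

```python
import zipfile

def jar_contains(jar_path, class_name):
    """Return True if the jar bundles the given class, e.g.
    'org.apache.commons.csv.CSVFormat' -> entry 'org/apache/commons/csv/CSVFormat.class'."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

# e.g. jar_contains("phoenix-4.6.0-cdh5.5.2-client.jar",
#                   "org.apache.commons.csv.CSVFormat")
```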
0: jdbc:phoenix:h40,h41,h42:2181> select * from index_uspo;
+------------------------------------------+--------+
| :CITY | :STATE |
+------------------------------------------+--------+
| Chicago | IL |
| Dallas | TX |
| Houston | TX |
| Los Angeles | CA |
| New York | NY |
| Philadelphia | PA |
| Phoenix | AZ |
| San Antonio | TX |
| San Diego | CA |
| San Jose | CA |
+------------------------------------------+--------+
0: jdbc:phoenix:h40,h41,h42:2181> select * from uspo;
+-------+------------------------------------------+------------------------------------------+
| STATE | CITY | POPULATION |
+-------+------------------------------------------+------------------------------------------+
| AZ | Phoenix | 1461575 |
| CA | Los Angeles | 3844829 |
| CA | San Diego | 1255540 |
| CA | San Jose | 912332 |
| IL | Chicago | 2842518 |
| NY | New York | 8143197 |
| PA | Philadelphia | 1463281 |
| TX | Dallas | 1213825 |
| TX | Houston | 2016582 |
| TX | San Antonio | 1256509 |
+-------+------------------------------------------+------------------------------------------+
(The reference blog ends by saying that querying the uspo data in Phoenix returned nothing; I have no idea how they got that result.)
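Note the two orderings above: the uspo query comes back by (STATE, CITY), the table's primary key, while the index_uspo query comes back by (CITY, STATE), the index's key, because in both cases Phoenix simply scans rows in row-key order. Sorting the sample CSV rows by each key reproduces both outputs:

```python
# The ten rows from uopu.csv as (state, city, population) tuples.
rows = [
    ("NY", "New York", 8143197), ("CA", "Los Angeles", 3844829),
    ("IL", "Chicago", 2842518), ("TX", "Houston", 2016582),
    ("PA", "Philadelphia", 1463281), ("AZ", "Phoenix", 1461575),
    ("TX", "San Antonio", 1256509), ("CA", "San Diego", 1255540),
    ("TX", "Dallas", 1213825), ("CA", "San Jose", 912332),
]

# uspo scans in primary-key order, index_uspo in index-key order.
table_order = sorted(rows, key=lambda r: (r[0], r[1]))  # (state, city)
index_order = sorted(rows, key=lambda r: (r[1], r[0]))  # (city, state)
```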
hbase(main):003:0> scan 'INDEX_USPO'
ROW COLUMN+CELL
Chicago\x00IL column=0:_0, timestamp=1492430146051, value=
Dallas\x00TX column=0:_0, timestamp=1492430146051, value=
Houston\x00TX column=0:_0, timestamp=1492430146051, value=
Los Angeles\x00CA column=0:_0, timestamp=1492430146051, value=
New York\x00NY column=0:_0, timestamp=1492430146051, value=
Philadelphia\x00PA column=0:_0, timestamp=1492430146051, value=
Phoenix\x00AZ column=0:_0, timestamp=1492430146051, value=
San Antonio\x00TX column=0:_0, timestamp=1492430146051, value=
San Diego\x00CA column=0:_0, timestamp=1492430146051, value=
San Jose\x00CA column=0:_0, timestamp=1492430146051, value=
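The INDEX_USPO row keys above show how Phoenix composes a multi-column key: a variable-length component that is not last (here CITY, a VARCHAR) is terminated with a \x00 separator byte, then the next component follows. A sketch of that layout (my own reconstruction for this two-column case, not Phoenix's actual encoder):

```python
def index_rowkey(city, state):
    """INDEX_USPO key: VARCHAR city, a \x00 separator, then the CHAR(2) state.
    The last component needs no separator."""
    return city.encode("ascii") + b"\x00" + state.encode("ascii")
```

This reproduces the keys in the scan, e.g. Chicago\x00IL.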
hbase(main):002:0> scan 'USPO'
ROW COLUMN+CELL
AZPhoenix column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00\x16MG
AZPhoenix column=0:_0, timestamp=1492430139280, value=
CALos Angeles column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00:\xAA\xDD
CALos Angeles column=0:_0, timestamp=1492430139280, value=
CASan Diego column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00\x13(t
CASan Diego column=0:_0, timestamp=1492430139280, value=
CASan Jose column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00\x0D\xEB\xCC
CASan Jose column=0:_0, timestamp=1492430139280, value=
ILChicago column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00+_\x96
ILChicago column=0:_0, timestamp=1492430139280, value=
NYNew York column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00|A]
NYNew York column=0:_0, timestamp=1492430139280, value=
PAPhiladelphia column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00\x16S\xF1
PAPhiladelphia column=0:_0, timestamp=1492430139280, value=
TXDallas column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00\x12\x85\x81
TXDallas column=0:_0, timestamp=1492430139280, value=
TXHouston column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00\x1E\xC5F
TXHouston column=0:_0, timestamp=1492430139280, value=
TXSan Antonio column=0:POPULATION, timestamp=1492430139280, value=\x80\x00\x00\x00\x00\x13,=
TXSan Antonio column=0:_0, timestamp=1492430139280, value=
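The USPO scan is also readable once you know two encodings. The row key is the fixed-width CHAR(2) state followed immediately by the city (no separator needed, since CHAR(2) always occupies two bytes), and the POPULATION cells are BIGINTs stored as 8 big-endian bytes with the sign bit flipped so that unsigned byte order matches numeric order, which is why every positive value starts with \x80. A sketch of both (my reconstruction, not Phoenix's code):

```python
import struct

def uspo_rowkey(state, city):
    """USPO primary key: CHAR(2) state, then city; fixed-width CHAR needs no separator."""
    return state.encode("ascii") + city.encode("ascii")

def encode_bigint(v):
    """Phoenix-style BIGINT: 8-byte big-endian two's complement
    with the most significant (sign) bit flipped."""
    b = bytearray(struct.pack(">q", v))
    b[0] ^= 0x80
    return bytes(b)
```

For example, encode_bigint(1461575) yields b'\x80\x00\x00\x00\x00\x16MG', exactly the value shown for AZPhoenix above (0x4D prints as 'M', 0x47 as 'G').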