- 作者:中兴大数据 |
http://www.icaijing.com/hot/article4940159/
Phoenix二级索引那些事儿(下)
索引配置 公共配置 hbase-site.xml hbase.regionserver.wal.codec org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec 全局索引配置 hbase-site.xml配置项 支持(HBase0.98.4+ and Phoenix 4.3.1+ only) hbase.region.server.rpc.scheduler.factory.class org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory Factory to create thePhoenix RPC Scheduler that uses separate queues for index and metadataupdates hbase.rpc.controllerfactory.class org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory Factory to create thePhoenix RPC Scheduler that uses separate queues for index and metadataupdates 局部索引配置 hbase-site.xml配置项 hbase.master.loadbalancer.class org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer hbase.coprocessor.master.classes org.apache.phoenix.hbase.index.master.IndexMasterObserver hbase.coprocessor.regionserver.classes org.apache.hadoop.hbase.regionserver.LocalIndexMerger 索引配置调优 hbase-site.xml配置项 index.builder.threads.max Default: 10
index.builder.threads.keepalivetime Default: 60
index.writer.threads.max Default: 10
index.writer.threads.keepalivetime Default: 60
hbase.htable.threads.max Default: 2,147,483,647
hbase.htable.threads.keepalivetime Default: 60
index.tablefactory.cache.size Default: 10
org.apache.phoenix.regionserver.index.priority.min Default: 1000
org.apache.phoenix.regionserver.index.priority.max Default: 1050
org.apache.phoenix.regionserver.index.handler.count Default: 30
其它功能 Phoenix子查询 IN和Not In的子查询 例子 SELECT ItemName FROM Items WHERE ItemID IN (SELECT ItemID FROM Orders WHERE Date >= to_date('2013/09/02')); Exists和Not Exists的子查询 例子 SELECT ItemName FROM Items i WHERE EXISTS (SELECT * FROM Orders WHERE Date >= to_date('2013/09/02') AND ItemID = i.ItemID); 半连接、反连接、join 例子 SELECTd.dept_id,e.dept_id,e.name FROM DEPT d JOIN EMPL e ON e.dept_id = d.dept_id; JOIN支持: INNER LEFT OUTER RIGHT 比较运算 例子 SELECT ID, Name FROM Contest WHERE Score > (SELECT avg(Score) FROM Contest) ORDER BY ScoreDESC; ANY/SOME/ALL运算 例子 SELECT OrderID FROM Orders WHERE quantity>= ANY (SELECT max(quantity) FROM Orders GROUP BY ItemID); 相关子查询 例子 SELECT PatentID,Title FROM Patents p WHERE FileDate , LIMIT SELECT title,author, isbn, description FROM library WHEREpublished_date > 2010 AND (title, author,isbn) > (?, ?, ?) ORDER BY title,author, isbn LIMIT 20 Phoenix 统计收集 统计收集有助于提升query性能。 命令: UPDATE STATISTICSmy_table 等效于 UPDATE STATISTICSmy_table ALL 如果只收集index或者column UPDATE STATISTICSmy_table INDEX UPDATE STATISTICSmy_table COLUMNS 参数配置 phoenix.stats.guidepost.width默认104857600 phoenix.stats.guidepost.per.region phoenix.stats.updateFrequency默认900000 (15 mins) phoenix.stats.minUpdateFrequency默认7.5 mins phoenix.stats.useCurrentTime 默认true Phoenix 常用语法 Commands Other Grammar Phoenix 常用函数 Aggregate Functions String Functions Time and Date Functions Numeric Functions Array Functions Other Functions Phoenix 数据类型 Phoenix 使用任意时间戳 在Property里面设置属性 \"CurrentSCN\"。ts是一个long。 Properties props = new Properties(); props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts)); Connection conn = DriverManager.connect(myUrl, props); conn.createStatement().execute(\"UPSERT INTO myTable VALUES ('a')\"); conn.commit(); 相当于: myTable.put(Bytes.toBytes('a'),ts); Phoenix 性能提升 1、加盐: 加盐可以将数据存入多个region里,从而提升读写性能。 CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SALT_BUCKETS=42 如果有16台region server,每台server有4核CPU,则SALT_BUCKETS 设置为32~64之间。即如果集群总的CPU核数为N,则SALT_BUCKETS为 0.5N ~ N 之间。 加盐后的注意事项:
2、split 如果不想通过加盐来分区,可以自己手动设置分区的方法。这样可以不引入额外的byte,或者改变row key的顺序,例子 CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) SPLIT ON ('CS','EU','NA') 3、使用多个列族 CREATE TABLE TEST (MYKEY VARCHAR NOT NULL PRIMARY KEY, A.COL1 VARCHAR, A.COL2 VARCHAR, B.COL3 VARCHAR) 4、使用压缩 CREATE TABLE TEST (HOST VARCHAR NOT NULL PRIMARY KEY, DESCRIPTION VARCHAR) COMPRESSION='GZ' 5、使用二级索引 6、优化集群 7、优化phoenix 参数 Phoenix 跳过SCAN The List for SkipScanFilter forthe above query would be [ [ [ a - b ], [ d - e ] ], [ 1, 2 ] ] where [ [ a - b], [ d - e ] ] is the range for KEY1and [ 1, 2 ] keys for KEY2. 例子 SELECT * from T WHERE ((KEY1>='a' AND KEY1 'c' AND KEY1 |