Phoenix Tips (6) 索引基础

最新推荐文章于 2021-11-10 17:00:45 发布

weixin_30919429

最新推荐文章于 2021-11-10 17:00:45 发布

阅读量92

点赞数

文章标签：大数据

原文链接：http://www.cnblogs.com/leeeee/p/7276373.html

版权

1、immutable Index

原文：Immutable indexing targets use cases that are write once, append only; this is common in time-series data, where you log once, but read multiple times. In this case, the indexing is managed entirely on the client - either we successfully write all the primary and index data or we return a failure to the client. Since once written, rows are never updated, no incremental index maintenance is required making them perform very well. This reduces the overhead of secondary indexing at write time. However, keep in mind that immutable indexing are only applicable in a limited set of use cases.

One restriction of immutable indexes is that rows from the data table may not be deleted. Instead, the only way to delete rows is to drop the entire data table.

Immutable 索引适用于一次写入，数据只添加不修改的情况，例如时间序列数据。因为只需要一次写入，数据行不会更新，不需要额外的索引维护，所以性能非常好。

例子：

CREATE TABLE my_table (k VARCHAR PRIMARY KEY, v1 VARCHAR, v2 VARCHAR , v3 VARCHAR) IMMUTABLE_ROWS=true;

CREATE INDEX my_index ON my_table (v2 DESC, v1) INCLUDE (v3);

2.Global Indexing (Mutable)

原文：Global indexing targets read heavy, low write uses cases. With global indexes, all the performance penalties for indexes occur at write time. We intercept the data table updates on write (DELETE, UPSERT VALUES and UPSERT SELECT), build the index update and then sent any necessary updates to all interested index tables. At read time, Phoenix will select the index table to use that will produce the fastest query time and directly scan it just like any other HBase table. Note, however, if a column is referenced in a query that isn’t part of the index, the index will not be used for that query.

全局索引适用于高数据量读取，低数据量写入的情况。全局索引的消耗主要在索引写入的时候，在对数据表进行写操作的时候，同时更新所有的索引表。在读取的时候，将直接从索引表扫描读取数据，如果读取另外一个表一样。如果query中某个列在索引表中不存在，全局索引将不会用到。

CREATE TABLE my_table (k VARCHAR PRIMARY KEY, v1 VARCHAR, v2 VARCHAR , v3 VARCHAR)

CREATE INDEX my_index ON my_table (v2 DESC, v1) INCLUDE (v3);

3.Local Indexing (Mutable)

原文： Local indexing targets write heavy, space constrained use cases. With local indexes index data and table data are co-reside at same server so no network overhead during writes and reads. Local indexes can be used even when the query isn’t fully covered i.e. Phoenix automatically retrieve the columns not in the index through point gets against the data table. Unlike global indexes all local indexes data of a table are stored in a separate shared table. At read time when the local index is used, every region must be examined for the data as the exact region location of index data cannot be predetermined which incurs some overhead.

局部索引适用于大数据量写、空间受限的情况下。使用局部索引，索引数据和表数据将同时放在同一台server上，所以在读写的时候不会有网络通信的开销。如果query中某个列在索引表中不存在，局部索引也能用到。不同于全局索引，一个表的全部局部索引的数据保存在同一个共享表中。在读取的时候，每一个 region 都必须检查数据，因为索引数据的确切的区域位置无法预先，确定这会增加一些系统开销。

例子：

CREATE TABLE my_table (k VARCHAR PRIMARY KEY, v1 VARCHAR, v2 VARCHAR , v3 VARCHAR)

CREATE LOCAL INDEX my_index ON my_table (v2 DESC, v1) INCLUDE (v3);

4、不会使用到二级索引的情况

创建表：create table usertable (id varchar primary key, firstname varchar, lastname varchar);

创建全局索引：create index idx_name on usertable (firstname);

检索：select id, firstname, lastname from usertable where firstname = 'foo';

不会使用到索引 idx_name。若要使用到，必须这样：

create idx_name on usertable (firstname)include(lastname);

5、不会使用主键索引情况：

创建表：CREATE TABLE TEST (pk1 char(1) not null, pk2 char(1) not null, pk3 char(1) not null, non-pk varchar CONSTRAINT PK PRIMARY KEY(pk1, pk2, pk3) );

不会使用到索引的检索：select * from test where pk2='x' and pk3='y'

会使用到索引的检索：select * from test where pk1='x' and pk2='y'

转载于:https://www.cnblogs.com/leeeee/p/7276373.html