WiredTiger from a user's perspective -- Storage options (1) -- Schema, column groups, indices and projections

Latest recommended article published at 2024-02-05 21:58:31

caixinGO


Category: wiredtiger

Article link: https://blog.csdn.net/gongcaixin/article/details/104740178


This article covers Schema, Columns, Column Groups, Indices and Projections.
Besides simple key/value tables, WiredTiger also supports tables with a schema.

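Tables, rows and columns: WiredTiger supports both row-oriented and column-oriented storage. Row-oriented storage keeps all columns of a row together as one unit, one row after another; column-oriented storage keeps the values of a column together as one unit, one column after another. The two can also be mixed. A table can be split into one or more column groups; every column of the table must appear in at least one column group, and the key of each column group is the table's primary key (the unique key). A table can have zero or more indices, which make it fast to look up records in an order other than that of the primary key.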
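To make the difference between the two layouts concrete, here is a minimal in-memory sketch. It does not use WiredTiger at all: the struct, the arrays, the function names and the sample values are all made up for illustration, and WiredTiger's on-disk layout is more involved than this.

```c
#include <stddef.h>
#include <stdint.h>

/* Row-oriented layout: all columns of one row sit next to each other. */
struct pop_row {
    uint16_t year;
    uint64_t population;
};

/* Illustrative (made-up) values; the column layout keeps one array per column. */
static const struct pop_row ROWS[] = {{1900, 10}, {1950, 20}, {2000, 30}};
static const uint64_t POPULATION_COLUMN[] = {10, 20, 30};

/* Row layout: reading one column still walks over every whole row. */
static uint64_t sum_population_rows(const struct pop_row *rows, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; ++i)
        total += rows[i].population;
    return total;
}

/* Column layout: reading one column touches only that column's array. */
static uint64_t sum_population_column(const uint64_t *population, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; ++i)
        total += population[i];
    return total;
}
```

Both functions return the same sum; the point is only which bytes have to be read to compute it.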

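Column types: by default WiredTiger works as a traditional key/value store, in which keys and values are raw byte arrays read and written through the WT_ITEM structure. Alternatively, one or more typed columns of a table can be designated as the key or the value.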

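Key/Value pairs: the underlying WiredTiger files store key/value pairs, and a file cursor can traverse them directly. An index, for example, is stored as key/value pairs in its own file; reading that file directly avoids the errors that would otherwise arise when the index is inconsistent with the table data (for performance reasons, an index can be configured so that updates do not rewrite its entries). In a row-store, keys and values are variable-length byte strings (each at most 4GB - 512B); in a column-store, the key is a 64-bit record number and the value is either a variable-length byte string (at most 4GB - 512B) or a fixed-length bit field of 1 to 8 bits.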

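Format types: a format string similar to those of Python's struct module describes the type of each column in a table. The 'r' type is used for record-number keys in column-stores; otherwise it is identical to the 'Q' type, and both denote a uint64_t. The remaining types are omitted here.
Packing and Unpacking Data describes how data is packed according to the format string. The packed form sorts naturally in lexicographic order and uses variable-sized encoding of values to reduce the data size. Data can be packed and unpacked with wiredtiger_struct_pack and wiredtiger_struct_unpack (see ex_pack.c for an example).
The WT_COLLATOR struct provides an interface for custom ordering of the records in a table.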
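Why does it matter that packed data sorts lexicographically? Because keys can then be ordered by a plain byte-wise comparison. The sketch below shows the idea with a fixed-width big-endian encoding. This is not WiredTiger's actual packing format (which is variable-sized), and pack_u32_be and compare_packed are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: encode a uint32_t as 4 bytes, most significant first. */
static void pack_u32_be(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Order two integers by comparing only their packed bytes. */
static int compare_packed(uint32_t a, uint32_t b)
{
    unsigned char pa[4], pb[4];

    pack_u32_be(a, pa);
    pack_u32_be(b, pb);
    return memcmp(pa, pb, sizeof(pa));
}
```

With this encoding, compare_packed(1, 256) is negative, matching the numeric order. A little-endian encoding would put 256 (bytes 00 01 00 00) before 1 (bytes 01 00 00 00), which is why the byte order of the packed form matters.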

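Key and value formats: as noted under Column types, one or more columns can make up the key or the value. The column types are set by passing key_format and value_format to WT_SESSION::create. For example, a row-store table whose key and value are each a single variable-length string can be created as follows: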

    error_check(session->create(session, "table:mytable", "key_format=S,value_format=S"));

A column-store table whose value is a single variable-length string can be created as follows:

    error_check(session->create(session, "table:mytable", "key_format=r,value_format=S"));

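Cursor formats: a cursor's key has the same format as the table's key. WT_CURSOR::set_key sets the cursor's key and WT_CURSOR::get_key reads it. Both take one argument per column in the key, so the argument count varies from table to table; set_key is used much like printf and get_key much like scanf. A cursor's value has the same format as the table's value, unless a projection was given to WT_SESSION::open_cursor. WT_CURSOR::set_value sets the cursor's value and WT_CURSOR::get_value reads it, in the same way as the corresponding key functions.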
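Since the number of arguments follows from the format string, it can help to count the columns a format describes. The following is a simplified sketch, not WiredTiger code: count_columns is a hypothetical helper, and it assumes that every numeric prefix is a size attached to the next type character (as in "5s"). It does not handle repeat counts or padding.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the columns in a simple format string.
 * Digits are treated as a size prefix of the following type character,
 * so every non-digit character counts as one column.
 */
static size_t count_columns(const char *format)
{
    size_t n = 0;

    for (const char *p = format; *p != '\0'; ++p) {
        if (isdigit((unsigned char)*p))
            continue; /* size prefix, belongs to the next type character */
        ++n;
    }
    return n;
}
```

Under these assumptions, the format "5sHQ" has three columns, so get_value on a cursor with that value format takes three output arguments, while a key format of "r" takes a single one.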

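Columns: column names are given by passing the columns configuration to WT_SESSION::create. Names are assigned first to the columns in key_format and then to the columns in value_format. Every column must be named, and names must be unique. For example, to create a column-store table: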

    /*
     * Create a table with columns: keys are record numbers, values are
     * (string, signed 32-bit integer, unsigned 16-bit integer).
     */
    error_check(session->create(session, "table:mytable",
      "key_format=r,value_format=SiH,"
      "columns=(id,department,salary,year-started)"));

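Once a table has been created, WT_SESSION::create need not be called again. It can still be called to verify that the table exists and that its schema matches. This is controlled by the exclusive configuration: when it is false (the default) and the table already exists, the call checks that the configuration matches; when it is true, the call fails if the table already exists.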

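Column groups: once column names are in use, column groups can be configured. Column groups mainly define the storage layout in order to tune cache behavior, because each column group is stored separately in its own file. Creating column groups takes two steps: 1. when creating the table with WT_SESSION::create (URI table:table_name), list the names of the column groups in the colgroups configuration; 2. for each column group, call WT_SESSION::create with the URI colgroup:table_name:column_group_name and a columns configuration naming the columns in that group. Every column must appear in at least one column group; a column may appear in several, in which case it is stored several times. An example: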

    /*
     * Create the population table. Keys are record numbers, the format for values is (5-byte
     * string, uint16_t, uint64_t). See ::wiredtiger_struct_pack for details of the format strings.
     */
    error_check(session->create(session, "table:poptable",
      "key_format=r,"
      "value_format=5sHQ,"
      "columns=(id,country,year,population),"
      "colgroups=(main,population)"));
    /*
     * Create two column groups: a primary column group with the country code, year and population
     * (named "main"), and a population column group with the population by itself (named
     * "population").
     */
    error_check(
      session->create(session, "colgroup:poptable:main", "columns=(country,year,population)"));
    error_check(session->create(session, "colgroup:poptable:population", "columns=(population)"));

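The key of every column group is always the table's key. This is particularly useful for column-store tables: the record number (the key) is not stored explicitly on disk, so keys are not duplicated across files. In a row-store, the key is stored again in every column group. A cursor on a column group is opened by passing the column group's URI to WT_SESSION::open_cursor; in this example the URIs are colgroup:poptable:main and colgroup:poptable:population.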
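The rule that every column must appear in at least one column group can be written down as a small check. This is a standalone sketch, not part of the WiredTiger API: contains and all_columns_covered are hypothetical helpers, and the arrays simply restate the column names from the poptable example (INCOMPLETE_COLUMNS is a made-up counterexample that leaves out year).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Value columns of poptable, and the columns listed by its column groups. */
static const char *const VALUE_COLUMNS[] = {"country", "year", "population"};
static const char *const GROUPED_COLUMNS[] = {
  "country", "year", "population", /* colgroup "main" */
  "population"                     /* colgroup "population" */
};
/* Made-up counterexample: no column group stores "year". */
static const char *const INCOMPLETE_COLUMNS[] = {"country", "population"};

/* Return true if name appears in the n-element list names. */
static bool contains(const char *const *names, size_t n, const char *name)
{
    for (size_t i = 0; i < n; ++i)
        if (strcmp(names[i], name) == 0)
            return true;
    return false;
}

/* Return true if every value column is stored by at least one column group. */
static bool all_columns_covered(const char *const *value_cols, size_t nvalue,
  const char *const *grouped_cols, size_t ngrouped)
{
    for (size_t i = 0; i < nvalue; ++i)
        if (!contains(grouped_cols, ngrouped, value_cols[i]))
            return false;
    return true;
}
```

Note that population appears twice in GROUPED_COLUMNS: that is allowed, and simply means the column is stored in two files.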

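Indices: column names are also used to create and configure indices. Indices are updated automatically whenever the table is updated. A table's indices are read-only; applications cannot update them directly. An index is created by calling WT_SESSION::create with the URI index:table_name:index_name and a columns configuration naming one or more columns. For example: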

    /* Create an index with a simple key. */
    error_check(session->create(session, "index:poptable:country", "columns=(country)"));
    /* Create an index with a composite key (country,year). */
    error_check(session->create(session, "index:poptable:country_plus_year", "columns=(country,year)"));

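An index is opened by passing its URI to WT_SESSION::open_cursor. The index cursor's key consists of the column or columns named in the columns configuration when the index was created, and its value is the table's value. For example: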

    /* Search in a composite index. */
    error_check(session->open_cursor(session, "index:poptable:country_plus_year", NULL, NULL, &cursor));
    cursor->set_key(cursor, "USA\0\0", (uint16_t)1900);
    error_check(cursor->search(cursor));
    error_check(cursor->get_value(cursor, &country, &year, &population));
    printf("US 1900: country %s, year %" PRIu16 ", population %" PRIu64 "\n", country, year, population);
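Conceptually, a composite index is a sorted mapping from (country, year) to the primary key. The sketch below imitates that with a sorted array and bsearch. It is only a mental model under stated assumptions, not how WiredTiger stores indices: the struct, the helper names and the record numbers are all made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One index entry: the index key (country, year) and the primary key it points to. */
struct index_entry {
    char country[5]; /* fixed 5 bytes, zero-padded, like the "5s" format */
    uint16_t year;
    uint64_t recno;  /* made-up record numbers, starting at 1 */
};

/* Entries are kept sorted by (country, year). */
static const struct index_entry INDEX[] = {
  {"AU", 1900, 1}, {"AU", 1950, 2}, {"CAN", 1900, 3},
  {"UK", 1900, 4}, {"USA", 1900, 5}, {"USA", 1950, 6}
};

/* Compare by country bytes first, then by year. */
static int cmp_entry(const void *a, const void *b)
{
    const struct index_entry *x = a, *y = b;
    int c = memcmp(x->country, y->country, sizeof(x->country));

    if (c != 0)
        return c;
    return (x->year > y->year) - (x->year < y->year);
}

/* Return the record number for (country, year), or 0 if there is no such entry. */
static uint64_t index_lookup(const char *country, uint16_t year)
{
    struct index_entry probe;
    const struct index_entry *hit;

    memset(&probe, 0, sizeof(probe));
    strncpy(probe.country, country, sizeof(probe.country)); /* zero-pads short names */
    probe.year = year;
    hit = bsearch(&probe, INDEX, sizeof(INDEX) / sizeof(INDEX[0]), sizeof(INDEX[0]), cmp_entry);
    return hit == NULL ? 0 : hit->recno;
}
```

The zero padding done by strncpy plays the same role as the explicit "USA\0\0" in the set_key call above: the fixed-length field is compared over all 5 bytes.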

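Immutable indices: by default, updating the value of an existing primary key updates each index by removing the old entry and inserting a new one. Creating an index with the immutable configuration skips this index update, which improves performance. If the indexed columns do change in such an update, however, the index becomes inconsistent with the actual data.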

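Index cursor projections: by default, an index cursor returns all of the table's value columns. Appending a parenthesized list of column names to the URI passed to WT_SESSION::open_cursor makes get_value return only those columns (a projection). If every projected column is available in the index (including the primary key columns, which are the values of the index), the data is read from the index alone and no column group needs to be accessed. An index can therefore be made to cover a query by storing extra columns after the columns that actually form its key, which avoids reading the other column groups.
For column-store tables, there is no need to create an index on the record number (the primary key); such an index serves no purpose.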

Code examples: the examples above can be found in ex_schema.c; ex_call_center.c also shows column groups, indices, and the use of indices to emulate accelerated SQL-style lookups.