ndb存储引擎的索引
ndb存储引擎支持两种索引(见CREATE INDEX Syntax): hash和btree
Storage Engine
Permissible Index Types
InnoDB
BTREE
MyISAM
BTREE
MEMORY/HEAP
BTREE
NDB
HASH, BTREE
hash
mysql实现的hash是一种key-value,一一对应,不允许存在相同的hash值,所以只能在primary key和unique字段上指定使用hash索引,且需要显示指定using hash, 如:
create table test
(
id bigint,
seq int,
description varchar(200),
primary key (id, seq) using hash
) engine = ndb
partition by key(id);
查看创建的索引show index from test;
btree
ndb存储引擎使用T-tree数据结构来实现btree索引的(见CREATE INDEX Syntax):
BTREE indexes are implemented by the NDB storage engine as T-tree indexes.
如果primary key和unique不显示使用hash, 则默认使用btree:
For indexes on NDB table columns, the USING option can be specified only for a unique index or primary key. USING HASH prevents the creation of an ordered index; otherwise, creating a unique index or primary key on an NDB table automatically results in the creation of both an ordered index and a hash index, each of which indexes the same set of columns.
create index创建的索引都是btree, ndb存储引擎不支持在创建索引的时候使用using关键字来指定索引类型
例如:
create index idx_test_id on test
(
id
);
查看创建的索引show index from test;
批量操作,一次执行
mysql ndb支持多次操作,一次性执行(见The NdbTransaction Class)
Several operations can be defined on the same NdbTransaction object, in which case they are executed in parallel. When all operations are defined, the execute() method sends them to the NDB kernel for execution.
可以通过多次调用NdbTransaction::getNdbOperation()获取多个NdbOperation,针对NdbOperation的增删改只修改本地的数据,当执行NdbTransaction::execute()时,NdbTransaction下的多个NdbOperation被同时发到服务器端,并行执行。
性能测试
hash
使用上面创建的test表,插入一条测试数据:
insert into test(id, seq, description)
values(1, 0, 'this is a description');
根据主键(即hash索引)查询记录,以下是测试结果:
事物数
操作数
总耗时(s)
150
20
0.14578
300
10
0.20860
600
5
0.35560
3000
1
1.47742
以第一行数据为例,总共150个事物,每个事物有20个操作,这20个操作是一起发送到服务器端的,以下是每个函数所消耗的时间:
Func:[close ] Total:[ 0.04425s] Count:[ 150] 0~0.005s[ 150]
Func:[execute ] Total:[ 0.09601s] Count:[ 150] 0~0.005s[ 150]
Func:[readTuple ] Total:[ 0.00031s] Count:[ 3000] 0~0.005s[ 3000]
Func:[getNdbOperation] Total:[ 0.00171s] Count:[ 3000] 0~0.005s[ 3000]
Func:[all ] Total:[ 0.14578s] Count:[ 1] 0~0.005s[ 0]
Total表示该函数总耗时,Count表示执行的次数,最后一行记录是总耗时。
时间主要消耗在NdbTransaction::execute()和NdbOperation::close()。
根据hash查询,NdbTransaction::execute()函数在提交操作的同时,阻塞等待返回的结果。
以下是测试代码的核心部分:
void test_pk(Ndb* pNdb, int argc, char* argv[])
{
int cnt_transaction = 1;
int cnt_query = 1;
if (argc >= 2)
{
cnt_transaction = atoi(argv[1]);
}
if (argc >= 3)
{
cnt_query = atoi(argv[2]);
}
const char* tableName = "test";
const NdbDictionary::Dictionary* pDict= pNdb->getDictionary();
const NdbDictionary::Table *pTable= pDict->getTable(tableName);
if (pTable == NULL)
APIERROR(pDict->getNdbError());
Int64 id = 1;
Int32 seq = 0;
NdbOperation** operations = new NdbOperation*[cnt_query];
extern char g_progName[64];
strncpy(g_progName, argv[0], sizeof(g_progName)-1);
TPerfStatMgr::Instance().SetSwitch(true);
PS_START(all);
for (int i = 0; i < cnt_transaction; ++i)
{
NdbTransaction *pTransaction = pNdb->startTransaction();
if (pTransaction == NULL)
APIERROR(pNdb->getNdbError());
for (int j = 0; j < cnt_query; ++j)
{
PS_START(getNdbOperation);
NdbOperation *pOperation= pTransaction->getNdbOperation(pTable);
if (pOperation == NULL)
APIERROR(pTransaction->getNdbError());
PS_END(getNdbOperation);
operations[j] = pOperation;
PS_START(readTuple);
if (pOperation->readTuple() == -1)
APIERROR(pNdb->getNdbError());
PS_END(readTuple);
if (pOperation->equal("id", id) == -1)
APIERROR(pOperation->getNdbError());
if (pOperation->equal("seq", seq) == -1)
APIERROR(pOperation->getNdbError());
GetColumns(pTransaction, pOperation);
}
PS_START(execute);
if(pTransaction->execute(NdbTransaction::NoCommit) == -1)
APIERROR(pTransaction->getNdbError());
PS_END(execute);
for (int j = 0; j < cnt_query; ++j)
{
//ignore result
}
PS_START(close);
pTransaction->close();
PS_END(close);
}
PS_END(all);
delete[] operations;
}
void GetColumns(NdbTransaction *pTransaction, NdbOperation *pOperation)
{
GetColumn(pTransaction, pOperation, "id");
GetColumn(pTransaction, pOperation, "seq");
}
void GetColumn(NdbTransaction *pTransaction, NdbOperation *pOperation, const char* column_name)
{
const NdbRecAttr *pRecAttr = pOperation->getValue(column_name, NULL);
if (pRecAttr == NULL)
APIERROR(pTransaction->getNdbError());
}
btree
根据btree索引字段id查询记录,以下是测试结果:
事物数
操作数
总耗时(s)
150
20
0.77634
300
10
0.90511
600
5
0.93458
3000
1
1.54747
分析第一行测试结果(150个事物)的详细记录:
Func:[close ] Total:[ 0.00571s] Count:[ 3000] 0~0.005s[ 3000]
Func:[nextResult ] Total:[ 0.73533s] Count:[ 3000] 0~0.005s[ 3000]
Func:[execute ] Total:[ 0.01163s] Count:[ 150] 0~0.005s[ 150]
Func:[readTuples ] Total:[ 0.00039s] Count:[ 3000] 0~0.005s[ 3000]
Func:[getNdbIndexScanOperation] Total:[ 0.01603s] Count:[ 3000] 0~0.005s[ 3000]
Func:[all ] Total:[ 0.77634s] Count:[ 1] 0~0.005s[ 0]
时间主要消耗在NdbIndexScanOperation::nextResult()函数,NdbTransaction::execute()和NdbIndexScanOperation::close()所占的比重很小,与hash的结果大相径庭,NdbOperation函数没有nextResult函数,执行NdbTransaction::execute()即得到了唯一的结果。
以下是btree的测试代码:
void test_btree(Ndb* pNdb, int argc, char* argv[])
{
int cnt_transaction = 1;
int cnt_query = 1;
if (argc >= 2)
{
cnt_transaction = atoi(argv[1]);
}
if (argc >= 3)
{
cnt_query = atoi(argv[2]);
}
const char* tableName = "test";
const NdbDictionary::Dictionary* pDict= pNdb->getDictionary();
const NdbDictionary::Table *pTable= pDict->getTable(tableName);
if (pTable == NULL)
APIERROR(pDict->getNdbError());
const NdbDictionary::Index *pIndex= pDict->getIndex("idx_test_id",pTable->getName());
if (pIndex == NULL)
APIERROR(pDict->getNdbError());
Int64 id = 1;
NdbIndexScanOperation** operations = new NdbIndexScanOperation*[cnt_query];
extern char g_progName[64];
strncpy(g_progName, argv[0], sizeof(g_progName)-1);
TPerfStatMgr::Instance().SetSwitch(true);
PS_START(all);
for (int i = 0; i < cnt_transaction; ++i)
{
NdbTransaction* pTransaction = pNdb->startTransaction();
if (pTransaction == NULL)
APIERROR(pNdb->getNdbError());
for (int j = 0; j < cnt_query; ++j)
{
PS_START(getNdbIndexScanOperation);
NdbIndexScanOperation* pIndexOperation = pTransaction->getNdbIndexScanOperation(pIndex);
if (pIndexOperation == NULL)
APIERROR(pTransaction->getNdbError());
PS_END(getNdbIndexScanOperation);
operations[j] = pIndexOperation;
PS_START(readTuples);
//attention: not readTuple
pIndexOperation->readTuples();
PS_END(readTuples);
pIndexOperation->equal("id", id);
GetColumns(pTransaction, pIndexOperation);
}
PS_START(execute);
if(pTransaction->execute(NdbTransaction::NoCommit, NdbOperation::AbortOnError, 0) == -1)
APIERROR(pTransaction->getNdbError());
PS_END(execute);
for (int j = 0; j < cnt_query; ++j)
{
NdbIndexScanOperation* pIndexOperation = operations[j];
PS_START(nextResult);
while (pIndexOperation->nextResult(true) == 0)
{
//ignore result
}
PS_END(nextResult);
PS_START(close);
pIndexOperation->close();
PS_END(close);
}
pTransaction->close();
}
PS_END(all);
delete[] operations;
}
NdbScanOperation::nextResult()函数的第一个参数表示是否到服务器获取记录。如果置为true,将获取一批数据,当本地数据遍历完,将从服务器端再次获取。如果第一次调用就置为false将获取不到记录(已测试过)。
最后一次nextResult(true)调用,会进行一次网络交互吗?本例中,nextResult只会调用两次,现在对这两次分别统计耗时,代码修改:
PS_START(nextResult1);
pIndexOperation->nextResult(true);
PS_END(nextResult1);
PS_START(nextResult2);
pIndexOperation->nextResult(true);
PS_END(nextResult2);
PS_START(close);
pIndexOperation->close();
PS_END(close);
测试结果:
Func:[close ] Total:[ 0.00476s] Count:[ 3000] 0~0.005s[ 3000]
Func:[nextResult2 ] Total:[ 0.62985s] Count:[ 3000] 0~0.005s[ 3000]
Func:[nextResult1 ] Total:[ 0.09145s] Count:[ 3000] 0~0.005s[ 3000]
Func:[execute ] Total:[ 0.01087s] Count:[ 150] 0~0.005s[ 150]
Func:[readTuples ] Total:[ 0.00031s] Count:[ 3000] 0~0.005s[ 3000]
Func:[getNdbIndexScanOperation] Total:[ 0.01447s] Count:[ 3000] 0~0.005s[ 3000]
Func:[all ] Total:[ 0.75860s] Count:[ 1] 0~0.005s[ 0]
时间主要消耗在第二次nextResult(true),本例中只有一条记录,第一次nextResult(true)已经取到记录(测试过,即使有62条记录,一次nextResult调用也可以全部取到),那么第二次的nextResult参数传false是不是可以减少一次网络交互而减少耗时呢?修改代码:
PS_START(nextResult1);
pIndexOperation->nextResult(true);
PS_END(nextResult1);
PS_START(nextResult2);
pIndexOperation->nextResult(false);
PS_END(nextResult2);
PS_START(close);
pIndexOperation->close();
PS_END(close);
测试结果:
Func:[close ] Total:[ 0.63609s] Count:[ 3000] 0~0.005s[ 3000]
Func:[nextResult2 ] Total:[ 0.00053s] Count:[ 3000] 0~0.005s[ 3000]
Func:[nextResult1 ] Total:[ 0.09279s] Count:[ 3000] 0~0.005s[ 3000]
Func:[execute ] Total:[ 0.01139s] Count:[ 150] 0~0.005s[ 150]
Func:[readTuples ] Total:[ 0.00032s] Count:[ 3000] 0~0.005s[ 3000]
Func:[getNdbIndexScanOperation] Total:[ 0.01554s] Count:[ 3000] 0~0.005s[ 3000]
Func:[all ] Total:[ 0.76394s] Count:[ 1] 0~0.005s[ 0]
根据结果,nextResult(false)几乎不耗时间了,现在转移到NdbIndexScanOperation::close(),总的性能没有改变
猜测(待验证):最后一次调用的是nextResult(true),进行了一次网络交互,释放服务器资源,再次调用close()不需要网络交互;最后一次调用的是nextResult(false),释放服务器资源的动作由close来做。
后续又做了其他的尝试,对性能的提升,没有实质的影响:
1. 因为只涉及到查询操作,所以可以只使用一个事物(这种情况下,NdbTransaction::execute()的execType参数不能是Commit)
2. 只使用一个NdbIndexScanOperation,不执行NdbIndexScanOperation::close(这种情况下,NdbIndexScanOperation::nextResult的fetchAllowed参数必须是true)
总结
mysql ndb接口的查询,支持一次执行多次操作(一次网络交互)
按hash索引查询,支持多次操作的结果,一次性返回(一次网络交互)
按btree索引查询,每个操作自己发消息取数据,释放资源需要进行另外一次网络交互(此结论是根据本文测试的结果,推理出来的,不是从源码或者官方文档中获得的)
Reference
CREATE INDEX Syntax Comparison of B-Tree and Hash Indexes The NdbTransaction Class NdbScanOperation::nextResult()