目前,我们的数据库已经具有插入(insert)和打印(select)的功能了。让我们花点时间来检验一下我们数据库当前功能的正确性。
我准备使用Python来写测试用例,因为我对Python更为熟悉(原文作者使用rspec编写测试用例,感兴趣的同学可以阅读原文)。
我定义了一个简单的函数run_script(),用来把一组命令发送到我们的数据库程序中,并返回命令行的输出结果:
import subprocess
def run_script(commands):
p = subprocess.Popen(".\db.exe", shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
for c in commands:
p.stdin.write(c.encode("GBK"))
out, err = p.communicate()
return out.decode("GBK").split("\n")
基础测试
我们先从最基础的测试开始,插入一行(row)数据然后打印当前数据库所有行(row),完成上述操作后退出程序:
def base_test():
commands = ["insert 1 bob foo@bat.com\n", "select\n", ".exit\n"]
result = run_script(commands)
pattens = ['db > Executed.', 'db > (1, bob, foo@bat.com)', 'Executed.', 'db > ']
for i in range(len(result)):
if (not pattens[i] == result[i]):
print(f"base_test failed, mismatch: {pattens[i]}, {result[i]}")
return
print("base_test passed!")
运行后可以看到测试通过,说明我们在将数据写入数据库后可以成功获取写入的值。
base_test passed!
数据库写满测试
在我们使用测试脚本后,大规模地插入数据已经变成一件简单的事:
def table_full_test():
commands = []
for i in range(1301):
c = f"insert {i} user{i} person{i}@example.com\n"
commands.append(c)
commands.append(".exit\n")
result = run_script(commands)
if(result[-2] == "db > Error: Table full."):
print("table_full_test passed!")
else:
print("table_full_test failed!")
运行测试用例:
table_full_test passed!
测试同样可以通过,说明我们的数据库现在最多能保存1300行数据,是因为我们把页最大数量设为100,同时每个页中最多只能容纳13行数据。
字符串最大长度测试
通读当前代码后,我意识到我们的数据库可能无法正确处理最大长度的字符串,让我们写段代码简单测试一下。
def long_string_test():
long_username = "a"*32
long_email = "a"*255
commands = [f"insert 1 {long_username} {long_email}\n", "select\n", ".exit\n"]
result = run_script(commands)
pattens = ["db > Executed.", f"db > (1, {long_username}, {long_email})", "Executed.", "db > "]
for i in range(len(result)):
if (not result[i] == pattens[i]):
print(f"long_string_test failed, mismatch: {result[i]}, {pattens[i]}")
return
print("long_string_test passed!")
这回我们的测试失败了
long_string_test failed, mismatch: db > (1, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa), db > (1, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
失败的原因也很简单,我们给结构体Row的username和email分别分配了32字节和255字节。但是C语言字符串默认在结尾处会加上空字符’\0’,而我们并没有留出相应的位置。解决的办法是:多分配额外的一个字节用来保存’\0’。
typedef struct {
__uint32_t id;
char username[COLUMN_USERNAME_SIZE+1];
char email[COLUMN_EMAIL_SIZE+1];
} Row;
重新再跑一次后发现测试用例能通过了
long_string_test passed!
字符串过长测试
我们希望对用户插入的username和email的长度有所限制,不让其超过最大值,那么可以通过下面的测试用例来检验。
def overlong_string_test():
long_username = "a"*33
long_email = "a"*256
commands = [f"insert 1 {long_username} {long_email}\n", "select\n", ".exit\n"]
result = run_script(commands)
pattens = ["db > String is too long.", "db > Executed.", "db > "]
for i in range(len(result)):
if (not result[i] == pattens[i]):
print(f"overlong_string_test failed, mismatch: {result[i]}, {pattens[i]}")
return
print("overlong_string_test passed!")
为了支持上述测试用例需要先升级parser,我们目前使用的是sscanf()处理字符串。如果sscanf()读取的字符串超过它所能缓存的大小,会造成缓存溢出并且将数据写到未知的区域。同时数据在拷贝到结构体Row之前,我们想检查每个字符串的长度,因此需要根据空格分隔输入数据,strtok()会是个不错的选择。
PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) {
statement->type = STATEMENT_INSERT;
char* keyword = strtok(input_buffer->buffer, " ");
char* id_string = strtok(NULL, " ");
char* username = strtok(NULL, " ");
char* email = strtok(NULL, " ");
if (id_string == NULL || username == NULL || email == NULL) {
return PREPARE_SYNTAX_ERROR;
}
int id = atoi(id_string);
if (id < 0) {
return PREPARE_NEGATIVE_ID;
}
if (strlen(username) > COLUMN_USERNAME_SIZE) {
return PREPARE_STRING_TOO_LONG;
}
if (strlen(email) > COLUMN_EMAIL_SIZE) {
return PREPARE_STRING_TOO_LONG;
}
statement->row_to_insert.id = id;
strcpy(statement->row_to_insert.username, username);
strcpy(statement->row_to_insert.email, email);
return PREPARE_SUCCESS;
}
PrepareResult prepare_statement(InputBuffer* input_buffer, Statement* statement) {
if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
return prepare_insert(input_buffer, statement);
}
if (strcmp(input_buffer->buffer, "select") == 0) {
statement->type = STATEMENT_SELECT;
return PREPARE_SUCCESS;
}
return PREPARE_UNRECOGNIZED_STATEMENT;
}
我们在输入缓冲区上连续调用strtok(),每当它到达分隔符(在我们的示例中是空格)时,插入一个空字符,将其分解为子字符串,并且返回一个指向子字符串开头的指针。
调用strlen()可以检查每段字符串的长度,从而判断其是否超过最大值限制。
我们可以像处理其他错误代码一样处理字符串过长的错误:
typedef enum {
PREPARE_SUCCESS,
PREPARE_STRING_TOO_LONG,
PREPARE_SYNTAX_ERROR,
PREPARE_UNRECOGNIZED_STATEMENT
} PrepareResult;
...
switch (prepare_statement(input_buffer, &statement)) {
case (PREPARE_SUCCESS):
break;
case (PREPARE_STRING_TOO_LONG):
printf("String is too long.\n");
continue;
case (PREPARE_SYNTAX_ERROR):
printf("Syntax error. Could not parse statement.\n");
continue;
case (PREPARE_UNRECOGNIZED_STATEMENT):
printf("Unrecognized keyword at start of '%s'.\n", input_buffer->buffer);
continue;
}
...
修改后再运行测试用例就能通过了
overlong_string_test passed!
负数ID测试
最后让我们添加检测ID值为负数的测试用例:
def negtive_id_test():
commands = ["insert -1 bob foo@bar.com\n", "select\n", ".exit\n"]
result = run_script(commands)
pattens = ["db > ID must be positive.", "db > Executed.", "db > "]
for i in range(len(result)):
if (not result[i] == pattens[i]):
print(f"negtive_id_test failed, mismatch: {result[i]}, {pattens[i]}")
return
print("negtive_id_test passed!")
我们需要同步修改一下我们的数据库代码:
typedef enum {
PREPARE_SUCCESS,
PREPARE_NEGATIVE_ID,
PREPARE_STRING_TOO_LONG,
PREPARE_SYNTAX_ERROR,
PREPARE_UNRECOGNIZED_STATEMENT
} PrepareResult;
...
int id = atoi(id_string);
if (id < 0) {
return PREPARE_NEGATIVE_ID;
}
if (strlen(username) > COLUMN_USERNAME_SIZE) {
return PREPARE_STRING_TOO_LONG;
}
if (strlen(email) > COLUMN_EMAIL_SIZE) {
return PREPARE_STRING_TOO_LONG;
}
...
switch (prepare_statement(input_buffer, &statement)) {
case (PREPARE_SUCCESS):
break;
case (PREPARE_NEGATIVE_ID):
printf("ID must be positive.\n");
continue;
case (PREPARE_STRING_TOO_LONG):
printf("String is too long.\n");
continue;
...
运行后测试通过:
negtive_id_test passed!
好了,测试用例的编写就到此为止了。接下来的章节将要介绍数据库非常重要的特性:持久性!我们会把数据库保存到文件中,然后再把它读出来,敬请期待。
原文链接:Let’s Build a Simple Database: Part 4 - Our First Tests (and Bugs)