Database basics: writing a SQL database from scratch in Go - part 1

In this series we'll write a rudimentary database from scratch in Go. Project source code is available on Github.

In this first post we'll build enough of a parser to run some simple CREATE, INSERT, and SELECT queries. Then we'll build an in-memory backend supporting TEXT and INT types and write a basic REPL.

We'll be able to support the following interaction:

$ go run *.go
Welcome to gosql.
# CREATE TABLE users (id INT, name TEXT);
ok
# INSERT INTO users VALUES (1, 'Phil');
ok
# SELECT id, name FROM users;
| id | name |
====================
| 1 |  Phil |
ok
# INSERT INTO users VALUES (2, 'Kate');
ok
# SELECT name, id FROM users;
| name | id |
====================
| Phil |  1 |
| Kate |  2 |
ok

The first stage will be to map a SQL source into a list of tokens (lexing). Then we'll call parse functions to find individual SQL statements (such as SELECT). These parse functions will in turn call their own helper functions to find patterns of recursively parseable chunks, keywords, symbols (like parenthesis), identifiers (like a table name), and numeric or string literals.

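To make the helper-function idea concrete, here is a toy sketch over plain string tokens. The names (expectKeyword, parseSelect) and the single-column grammar are invented for illustration; the real parser is richer than this.

```go
package main

import "fmt"

// parser walks a token list with a cursor, the way the parse helpers
// described above will. Everything below is an illustrative toy, not
// the project's actual API.
type parser struct {
	tokens []string
	cursor int
}

// next consumes and returns the next token, if any.
func (p *parser) next() (string, bool) {
	if p.cursor >= len(p.tokens) {
		return "", false
	}
	t := p.tokens[p.cursor]
	p.cursor++
	return t, true
}

// expectKeyword consumes the next token only if it equals kw.
func (p *parser) expectKeyword(kw string) bool {
	if p.cursor < len(p.tokens) && p.tokens[p.cursor] == kw {
		p.cursor++
		return true
	}
	return false
}

// parseSelect recognizes the toy shape: SELECT <column> FROM <table> ;
func (p *parser) parseSelect() (column, table string, ok bool) {
	if !p.expectKeyword("SELECT") {
		return "", "", false
	}
	if column, ok = p.next(); !ok {
		return "", "", false
	}
	if !p.expectKeyword("FROM") {
		return "", "", false
	}
	if table, ok = p.next(); !ok {
		return "", "", false
	}
	return column, table, p.expectKeyword(";")
}

func main() {
	p := &parser{tokens: []string{"SELECT", "name", "FROM", "users", ";"}}
	col, tbl, ok := p.parseSelect()
	fmt.Println(col, tbl, ok) // name users true
}
```

Each statement type gets its own parse function built from the same small helpers, which is what keeps the top-level parser readable.
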
Then, we'll write an in-memory backend to do operations based on an AST. Finally, we'll write a REPL to accept SQL from a CLI and pass it to the in-memory backend.

This post assumes a basic understanding of parsing concepts. We won't skip any code, but also won't go into great detail on why we structure things the way we do.

For a simpler introduction to parsing and parsing concepts, see this post on parsing JSON.

lexing

The lexer is responsible for finding every distinct group of characters in source code: tokens. This will consist primarily of identifiers, numbers, strings, and symbols.

The gist of the logic will be to iterate over the source string and collect characters until we find a delimiting character such as a space or comma. In this first pass, we'll pretend users don't insert delimiting characters into strings. Once we've reached a delimiting character, we'll "finalize" the token and decide whether it is valid or not.

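As a rough sketch of that loop (not the project's actual lexer), the following collects characters until it hits whitespace or a symbol, then finalizes the current token. True to the first-pass assumption, it ignores quoting entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveLex splits source on delimiters, emitting each symbol as its
// own token and discarding whitespace. It does not classify tokens
// or handle quoted strings; it only shows the collect-then-finalize
// shape described above.
func naiveLex(source string) []string {
	var tokens []string
	var current strings.Builder
	finalize := func() {
		if current.Len() > 0 {
			tokens = append(tokens, current.String())
			current.Reset()
		}
	}
	for _, c := range source {
		switch c {
		case ' ', '\t', '\n':
			finalize()
		case ',', '(', ')', ';':
			finalize()
			tokens = append(tokens, string(c))
		default:
			current.WriteRune(c)
		}
	}
	finalize()
	return tokens
}

func main() {
	fmt.Println(naiveLex("SELECT id, name FROM users;"))
	// [SELECT id , name FROM users ;]
}
```
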