Database basics: writing a SQL database from scratch in Go - part 1

In this series we'll write a rudimentary database from scratch in Go. Project source code is available on Github.

In this first post we'll build enough of a parser to run some simple CREATE, INSERT, and SELECT queries. Then we'll build an in-memory backend supporting TEXT and INT types and write a basic REPL.

We'll be able to support the following interaction:

$ go run *.go
Welcome to gosql.
# CREATE TABLE users (id INT, name TEXT);
ok
# INSERT INTO users VALUES (1, 'Phil');
ok
# SELECT id, name FROM users;
| id | name |
====================
| 1 |  Phil |
ok
# INSERT INTO users VALUES (2, 'Kate');
ok
# SELECT name, id FROM users;
| name | id |
====================
| Phil |  1 |
| Kate |  2 |
ok

The first stage will be to map a SQL source into a list of tokens (lexing). Then we'll call parse functions to find individual SQL statements (such as SELECT). These parse functions will in turn call their own helper functions to find patterns of recursively parseable chunks, keywords, symbols (like parenthesis), identifiers (like a table name), and numeric or string literals.

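To make the helper-function idea concrete, here is a toy sketch over plain string tokens. The names (expectKeyword, parseSelect) and the single-column grammar are invented for illustration; the real parser is richer than this.

```go
package main

import "fmt"

// parser walks a token list with a cursor, the way the parse helpers
// described above will. Everything below is an illustrative toy, not
// the project's actual API.
type parser struct {
	tokens []string
	cursor int
}

// next consumes and returns the next token, if any.
func (p *parser) next() (string, bool) {
	if p.cursor >= len(p.tokens) {
		return "", false
	}
	t := p.tokens[p.cursor]
	p.cursor++
	return t, true
}

// expectKeyword consumes the next token only if it equals kw.
func (p *parser) expectKeyword(kw string) bool {
	if p.cursor < len(p.tokens) && p.tokens[p.cursor] == kw {
		p.cursor++
		return true
	}
	return false
}

// parseSelect recognizes the toy shape: SELECT <column> FROM <table> ;
func (p *parser) parseSelect() (column, table string, ok bool) {
	if !p.expectKeyword("SELECT") {
		return "", "", false
	}
	if column, ok = p.next(); !ok {
		return "", "", false
	}
	if !p.expectKeyword("FROM") {
		return "", "", false
	}
	if table, ok = p.next(); !ok {
		return "", "", false
	}
	return column, table, p.expectKeyword(";")
}

func main() {
	p := &parser{tokens: []string{"SELECT", "name", "FROM", "users", ";"}}
	col, tbl, ok := p.parseSelect()
	fmt.Println(col, tbl, ok) // name users true
}
```

Each statement type gets its own parse function built from the same small helpers, which is what keeps the top-level parser readable.
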
Then, we'll write an in-memory backend to do operations based on an AST. Finally, we'll write a REPL to accept SQL from a CLI and pass it to the in-memory backend.

This post assumes a basic understanding of parsing concepts. We won't skip any code, but also won't go into great detail on why we structure things the way we do.

For a simpler introduction to parsing and parsing concepts, see this post on parsing JSON.

lexing

The lexer is responsible for finding every distinct group of characters in source code: tokens. This will consist primarily of identifiers, numbers, strings, and symbols.

The gist of the logic will be to iterate over the source string and collect characters until we find a delimiting character such as a space or comma. In this first pass, we'll pretend users don't insert delimiting characters into strings. Once we've reached a delimiting character, we'll "finalize" the token and decide whether it is valid or not.

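As a rough sketch of that loop (not the project's actual lexer), the following collects characters until it hits whitespace or a symbol, then finalizes the current token. True to the first-pass assumption, it ignores quoting entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveLex splits source on delimiters, emitting each symbol as its
// own token and discarding whitespace. It does not classify tokens
// or handle quoted strings; it only shows the collect-then-finalize
// shape described above.
func naiveLex(source string) []string {
	var tokens []string
	var current strings.Builder
	finalize := func() {
		if current.Len() > 0 {
			tokens = append(tokens, current.String())
			current.Reset()
		}
	}
	for _, c := range source {
		switch c {
		case ' ', '\t', '\n':
			finalize()
		case ',', '(', ')', ';':
			finalize()
			tokens = append(tokens, string(c))
		default:
			current.WriteRune(c)
		}
	}
	finalize()
	return tokens
}

func main() {
	fmt.Println(naiveLex("SELECT id, name FROM users;"))
	// [SELECT id , name FROM users ;]
}
```
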