从需求到数据库架构

最新推荐文章于 2022-01-22 21:45:00 发布

weixin_26752759

最新推荐文章于 2022-01-22 21:45:00 发布

阅读量274

点赞数

文章标签：数据库 python mysql java linux

原文链接：https://medium.com/random-life-journal/from-requirements-to-database-schema-898cb72a9a56

版权

One of the responsibilities of a backend engineer is to keep the data organized and make sure all the operations executed correctly. Most of the time, we will achieve that by DBMS (Database Management System), such as SQLite, MySQL, and PostgreSQL.

后端工程师的职责之一是保持数据井井有条，并确保正确执行所有操作。大多数时候，我们将通过DBMS(数据库管理系统)来实现 ，例如SQLite，MySQL和PostgreSQL。

To keep the data organized, we need a schema or schematic diagram to describe how the DBMS should look like. As the skeleton of the whole data storage system, it will be extremely difficult to change or modify after the services were launched due to the risk of loss and damage to data.

为了使数据井井有条，我们需要一个模式或示意图来描述DBMS的外观。作为整个数据存储系统的骨架，服务启动后，由于丢失和损坏数据的风险，将很难进行更改或修改。

Since the database schema design is such a daunting task, backend engineers will tend to figure out a general process or framework that help us avoid fatal design error and maybe some technical-debt. In the following content, I would try to design the database schema of a simple content platform like Medium.

由于数据库架构设计是一项艰巨的任务，因此后端工程师将倾向于找出通用流程或框架，以帮助我们避免致命的设计错误和某些技术债务。在以下内容中，我将尝试设计一个简单的内容平台(如Medium)的数据库架构。

需求分析 (Requirements Analysis)

What is stored in the database?

数据库中存储了什么？

data.

数据。

Yes, it is surely data, but what do they represent? It looks like a stupid question but I think this is a great entry point to start the database schema design. The reason why we need a database is to store something and make them a kind of resource that can be accessed or operated later.

是的，这肯定是数据，但是它们代表什么呢？它看起来像一个愚蠢的问题，但是我认为这是开始数据库架构设计的一个很好的切入点。我们需要数据库的原因是要存储一些东西并使它们成为一种可以在以后访问或操作的资源。

Now think about the question from another aspect…

现在换个角度思考这个问题……

What resource should we store?

我们应该存储什么资源？

or even retrieve to the layer of the features of your service or product…

甚至检索到您的服务或产品的功能层…

According to the design of our service/product, what kind of resource should we build for our system?

根据服务/产品的设计，我们应该为系统构建什么样的资源？

While cooperating with non-engineer teammates, we often talk about the requirements, not the actual schema. This implies that converting the requirements to resources properly is a key ability, and also the first thing I will do.

在与非工程师团队合作时，我们经常谈论需求，而不是实际的架构。这意味着将需求正确转换为资源是一项关键能力，也是我要做的第一件事。

Almost all the modern services are used-based, which means we need an account to sign in for advanced usage. Although Medium allows guests to read public stories without log in, if you want to clap for the story or follow the author, an account is necessary.

几乎所有现代服务都是基于使用的，这意味着我们需要一个帐户来登录以进行高级使用。尽管Medium允许访客无需登录即可阅读公共故事，但是如果您想为该故事鼓掌或关注作者，则需要一个帐户。

In addition to users, the protagonist of the content platform should be a resource, too. We need storage for all the stories, which makes it possible to load the stories for readers and update stories for authors.

除了用户之外，内容平台的主角也应该是资源。我们需要存储所有故事，这使得可以为读者加载故事并为作者更新故事。

We can also provide tags for each story, and the publications to group up authors in a similar domain. These mechanisms may slightly increase the complexity, but it will significantly improve the efficiency of creating value for both sides of the content platform.

我们还可以为每个故事提供标签，并为类似领域的作者提供出版物。这些机制可能会稍微增加复杂性，但是会显着提高为内容平台双方创造价值的效率。

Now we have sorted out some resources that we need to build for a content platform: users, stories, tags, and publications. Different engineers have different ways to analyze, but I prefer to think from the target object of the operation. It may not the best, but it helps.

现在，我们整理出了我们需要为内容平台构建的一些资源：用户，故事，标签和出版物。不同的工程师有不同的分析方法，但我更喜欢从操作的目标对象出发进行思考。它可能不是最好的，但有帮助。

数据库架构 (Database Schema)

Recall the above content, the database schema is the skeleton of DBMS that defined how the data should be stored. And that is the next issue…

回想一下上面的内容，数据库模式是DBMS的框架，它定义了如何存储数据。那是下一个问题……

How to store the resources we have mentioned above?

如何存储我们上面提到的资源？

I will use Python code and some ORM tools to present the schema because it will take a lot of effort to plot the diagram and modify it step by step, I don’t want to do that.

我将使用Python代码和一些ORM工具来展示该模式，因为绘制该图并一步一步地对其进行修改将花费大量的精力，但我不想这样做。

Here is a beginner-friendly article for the concept of ORM:

这是关于ORM概念的适合初学者的文章：

ER模型 (ER Model)

Entity-Relationship Model shapes a way of thinking that can portrait the database schema more easily, just take all of the resources as entities first:

实体关系模型塑造了一种思维方式，可以更轻松地描述数据库模式，只需首先将所有资源作为实体即可：

Users: hold the profiles of users.
用户：保存用户个人资料。
Stories: hold the content of stories.
故事：掌握故事的内容。
Tags: hold the tags collection for stories.
标签：保存故事的标签集合。
Publications: hold information about publications.
出版物：保存有关出版物的信息。

Next, focus on the relationships of different entities:

接下来，关注不同实体的关系：

Users can follow other users.
用户可以关注其他用户。
Users can follow more than one publication.
用户可以关注多个出版物。
Each story will have one and only one author.
每个故事只有一名作者。
Each story can be marked with more than one tag.
每个故事都可以标记多个标签。
Each story will be included in at most one publication.
每个故事最多将包含在一个出版物中。
Each publication should belong to one owner.
每个出版物应属于一个所有者。
Each publication can group up more than one editor.
每个出版物可以组合多个编辑器。

The function of Medium is far more complicated than what I listed above, I just extract part of it to practice the skill of system design here. With the entities and relationships, schema could be designed more completely by checking each item within the list.

Medium的功能比我上面列出的要复杂得多，我只提取其中的一部分以练习此处的系统设计技能。使用实体和关系，可以通过检查列表中的每个项目来更完整地设计方案。

ORM (ORM)

First, is the user entity that contains the standard profile like name and introduction, and the following mechanism can be achieved by ManyToManyField within Django ORM directly.

首先，是包含诸如名称和简介之类的标准配置文件的用户实体，以下机制可以通过Django ORM中的ManyToManyField直接实现。

But if we only rely on the implicit table create automatically by ManyToManyField, it would be difficult to append a new key to the intermediate table. So I would prefer specifying a table via the “through” parameter, after all, explicit is better than implicit.

但是，如果仅依靠ManyToManyField自动创建的隐式表，则很难向中间表追加新键。因此，我宁愿通过“ through ”参数指定一个表，毕竟显式比隐式好。

The publication entity are similar to the user, they both have name and introduction, but need more attribute to describe the members it grouped up.

发布实体与用户相似，它们都有名称和简介，但是需要更多属性来描述它分组的成员。

There’s only one owner who owns the publication, and multiple editors authorized to publish or remove stories. In fact, Medium has another role called “writer”, who can submit their story to the publication, but I am not concern about that now, two different roles are already enough to represent the advantage of the intermediary table.

该出版物只有一个拥有者，并且授权多个编辑者发布或删除故事。实际上，Medium还具有另一个称为“作家”的角色，可以将其故事提交给出版物，但是我现在并不担心，两个不同的角色已经足以代表中介表的优势。

Even though the owner and editors are all the relationships from the entity of publication to the user, but they are different roles with different authorities. And one of the solutions is to append another key “level” to specify the attribute, it should be one of the roles of owner and editor.

尽管所有者和编辑者都是从发布实体到用户的所有关系，但它们是具有不同权限的不同角色。解决方案之一是附加另一个关键的“级别”以指定属性，它应该是所有者和编辑者的角色之一。

Finally, it’s the entities of the story and the tag. Each story must have an author who is just the user publish the story, but it might not need to be adopted into a publication, so we should allow NULL for this attribute.

最后，它是故事和标签的实体。每个故事都必须有一个作者，而该作者只是用户发布该故事，但是它可能不需要被发布所采用，因此我们应该为此属性允许NULL。

Since the recommended system is not smart enough to feed us the articles we interested in precisely only depend on the title and the content, tags are really a great solution.

由于推荐的系统不够智能，无法向我们提供我们仅对标题和内容感兴趣的文章，因此标签确实是一个很好的解决方案。

We spend a few seconds to mark the story with several keywords and make the direct-matching algorithm become feasible, this is the consensus by cognition of all platform users.

我们花了几秒钟用几个关键字来标记故事，并使直接匹配算法变得可行，这是所有平台用户的共识。

结论 (Conclusion)

The above is the brief steps that how I convert the requirements to a database schema, and could be organized into general sections below:

上面是将需求转换为数据库架构的简短步骤，可以将其组织为以下常规部分：

- Requirements Analysis
    - Resources for service or products
    - Resources to access and operate- Database Schema
    - Entities-Relationships Model
    - Django ORM or other implemtaion and management tool you prefer
    - Test the behaviours of your database

hope it helps you :)

希望它对您有帮助:)

翻译自: https://medium.com/random-life-journal/from-requirements-to-database-schema-898cb72a9a56

weixin_26752759

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
从需求到数据库架构

One of the responsibilities of a backend engineer is to keep the data organized and make sure all the operations executed correctly. Most of the time, we will achieve that by DBMS (Database Management...
复制链接

扫一扫