Automating data migration testing

Data migrations are notoriously difficult to test. They take a long time to run on large datasets. They often involve heavy, inflexible database engines. And they’re only meant to run once, so people treat them as throw-away code that isn’t worth writing tests for. Not so.

With a few modern tools, we can test migrations just like we do our production code. Let me show you one approach.

Bird’s-eye view

We’ll be moving data from a PostgreSQL database to MongoDB using Node.js. Obviously, you can substitute any of these. Anything that runs in Docker, like PostgreSQL, or in-memory, like MongoDB, will fit the approach. Also, any programming language is fine, if you’re comfortable with it. I chose Node.js for the plasticity it offers, which is useful to have in data transformations.

We’ll split the migration into several functions, organized around data scopes (like companies, users, whatever you have in the database), each function receiving an instance of the PostgreSQL and MongoDB client, i.e. migrateUsers(postgresClient, mongoClient). This way, our test code can provide connections to the test databases, and production code to production ones.

Each migration function will have its accompanying test suite, i.e. migrate-users.js will have migrate-users.spec.js.

            /-> migrateCompanies()  <-- migrate-companies.spec.js
index.js ---+-> migrateUsers()      <-- migrate-users.spec.js
            \-> migrateFizzBuzzes() <-- migrate-fizz-buzzes.spec.js

The test suites will also share a file we’ll call test-init.js containing code that’ll launch both test databases and connect clients, then shut them down after tests finish.

The article comes with an accompanying GitHub repository containing all the code we’re discussing here. I’ll only be quoting excerpts to illustrate the key concepts; you may prefer to follow along by studying the full source code.

Install the tools

Install Docker Engine with Docker Compose for your platform. Make sure you can run the docker command without requiring user rights elevation (i.e. without sudo). On Linux, this is typically achieved by adding your current user to the docker group:

sudo usermod -aG docker $USER

after which you’ll need to log out and back in again.

You’ll also need Node.js.

Start a new npm project by going through the npm init steps, then install the dependencies we’ll be using:

npm install -E mongodb pg
npm install -ED jest mongodb-memory-server shelljs

We’ll use jest to run the tests, but again, go with whatever suits your needs.

Configure the tools

We’ll be running PostgreSQL in Docker, so let’s first configure that. Create a file called docker-compose.yml and paste in the following:

version: '3'
services:
  postgresql:
    image: postgres:12                #1
    container_name: postgresql-migration-test
    environment:
      POSTGRES_PASSWORD: 'password'   #2
    ports:
      - "5432"                        #3

This will configure a Docker container with the latest PostgreSQL 12.x image (1), with the default postgres user and the password set to password (2). (Never use that outside of testing, obviously.) It also says that PostgreSQL’s standard port 5432 will be exposed to a random port on the host (3). Thanks to that, you can safely launch the container on a host machine where PostgreSQL is already running, avoiding test failures due to port conflicts. We’ll have a way to find the right port to connect to in tests.

Now, if you run docker-compose up in the same directory, the container with the database should start and display PostgreSQL’s standard output. If it does, great: shut it down with Ctrl+C, followed by docker-compose down. We’ll configure the test suite to do the starting and stopping later.

Move on to your package.json and add the tasks we’ll be using into the scripts section:

"scripts": {
"start": "node src/index.js",
"test": "jest --runInBand" //1},

With that, npm start will run your migration and npm test will run the tests.

Notice the --runInBand parameter for jest (1). That tells jest to run every test suite in sequence, whereas by default these run in parallel. This is important, because all test suites share the same database instances and running them simultaneously would cause data conflicts. Don’t worry about speed, though — we’re not dealing with a lot of data here, so tests will still be fast.

jest itself also needs a bit of configuration, so add a file called jest.config.js with the following content:

module.exports = {
    testEnvironment: "jest-environment-node",
    moduleFileExtensions: ["js"],
    testRegex: ".spec.js$",
}

This simply tells jest to treat any file ending with .spec.js as a test suite, and to use its node environment (as opposed to the default jsdom environment, which is useful for code meant to run in the browser).

Ready. Let’s start testing.

Set up the test environment

We’ll start with creating the src/test-init.js file, which will deal with database startup and teardown, and will be included by every test suite. It’s a crude method, I know, and jest claims to have better ways, but I’ll exclude them from this text for clarity.

const shell = require("shelljs");
const {MongoMemoryServer} = require("mongodb-memory-server");
const {getPostgresClient} = require("./postgres-client-builder");
const {afterAll, afterEach, beforeAll} = require("@jest/globals");
const {getMongoClient} = require("./mongo-client-builder");

let clients = {};   //1
let mongod;
const dockerComposeFile = __dirname + '/../docker-compose.yml';

beforeAll(async () => {
    console.log('Starting in memory Mongo server');
    mongod = new MongoMemoryServer({autoStart: false}); //2
    await mongod.start();
    const mongoUri = await mongod.getUri();
    clients.mongo = await getMongoClient(mongoUri); //3
    clients.mongoDb = clients.mongo.db();

    console.log('Starting Docker container with PostgreSQL');
    shell.exec(`docker-compose -f ${dockerComposeFile} up -d`); //4
    await new Promise(resolve => setTimeout(resolve, 1000)); //5
    const postgresHost = shell.exec(`docker-compose -f ${dockerComposeFile} port postgresql 5432`).stdout; //6
    const postgresUri = `postgresql://postgres:password@${postgresHost}`;
    clients.postgres = await getPostgresClient(postgresUri); //7
});

afterAll(async () => {
    await clients.mongo.close(); //8
    await clients.postgres.end();
    await mongod.stop();
    console.log('Mongo server stopped');
    shell.exec(`docker-compose -f ${dockerComposeFile} down`); //9
    console.log('PostgreSQL Docker container stopped');
});

module.exports = {
    clients,
};

Let’s go through this step by step. First, we’re declaring and exporting the variable clients (1) which will allow us to pass database clients to the test suites.

beforeAll is all about starting the databases. The first part starts MongoDB, following the standard instructions from the mongodb-memory-server package (2). There’s a call here to a custom function getMongoClient() (3), which I left out; it simply builds a MongoClient instance from the mongodb package.

Then it gets interesting. shell allows us to run shell commands, and we’re simply calling docker-compose up (4) to start the PostgreSQL container we configured earlier. That’s followed by a one-second wait (5) to let the database start accepting connections on the port. Next, we need to see which port on the host machine was assigned to the container. That’s what docker-compose port does (6); it also outputs the local IP address, i.e. 0.0.0.0:12345, so we can store that directly in postgresUri and use another custom function, getPostgresClient() (7), to build the PostgreSQL client instance.
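
Both builder modules are simple; here is a minimal sketch of what they might look like (the versions in the accompanying repository may differ):

// src/mongo-client-builder.js (sketch)
const {MongoClient} = require("mongodb");

async function getMongoClient(uri) {
    // Connect a MongoClient to the given URI and return it once it's ready.
    const client = new MongoClient(uri);
    await client.connect();
    return client;
}

module.exports = {getMongoClient};

// src/postgres-client-builder.js (sketch)
const {Client} = require("pg");

async function getPostgresClient(uri) {
    // Build a pg Client from the connection string and open the connection.
    const client = new Client({connectionString: uri});
    await client.connect();
    return client;
}

module.exports = {getPostgresClient};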

You’ll notice I use two extra options for docker-compose: -f points to the docker-compose.yml file which is in the parent directory of src/test-init.js and -d runs the containers as daemons, so they’re not holding up the tests.

afterAll reverses the process. First, we disconnect the clients from the databases (8). Then we stop MongoDB and finally shut down the PostgreSQL Docker container with docker-compose down (9).

The src/test-init.js file needs one more function that will clean both databases after each test.

afterEach(async () => {
    const tables = await clients.postgres.query(
        `SELECT table_name FROM information_schema.tables WHERE table_schema ='public';`
    );
    await Promise.all(tables.rows.map(row => clients.postgres.query(`DROP TABLE ${row.table_name}`)));
    const collections = await clients.mongoDb.collections();
    await Promise.all(collections.map(collection => clients.mongoDb.dropCollection(collection.collectionName)));
});

We’re simply dropping all tables from PostgreSQL and collections from MongoDB, leaving a clean slate for the next test. You can truncate the tables instead, so you won’t need to recreate them before each test. Either way is fine.
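
If you go the truncate route, the cleanup could look roughly like this instead (a sketch; it assumes the tables already exist, since TRUNCATE won't create them):

afterEach(async () => {
    // Empty every table in the public schema instead of dropping it,
    // so a schema created once up front survives between tests.
    const tables = await clients.postgres.query(
        `SELECT table_name FROM information_schema.tables WHERE table_schema = 'public';`
    );
    await Promise.all(tables.rows.map(row =>
        clients.postgres.query(`TRUNCATE TABLE ${row.table_name} CASCADE`)));
});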

The code in src/test-init.js is the magic sauce of this solution. It ensures that there are two real databases running for your migration code to operate on, makes sure they’re clean for each test case, and shuts them down after all the tests finish running.

Create the test suite

We’re ready to start writing test code. Our actual migration code will be split into functions for each data type. We could be migrating user accounts, so we’ll have a file called src/migrate-users.js with the following inside:

async function migrateUsers(postgresClient, mongoClient) {
    // Migration logic goes here
}

We’ll be calling this function in our test suite at src/migrate-users.spec.js.
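
The excerpts below leave out the top of the spec file; assuming migrate-users.js exports its function and test-init.js exports clients as shown earlier, it might start like this:

// src/migrate-users.spec.js (top of the file, a sketch)
const {clients} = require("./test-init");          // shared setup that starts and stops the databases
const {migrateUsers} = require("./migrate-users");  // assumes the module exports migrateUsers
const {beforeEach, it, expect} = require("@jest/globals");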

Start with a beforeEach() call to create the necessary tables (because we’re dropping them; if you opted to truncate them, you can skip this step):

beforeEach(async () => {
    await clients.postgres.query(`CREATE TABLE users (id integer, username varchar(20))`);
});

We’re not adding any indexes or constraints, as we’ll be dealing with small amounts of strictly controlled data. That said, your specific use case may warrant it.
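
If it does, mirror just the constraints your migration logic relies on, for example:

beforeEach(async () => {
    // A constrained variant of the same table, if uniqueness or non-null values matter to the migration.
    await clients.postgres.query(`CREATE TABLE users (
        id integer PRIMARY KEY,
        username varchar(20) NOT NULL
    )`);
});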

Now, we’ll write the actual test case:

it('migrates one user', async () => {
    await clients.postgres.query(`
        INSERT INTO users (id, username)
        VALUES (1, 'john_doe')
    `);
    await migrateUsers(clients.postgres, clients.mongo);
    const users = await clients.mongoDb.collection('users').find().toArray();
    expect(users).toHaveLength(1);
    expect(users[0].username).toEqual('john_doe');
});

It looks simple and it is. We’re inserting a user into the source database (PostgreSQL), then calling the migrateUsers() function, passing in the clients connected to our test databases, then fetching all users from the target database (MongoDB) and verifying that exactly one was migrated and that its username value is what we expect.
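
For completeness, a migrateUsers() body that makes this first test pass could be as small as the sketch below (the sourceId field name is just an illustration; a real migration will map and transform more fields):

async function migrateUsers(postgresClient, mongoClient) {
    // Read every user row from the source database.
    const {rows} = await postgresClient.query('SELECT id, username FROM users');
    // Map each row onto the target document shape.
    const users = rows.map(row => ({sourceId: row.id, username: row.username}));
    // insertMany rejects an empty array, so guard against an empty source table.
    if (users.length > 0) {
        await mongoClient.db().collection('users').insertMany(users);
    }
}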

Try running the test, either with npm test or from within your favorite IDE.

You can continue building out your test suite, adding more data, conditions and cases. Some of the items you should test for are:

  • data transformations — if you’re mapping or changing data,
  • missing or incorrect data — make sure your code expects and handles these (see the sketch after this list),
  • data scope — check that you’re only migrating the records you want,
  • joins — if you’re putting together data from several tables, what if there is no match? What if there are multiple matches?
  • timestamps & time zones — when you’re moving these between tables, make sure they still describe the same point in time.
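
As an example of the missing-data bullet, such a case could look like the following. The expected behaviour here (silently skipping users without a username) is purely illustrative; pick whatever rule fits your data and make the migration code implement it:

it('skips users without a username', async () => {
    // A row that is broken from the migration's point of view.
    await clients.postgres.query(`INSERT INTO users (id, username) VALUES (2, NULL)`);
    await migrateUsers(clients.postgres, clients.mongo);
    const users = await clients.mongoDb.collection('users').find().toArray();
    expect(users).toHaveLength(0);
});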

Run the actual migration

You can test-drive and build your whole migration code with the above setup, but eventually you’ll want to run it on actual data. I suggest you try that early on a copy of the database you’ll be migrating, because that’ll uncover conditions you haven’t considered, and will inspire more test cases.

You’ll have several migration functions for various data scopes, so what you’ll need now is the src/index.js file to call them in sequence:

run().then().catch(console.error);

async function run() {
    const postgresClient = await buildPostgresClient();
    const mongoClient = await buildMongoClient();
    console.log('Starting migration');
    await migrateCompanies(postgresClient, mongoClient);
    await migrateUsers(postgresClient, mongoClient);
    await migrateFizzBuzzes(postgresClient, mongoClient);
    console.log(`Migration finished`);
}
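
The buildPostgresClient() and buildMongoClient() helpers just open connections to the real databases. A minimal sketch, assuming the connection strings come from environment variables (the variable names are illustrative):

const {Client} = require("pg");
const {MongoClient} = require("mongodb");

async function buildPostgresClient() {
    // Connect to the real source database.
    const client = new Client({connectionString: process.env.POSTGRES_URI});
    await client.connect();
    return client;
}

async function buildMongoClient() {
    // Connect to the real target database.
    const client = new MongoClient(process.env.MONGO_URI);
    await client.connect();
    return client;
}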

And you can run the actual migration with npm start.

Build out the test suite

Your test suite will quickly grow in scope and complexity, so here are a few more ideas to help you on your way:

  • docker-compose.yml can be extended with more services, if you need them: additional databases, external services, etc. They’ll all launch together via the same docker-compose up command.

  • You’ll be creating the same tables in several test suites, so write a shared function for that, e.g. createUsersTable(), and call it inside your beforeEach() block.

  • You will also be inserting a lot of test data, often focusing on one or more fields. It’s useful to have generic, shared functions for inserting test records, with the ability to override specific fields, like the one below. Then you can call it in your tests with await givenUser({username: "whatever"}).

async function givenUser(overrides = {}) {
    const user = Object.assign({
        id: '1',
        username: 'john_doe',
    }, overrides);
    await clients.postgres.query(`
        INSERT INTO users (id, username)
        VALUES (${user.id}, '${user.username}')
    `);
    return user;
}
  • You may need to make your test record IDs unique. Sometimes that’s easy, like with MongoDB, where calling new ObjectId() generates a unique value, or using an integer sequence; sometimes you can use packages like uuid; at other times, you’ll have to write a simple function that generates an ID in the format you require. Here’s one I used when I needed unique 18-character IDs that were also sortable in the order they were generated:

let lastSuffix = 0;

function uniqueId() {
    return new Date().toISOString().replace(/\D/g, '') + (lastSuffix++ % 10);
}

Check out the sample repository

All of the code quoted above is available in a GitHub repository at @Pragmatists/data-migration-testing. You can pull it, run npm install and, with a bit of luck, it should run out of the box. You can also use it as a base for your own data migration suite. There’s no license attached — it’s public domain.

Translated from: https://blog.pragmatists.com/automating-data-migration-testing-db721c34ed09
