使用tSQLt的测试驱动数据仓库开发(TDWD)的概念

This is a conceptual article consisting of two parts with enough supported material for any data professional or enthusiast with databases or data warehouse development background willing to go a step ahead by using an industry-recognized SQL unit testing framework called tSQLt.

这是一篇概念性文章,由两部分组成,并为拥有数据库或数据仓库开发背景的任何数据专业人员或数据爱好者提供了足够的支持材料,他们愿意使用业界公认SQL单元测试框架tSQLt向前迈进。

This article also highlights the importance of a not so well known innovative development strategy called Proof of Concept used by specialists (including programmers) to test a new idea or concept by developing the model rather than the actual solution.

这篇文章还强调称为概念验证一个不那么知名的创新发展战略的重要性,使用专家(包括程序员)的开发模式,而不是实际的解决方案,以测试新的想法或概念。

Additionally, the readers of this article are going to get familiar with the method of mapping existing database development practices to a basic data warehouse business intelligence solution using tSQLt.

此外,本文的读者将熟悉使用tSQLt将现有数据库开发实践映射到基本数据仓库商业智能解决方案的方法。

概念(概念证明) (The Concept (Proof of Concept))

Let us begin our journey with the concept or the proof of concept.

让我们从概念或概念证明开始我们的旅程。

什么是概念证明 (What is proof of concept )

A proof of concept is like a pilot project, which is all about developing (testing) an idea or concept to check its feasibility or potential to use it as a base to begin actual work if all goes well.

概念证明就像一个试点项目,其目的在于开发(测试)一个想法或概念,以检查其可行性或潜力,如果一切顺利的话,可以将其用作开始实际工作的基础。

例子1 (Example 1)

One of the most interesting examples of proof of concept is a concept car that is specifically designed to test new technology or design, which may or may not become part of commercial manufacturing to be available for the general public.

概念证明中最有趣的例子之一是专门设计用来测试新技术或新设计的概念车,新技术或新设计可能会或可能不会成为可供公众使用的商业制造的一部分。

例子2 (Example 2)

In the software world, proof of concept is whether the new idea of developing your software is workable or not.

在软件世界中,概念证明是开发软件的新想法是否可行。

例子3 (Example 3)

In the database or data warehouse world, we can say the proof of concept is to check if the newly proposed way of developing a database or data warehouse is feasible or not.

在数据库或数据仓库世界中,我们可以说概念证明是检查新提议的开发数据库或数据仓库的方法是否可行。

例子4 (Example 4 )

The proof of concept can also be whether the new database development strategy works or test the new way of using existing database development and testing tools (such as tSQLt) to see if this is more productive for the development and testing team and business or not.

概念证明还可以是新的数据库开发策略是否有效,还是测试使用现有数据库开发和测试工具(例如tSQLt)的新方式,以查看这对于开发和测试团队及业务是否更有生产力。

概念验证策略 (Proof of Concept strategies)

There are a number of ways called strategies that can be used as a basis to develop or test the proof of concept. Let us have a look at a few of them.

有许多种称为策略的方法可以用作开发或测试概念证明的基础。 让我们看看其中的一些。

逆向工程策略 (Reverse Engineering Strategy)

This strategy is based on reverse engineering of an existing process to improve it further.

该策略基于现有流程的逆向工程以进一步改进它。

映射策略 (Mapping Strategy)

This is the strategy used in most common scenarios where the developers map the existing process to a newly proposed process in proof of concept.

这是在大多数常见情况下使用的策略,在这种情况下,开发人员将现有流程映射到新提出的流程以证明概念。

映射策略示例1
(Mapping Strategy Example 1
)

One of the examples of mapping strategies, in general, is, think of some database development scenarios when we map requirements to the stored procedures such that a business requirement received by a development team is generally met by writing a stored procedure to be tested by SQL unit testing framework tSQLt which then satisfies the business requirement when it runs successfully.

通常,映射策略的示例之一是,当我们将需求映射到存储过程时,会想到一些数据库开发方案,从而通常通过编写要由SQL测试的存储过程来满足开发团队收到的业务需求。单元测试框架tSQLt在成功运行时可以满足业务需求。

映射策略示例2 (Mapping Strategy Example 2)

Another good example of mapping strategy is in the context of Data Warehouse business intelligence solutions when business requirements are mapped to data models, which are then exposed to internal and external business users to be used for analysis and reporting.

映射策略的另一个很好的例子是在数据仓库业务智能解决方案的上下文中,当业务需求被映射到数据模型时,数据模型随后暴露给内部和外部业务用户以用于分析和报告。

兴趣点 (Point of Interest)

An interesting point to note is that we are using a highly productive and commercially in use SQL unit testing framework tSQLt to work on our conceptual model, thereby leaving enough room for implementation, as mentioned in the introductory passage of this article.

需要注意的有趣一点是,正如本文介绍性文章中所提到的,我们正在使用一种高效且商业化使用SQL单元测试框架tSQLt来处理我们的概念模型,从而为实现留出了足够的空间。

测试驱动的数据仓库开发(TDWD) (Test-Driven Data Warehouse Development (TDWD))

Let us now focus on the test-driven data warehouse development with tSQLt proof of concept by first designing a toolkit also serving as pre-requisites.

现在,让我们通过首先设计一个还作为先决条件的工具箱,来专注于使用tSQLt概念证明进行测试驱动的数据仓库开发。

测试驱动的数据仓库开发工具套件 (Test-driven data warehouse development Tool Kit)

The following are required to work on test-driven data warehouse development using tSQLt proof of concept:

使用tSQLt概念验证来进行测试驱动的数据仓库开发需要满足以下条件:

  1. SQL Database and Data Warehouse concepts and understanding

    SQL数据库和数据仓库的概念和理解

  2. Strong T-SQL skills

    强大的T-SQL技能
  3. SQL unit testing using tSQLt skills

    使用tSQLt技能进行SQL单元测试

  4. Test-driven database development concepts and implementation

    测试驱动的数据库开发概念和实现

使命宣言 (Mission Statement)

Evaluating Test-driven data warehouse development with tSQLt proof of concept is based on Test-driven database development methodology.

使用tSQLt概念证明评估测试驱动的数据仓库开发是基于测试驱动的数据库开发方法的。

范围 (Scope)

It is very important to define the scope of the work, and in our case, the scope of the proof of concept is limited to the following essential processes of a traditional data warehouse:

定义工作范围非常重要,在我们的案例中,概念证明的范围限于传统数据仓库的以下基本过程:

  1. Staging extracts

    分期提取
  2. Transform-Loads

    变换负载

They can also be represented by ETL workflows where E stands for extract(s), and TL stands for Transform-Load(s).

它们也可以由ETL工作流表示,其中E代表提取,而TL代表变换负载。

If you would like to know more staging extracts, please refer to SQL Unit Testing Data Warehouse Extracts with tSQLt.

如果您想了解更多的阶段提取,请参阅使用tSQLtSQL单元测试数据仓库提取

In simple words, a typical data warehouse environment from the data point of view mainly consists of two phases:

简而言之,从数据角度来看,典型的数据仓库环境主要包括两个阶段:

  1. Staging environment

    暂存环境
  2. BI (business intelligence) environment

    BI(商业智能)环境

In the first phase, data is copied from source to staging environment and then from the staging environment to the BI environment.

在第一阶段,数据从源复制到暂存环境,然后从暂存环境复制到BI环境。

Typical Data Warehouse Business Intelligence Solution Architecture

A polite reminder for the beginners to please at this point not to confuse ETL (Extract Transform Load) with ELT (Extract Load Transform) since they are both data movement strategies, and both can work equally with data warehouse solution.

谨在此提醒初学者,不要将ETL(提取转换负载)与ELT(提取负载转换)混淆,因为它们都是数据移动策略,并且都可以与数据仓库解决方案同等地工作。

映射测试驱动的数据库开发 (Mapping test-driven database development )

We are going to use test-driven database development as a reference to work on test-driven data warehouse development with tSQLt proof of concept.

我们将使用测试驱动的数据库开发作为参考,以tSQLt概念证明进行测试驱动的数据仓库开发。

设置沙箱(环境) (Set up a sandbox (environment))

The first thing in order to start working on the proof of concept is to set up your sandbox environment, which is going to serve as your mini-research lab.

要开始进行概念验证的第一件事是设置您的沙盒环境,它将用作您的小型研究实验室。

设置样本源数据库
(Set up sample source database
)

We need to set up a sample database called SQLDevArticlesV5 which is going to serve as a source by running the following T-SQL script against any dev environment:

我们需要建立一个名为SQLDevArticlesV5的示例数据库,该数据库将通过在任何开发环境中运行以下T-SQL脚本作为源:

-- 1 Create SQLDevArticlesV5 source database
CREATE DATABASE SQLDevArticlesV5;
GO
 
USE SQLDevArticlesV5;
GO
 
-- 2 Create author table
CREATE TABLE [dbo].[Author] (
    [AuthorId]         INT           IDENTITY (1, 1) NOT NULL,
    [Name]             VARCHAR (40)  NOT NULL,
    [RegistrationDate] DATETIME2 (7) NULL
);
 
-- 3 Create article tables
CREATE TABLE [dbo].[Article] (
    [ArticleId]      INT           IDENTITY (1, 1) NOT NULL,
    [Title]          VARCHAR (300) NOT NULL,
    [Published_Date] DATETIME2 (7) NOT NULL
);
 
-- 4 Populate author table
SET IDENTITY_INSERT [dbo].[Author] ON
INSERT INTO [dbo].[Author] ([AuthorId], [Name], [RegistrationDate]) VALUES (1, N'Abid', N'2019-01-01 00:00:00')
INSERT INTO [dbo].[Author] ([AuthorId], [Name], [RegistrationDate]) VALUES (2, N'Zia', N'2019-02-01 00:00:00')
INSERT INTO [dbo].[Author] ([AuthorId], [Name], [RegistrationDate]) VALUES (3, N'Robin', N'2019-03-02 00:00:00')
INSERT INTO [dbo].[Author] ([AuthorId], [Name], [RegistrationDate]) VALUES (4, N'Mehta', N'2019-04-02 00:00:00')
INSERT INTO [dbo].[Author] ([AuthorId], [Name], [RegistrationDate]) VALUES (5, N'Florence', N'2020-01-01 00:00:00')
SET IDENTITY_INSERT [dbo].[Author] OFF
 
-- 5 Populate article table
SET IDENTITY_INSERT [dbo].[Article] ON
INSERT INTO [dbo].[Article] ([ArticleId], [Title], [Published_Date]) VALUES (1, N'Database Analysis with T-SQL', N'2018-01-02 00:00:00')
INSERT INTO [dbo].[Article] ([ArticleId], [Title], [Published_Date]) VALUES (2, N'Designing Data Models for Reporting', N'2018-01-03 00:00:00')
INSERT INTO [dbo].[Article] ([ArticleId], [Title], [Published_Date]) VALUES (3, N'SQL Database Development Guide', N'2018-02-02 00:00:00')
INSERT INTO [dbo].[Article] ([ArticleId], [Title], [Published_Date]) VALUES (4, N'Build your first Azure SQL Database', N'2018-03-02 00:00:00')
INSERT INTO [dbo].[Article] ([ArticleId], [Title], [Published_Date]) VALUES (5, N'Fundamentals of T-SQL', N'2019-01-02 00:00:00')
SET IDENTITY_INSERT [dbo].[Article] OFF;
GO

Please run the following script to check your source sample database:

请运行以下脚本来检查您的源示例数据库:

-- View all authors (table)
SELECT a.AuthorId,a.Name,a.RegistrationDate FROM Author a

Authors tabular data

设置样本数据仓库
(Set up a sample data warehouse
)

Create a sample database warehouse database called SQLArticlesV5DW as follows:

创建一个名为SQLArticlesV5DW的示例数据库仓库数据库,如下所示:

-- 1 Create SQLDevArticlesV5DW database
CREATE DATABASE SQLDevArticlesV5DW;
GO
 
USE SQLDevArticlesV5DW;
GO
 
-- 2 Create Staging schema
CREATE SCHEMA Staging
AUTHORIZATION dbo;
GO
 
 
-- 2 Create staging author table
CREATE TABLE [Staging].[Author] (
    [AuthorId]         INT           IDENTITY (1, 1) NOT NULL,
    [Name]             VARCHAR (40)  NOT NULL,
    [RegistrationDate] DATETIME2 (7) NULL
);
 
-- 3 Create article tables
CREATE TABLE [Staging].[Article] (
    [ArticleId]      INT           IDENTITY (1, 1) NOT NULL,
    [Title]          VARCHAR (300) NOT NULL,
    [Published_Date] DATETIME2 (7) NOT NULL
);

将tSQLt添加到源数据库和数据仓库
(Add tSQLt to the source database and data warehouse
)

The next thing is to download and install the tSQLt (SQL unit testing framework) for both source and data warehouse databases from the tsqlt.org.

接下来的事情就是下载和从源和数据仓库数据库安装tSQLt(SQL单元测试框架) tsqlt.org

Please follow the instructions in the article Conventional SQL Unit Testing with tSQLt in Simple Words to install tSQLt if you have not installed it before.

请按照简单单词中的使用tSQLt进行常规SQL单元测试中的说明进行操作 如果您以前没有安装过tSQLt,请先安装它。

The following output upon running the tSQLt.class.sql script indicates that tSQLt has been installed successfully onto your desired database(s):

运行tSQLt.class.sql脚本后的以下输出表明tSQLt已成功安装到所需的数据库上:

tSQLt installed

The sandbox environment to begin the proof of concept with tSQLt is ready to be used if all the above requirements are met.

如果满足以上所有要求,则可以使用以tSQLt开始概念验证的沙箱环境。

多相要求 (Multiphase Requirements )

The requirements from the business point of view can be as follows:

从业务角度来看,要求可以如下:

As a business user, I would like to view strictly tested authors report preferably from a system optimized for reporting and analysis so that further reports can be added to it

作为业务用户,我希望从经过优化的报告和分析系统中查看经过严格测试的作者报告,以便可以向其添加更多报告。

Your business analyst receives the requirements and modifies them for you as follows:

您的业​​务分析师会收到需求,并为您进行如下修改:

The clients are in need of a data warehouse which must be thoroughly tested to be eventually used for multiple reports and analysis beginning with authors report.

客户需要一个数据仓库,该数据仓库必须经过全面测试,才能最终用于多个报告以及从作者报告开始的分析。

You have had a quick chat with your team and come up with a plan as follows:

您与团队进行了快速聊天,并提出了以下计划:

We must try to adopt a test-driven approach to meet these requirements, which means a proof of concept to see test-driven data warehouse development in operation if approved can be used to meet these specific business requirements considering the available time and effort.

我们必须尝试采用一种测试驱动的方法来满足这些要求,这意味着要有一个概念证明,即在考虑到可用时间和精力的情况下,如果批准的话可以看到运行中的测试驱动的数据仓库开发可以满足这些特定的业务要求。

The Proof of Concept journey

At this point, if you are not already familiar with, then please review the test-driven database development concept by going through the article Fundamentals of Test-Driven Database Development (TDDD) with tSQLt unit testing.

在这一点上,如果您还不熟悉,那么请阅读带有tSQLt单元测试的测试驱动数据库开发基础知识(TDDD)一文,以复习测试驱动数据库开发的概念。

映射潜在对象TDDD (Mapping potential object TDDD)

Just like test-driven database development where we assume a database object is going to meet the requirements and testing that object with a SQL unit testing framework such as tSQLt can help us even create that object properly since it has to pass to prove that it can meet the requirements.

就像测试驱动的数据库开发一样,我们假设数据库对象将满足要求,并使用SQL单元测试框架(例如tSQLt)测试该对象可以帮助我们甚至正确创建该对象,因为必须通过它来证明它可以符合要求。

However, the data warehouse is a bit more complex as compared to a database where we can start with a potential object by creating a SQL unit test to check its functionality, which has to meet the specification, and this effort ultimately helps us to build the requirement specific object without over-engineering.

但是,与数据库相比,数据仓库要稍微复杂一些,在数据库中,我们可以通过创建一个SQL单元测试来检查其功能(必须满足规范)来从潜在对象入手,最终这将帮助我们构建潜在的对象。需求特定的对象,而无需过度设计。

Yes, we can think of an object called AuthorsReport as in the case of test-driven database development, but then how does it fit in a data warehouse scenario and establishing the proof of concept is the most important question to move forward.

是的,在测试驱动的数据库开发中,我们可以想到一个名为AuthorsReport的对象,但是该对象如何适合数据仓库场景并建立概念证明是前进的最重要问题。

选择AuthorsReport对象 (Choosing AuthorsReport Object )

There is no problem with choosing the same object in test-driven data warehouse development as we would choose in test-driven database development; however, the logic required to build this object has to go through a couple of more steps.

在测试驱动的数据仓库开发中选择与在测试驱动的数据库开发中选择的对象相同的对象没有问题; 但是,构建此对象所需的逻辑必须经过几个步骤。

创建SQL单元测试类 (Create SQL unit test Class)

Once tSQLt is installed please create a SQL unit test class called AuthorReportTests in data warehouse database as follows:

一旦安装了tSQLt,请在数据仓库数据库中创建一个称为AuthorReportTestsSQL单元测试类,如下所示:

USE SQLDevArticlesV5DW;
GO
 
-- Creating unit test class AuthorsReportTests
CREATE SCHEMA [AuthorsReportTests]
Authorization dbo
GO
EXECUTE sp_addextendedproperty @name='tSQLt.TestClass',@value=1,@level0type='SCHEMA',@level0name='AuthorsReportTests'

创建第一个SQL单元测试 (Create the first SQL unit test )

Create the first SQL unit test for the data warehouse database SQLDevArticlesV5DW to check if an object exists as follows:

为数据仓库数据库SQLDevArticlesV5DW创建第一个SQL单元测试,以检查对象是否存在,如下所示:

USE SQLDevArticlesV5dw
GO
CREATE PROCEDURE AuthorsReportTests.[test to check AuthorsReport exists]
AS
BEGIN
  --Assemble
 
  --Act
  
  --Assert
   EXEC tSQLt.AssertObjectExists @ObjectName = N'AuthorsReport'
END;
GO

运行SQL单元测试 (Run SQL unit test(s))

Now please run all SQL unit tests by issuing the following tSQLt command:

现在,请通过发出以下tSQLt命令来运行所有SQL单元测试:

USE SQLDevArticlesV5DW
GO
 
-- Run all unit tests
tSQLt.RunAll

As per test-driven database development principles that test must fail since the object is not yet created as follows:

根据测试驱动的数据库开发原则,测试必须失败,因为尚未创建对象,如下所示:

Test to check if object exists fails

创建对象(AuthorsReport) (Create Object (AuthorsReport))

Now we are going to create the desired object in the same data warehouse database SQLDevArticlesV5DW so the basic test to check its presence must pass:

现在,我们将在同一数据仓库数据库SQLDevArticlesV5DW中创建所需的对象,以便检查其存在的基本测试必须通过:

USE SQLDevArticlesV5dw
GO
 
-- Creating database object AuthorsReport stub (placeholder)
CREATE View AuthorsReport
AS
SELECT 1 AS Stub
;

重新运行SQL单元测试 (Rerun SQL unit test(s))

Please run the SQL unit tests for the sample data warehouse:

请为样本数据仓库运行SQL单元测试:

USE SQLDevArticlesV5DW
GO
 
-- Run all unit tests
tSQLt.RunAll

The test output should be as follows:

测试结果应如下:

Test to check if object exists passed

So far, so good, the proof of concept with tSQLt is on the run, but it is a slightly long way before we see it working more robustly.

到目前为止,到目前为止,tSQLt的概念证明还在运行中,但是要使它更可靠地工作还需要很长的路要走。

结论 (Conclusion)

Congratulations, you have just gone through the halfway, and things seem to be fine, but please be prepared to do some serious work in the next part of this article to get the things on track since this trial must show that it is worth the effort. Stay tuned

恭喜,您已经完成了一半,事情似乎还不错,但是请准备好在本文的下一部分做一些认真的工作,以使事情顺利进行,因为此试验必须表明值得努力。 敬请关注

翻译自: https://www.sqlshack.com/the-concept-of-test-driven-data-warehouse-development-tdwd-with-tsqlt/

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值