SQL Concatenation Done Right, Part 1: Dubious Practices

This article is the first in a three-part series exploring SQL concatenation techniques.

Representing sets of data as strings is a very common requirement in information management, even today, when a variety of more or less elaborate standards for storing and moving data are at our disposal. For instance, XML, JSON, and similar formats allow data to be extracted from one data source, using a well-known standard, and stored temporarily until it is loaded into a destination data store or consumed in some other way. In fact, both XML and JSON may even be used as a standard way of storing data permanently, especially if the consumers expect the data in one format or the other.

Depending on the client application, the data retrieved from a database management system can be transformed in different ways. To some consumers (or destinations) the metadata at the origin is important (for instance, in Microsoft Excel it might be crucial to preserve the information that a particular value is a string representation of a numerical value), whereas to other client applications the metadata as set at the origin is irrelevant (for instance, when an HTML document is rendered, practically all of its contents are treated as strings of characters, regardless of the actual data type or domain used at the source).

What exactly am I talking about? Imagine a database about books and their authors (such as the pubs sample database that came with SQL Server 2000). A book can be written by one or more authors, and each author can write one or more books. If you wanted to create a list of books with their attributes, including the list of authors, as a single table, you could concatenate the full names of each book's authors into a single string and treat it as a single attribute, as shown in Figure 1 below.

 
title_id title           price
-------- --------------- -----
TC7777   Sushi, Anyone?  14,99
 
(1 row(s) affected)
 
title_id au_lname    au_fname
-------- ----------- --------
TC7777   O'Leary     Michael
TC7777   Gringlesby  Burt
TC7777   Yokomoto    Akiko
 
(3 row(s) affected)
 
title_id title           price authors
-------- --------------- ----- ---------------------------------------------------
TC7777   Sushi, Anyone?  14,99 Yokomoto, Akiko; O'Leary, Michael; Gringlesby, Burt
 
(1 row(s) affected)
 

Figure 1: Titles and authors (from the pubs sample database).

Typically, this transformation from metadata-rich data to raw data would be performed by the client application; however, there are cases where the client application is incapable of performing such transformations (for instance, when the client application expects a string representation of date/time values in accordance with a specific standard, different from the one used by the data source), or a client application might not even exist (for instance, when data is exported from a data source, and it is not possible to determine in advance what the metadata requirements of the data destination will be).

Primarily, SQL Server provides a few standard ways to access data. By using the Transact-SQL querying language, the results of the queries can be consumed by a variety of data providers – either as row sets (by using the basic SELECT statement), or even as XML (by using the SELECT statement with the FOR XML directive). It is, however, possible to extend the built-in capabilities with custom programmatic logic.

In this three-part article I will present two reliable, and efficient, techniques for representing data sets as delimited strings. Both techniques will use native SQL Server capabilities that have been available since SQL Server 2005; however, for the purposes of this particular article I will be using SQL Server 2012, which will allow me to significantly simplify one of the techniques.

Now, before you learn about the two good options, I have to point out a couple of inappropriate ones.

Dubious Practices

In Transact-SQL it is possible to use (actually, I should say misuse) a native data retrieval method in order to concatenate a set of values into a single string value. The technique is referred to as variable assignment using the SELECT statement; it relies on the way data is consumed and assigned to a variable when the SELECT statement is executed.

The SELECT statement can be used to assign values to T-SQL variables, as shown in Figure 2 below.

 
DECLARE @var INT = 0
 
SELECT @var = object_id
  FROM sys.objects
 

Figure 2: The SELECT statement can be used to assign values to variables.

Note that the above query does not use any restrictions; therefore multiple object_id values will be retrieved, and assigned to the @var variable. Because the variable can only hold a single value, after the rows have been processed only a single assignment will prevail. Relational theory defines the set as an unordered collection of elements; therefore the only way to guarantee order in a retrieval query is to use the ORDER BY clause. The above query does not specify order; therefore the database engine is free to choose any order it deems appropriate to retrieve the data, and perform the assignment. As a consequence, it is impossible to predict which object_id value will be assigned to the variable when the query execution is completed.
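By way of contrast, here is a minimal sketch of an assignment whose outcome can be predicted. It is not one of the concatenation techniques of this series; it only assumes that restricting the query to a single row with TOP (1), and stating the order explicitly with ORDER BY, leaves exactly one candidate value for the variable.

```sql
DECLARE @var INT = 0;

-- TOP (1) together with ORDER BY leaves exactly one row to assign from,
-- so the value that ends up in @var no longer depends on the access path.
SELECT TOP (1) @var = object_id
  FROM sys.objects
 ORDER BY object_id;

-- Returns the lowest object_id in sys.objects.
SELECT @var AS lowest_object_id;
```

As soon as more than one row qualifies, the unpredictability described above returns.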

The result of a SELECT statement is a set; that is, zero, one, or more values, depending on the source data and the restrictions used in the query. The source of the values can be a column, a variable, an expression, or even a subquery. The query in Figure 3 uses an expression to assign values to the variable, and this expression combines the previously assigned value with the value retrieved from the column. It does so with the plus (+) operator, which adds numerical values and concatenates character strings.

 
DECLARE @var BIGINT = 0
 
SELECT @var = @var + object_id
  FROM sys.objects
 

Figure 3: Expressions can be used to assign values to variables.

In this particular case, where the expression is an addition of numerical values, the order of the assignments is irrelevant; in the end, the variable will hold the sum of all object_id values.
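For comparison, the same total can be computed with the SUM aggregate function, which is the documented way to add up a column. This is only a sketch: the CAST to BIGINT is an assumption added here to mirror the BIGINT variable of Figure 3, because summing the INT column directly could overflow.

```sql
-- SUM over an INT column returns INT, so the values are widened first
-- to keep a large total from causing an arithmetic overflow.
SELECT SUM(CAST(object_id AS BIGINT)) AS object_id_sum
  FROM sys.objects;
```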

The query shown in Figure 4 can be used to retrieve all object_id values concatenated into a single string. Here the expression concatenates character strings rather than adding numerical values, so the order of the assignments is no longer irrelevant.

 
DECLARE @str VARCHAR(MAX) = ''
 
SELECT @str = @str + CAST(object_id AS VARCHAR(20)) + ','
  FROM sys.objects
 

Figure 4: Variable assignment using the SELECT statement could, theoretically, be used for SQL concatenation.

However, this particular syntax still does not guarantee the expected results. The behaviour of variable assignments using the SELECT statement has been a subject of many discussions in the past, and a comprehensive explanation is also available in the Microsoft Knowledge Base. I urge you to read through the article entitled “PRB: Execution Plan and Results of Aggregate Concatenation Queries Depend upon Expression Location”, available at http://support2.microsoft.com/default.aspx?scid=287515, where the problems with the above technique are discussed in more detail.
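The kind of query that article warns about can be sketched as follows. The table variable @names and its three rows are hypothetical, made up purely for illustration. The point is that an expression in the ORDER BY clause may change where in the execution plan the assignment is evaluated, so the outcome depends on the plan that the optimizer happens to choose.

```sql
-- Hypothetical sample data, used for illustration only.
DECLARE @names TABLE (name VARCHAR(20) NOT NULL);

INSERT INTO @names (name)
VALUES ('alpha'), ('beta'), ('gamma');

DECLARE @str VARCHAR(MAX) = '';

-- An expression in ORDER BY: the pattern described in the KB article.
SELECT @str = @str + name + ','
  FROM @names
 ORDER BY LTRIM(RTRIM(name));

-- The variable may hold all three names, or only one of them.
-- Neither outcome is guaranteed.
SELECT @str AS result;
```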

Most of all, I urge you not to use the variable assignment technique to create string representations of data sets for a very simple reason: the behaviour of the above query is undefined.

With the introduction of XML in SQL Server 2005, a significantly more appropriate alternative has become available. As you surely know, XML has been a native data type since SQL Server 2005; it is a complex data type – supported by a set of built-in retrieval and manipulation methods – that even comes with its own querying language: the XPath expressions and the XML Query (XQuery).

SQL Server also supports XML composition, the ability to create XML documents using T-SQL, which can be used to concatenate a set of values and represent them as a single value: an XML document. For instance, by using an XML composition expression in a nested SELECT statement, it is possible to create a delimited string containing multiple object_id values, as shown in Figure 5 below.

The following example prefixes each object_id value with a delimiter inside a FOR XML PATH query to create a delimited string.

 
SELECT str
  = (
  SELECT ',' + CAST(object_id AS VARCHAR(20))
    FROM sys.objects
    FOR XML PATH('')
  )
 

Figure 5: A simple XML composition – simple, yet unsafe.

If the TYPE option is omitted from the FOR XML directive, the composed XML data is returned as text (NVARCHAR(MAX), to be exact). Therefore, while this may seem like the perfect solution, treating XML as if it were just a string of characters might not be a very good idea, as demonstrated in Figure 6 below.

 
SELECT str
  = (
  SELECT '>' + CAST(object_id AS VARCHAR(20))
    FROM sys.objects
    FOR XML PATH('')
  )
 
 
Results:
 
str
-------------------------------------------------------------------------------
&gt;3&gt;5&gt;6&gt;7&gt;8&gt;9&gt;17&gt;18&gt;19&gt;20&gt;21 ... &gt;2137058649
 
(1 row(s) affected)
 

Figure 6: XML is not just a string.

In the result of the query above, the greater-than sign (>) is returned as an XML character entity (&gt;) instead of the literal character. This allows the resulting string to be considered a well-formed representation of an XML document, so that any standard XML parser can convert it to an actual XML document or XML fragment. But would a human reader expect to see this?
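The same effect can be reproduced without any table at all. The following minimal sketch uses an arbitrary string literal, chosen only for illustration, that contains three characters reserved in XML.

```sql
-- Without the TYPE option the composed XML is returned as text,
-- with the reserved characters >, & and < replaced by entities.
SELECT entitized
  = (
  SELECT 'a > b & c < d'
    FOR XML PATH('')
  );

-- Returns: a &gt; b &amp; c &lt; d
```

Any delimiter or data value containing one of these characters is affected in the same way.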

In part two I will demonstrate how to correctly utilize XML composition to represent data sets as delimited strings. I would now kindly ask you to forget both techniques shown above as soon as possible.

Translated from: https://www.sqlshack.com/string-concatenation-done-right-part-1-dubious-practices/
