java选项a转换成数字1_多种选项可将行转换为列

java选项a转换成数字1

Introduction

介绍

One of the primary functions of a Business Intelligence team is to enable business users with an understanding of data created and stored by business systems. Understanding the data should give business users an insight into how the business is performing. A typical understanding of data within an insurance industry could relate to measuring the number of claims received vs successfully processed claims. Such data could be stored in source system as per the layout in Table 1:

商业智能团队的主要职能之一是使业务用户能够理解由业务系统创建和存储的数据。 了解数据应该使业务用户可以洞悉业务的绩效。 对保险行业数据的典型理解可能与衡量已收到的索赔数量与成功处理的索赔数量有关 。 这些数据可以按照表1中的布局存储在源系统中:

Table 1: Sample policy claims data

表1:样本保单索赔数据

RecKey PolID PolNumber PolType Effective Date DocID DocName Submitted
1 2 Pol002 Hospital Cover 01-Oct-07 1 Doc A 0
2 2 Pol002 Hospital Cover 01-Oct-07 4 Doc B 0
3 2 Pol002 Hospital Cover 01-Oct-07 5 Doc C 1
4 2 Pol002 Hospital Cover 01-Oct-07 7 Doc D 1
5 2 Pol002 Hospital Cover 01-Oct-07 10 Doc E 1
记录键 PolID 号码 类型 生效日期 DocID 文件名 已提交
1个 2 Pol002 医院保险 2007年10月1日 1个 文件A 0
2 2 Pol002 医院保险 2007年10月1日 4 文件B 0
3 2 Pol002 医院保险 2007年10月1日 5 文件C 1个
4 2 Pol002 医院保险 2007年10月1日 7 文件D 1个
5 2 Pol002 医院保险 2007年10月1日 10 文件E 1个

Although each data entry in Table 1 has a unique RecKey identifier, it all still relates to a single policy claim (policy Pol002). Thus, a correct representation of this data ought to be in a single row that contains a single instance of policy Pol002 as shown in Table 2:

尽管表1中的每个数据条目都有唯一的RecKey标识符,但它们都仍与单个策略声明(策略Pol002 )有关。 因此,此数据的正确表示应该位于包含策略Pol002的单个实例的单行中 ,如表2所示:

Table 2 Transposed layout

表2换位布局

PolNumber PolType Effective Date Doc A Doc B Doc C Doc D Doc E
Pol002 Hospital Cover 01-Oct-07 0 0 1 1 1
号码 类型 生效日期 文件A 文件B 文件C 文件D 文件E
Pol002 医院保险 2007年10月1日 0 0 1个 1个 1个

The objective of this article is to demonstrate different SQL Server T-SQL options that could be utilised in order to transpose repeating rows of data into a single row with repeating columns as depicted in Table 2. Some of the T-SQL options that will be demonstrated will use very few lines of code to successfully transpose Table 1 into Table 2 but may not necessary be optimal in terms query execution. Therefore, the execution plan and I/O statistics of each T-SQL option will be evaluated and analysed using ApexSQL Plan.

本文的目的是演示不同SQL Server T-SQL选项,如表2所示,这些选项可用于将重复的数据行转换为具有重复列的单行。 将要演示的某些T-SQL选项将使用很少的代码行来成功地将表1转换为表2,但就查询执行而言可能不是最佳的。 因此,将使用ApexSQL Plan评估和分析每个T-SQL选项的执行计划和I / O统计信息。

Option #1: PIVOT

选项1:PIVOT

Using a T-SQL Pivot function is one of the simplest method for transposing rows into columns. Script 1 shows how a Pivot function can be utilised.

使用T-SQL Pivot函数是将行转换为列的最简单方法之一。 脚本1显示了如何使用数据透视功能。

SELECT *
FROM
(
    SELECT [PolNumber],
           [PolType],
           [Effective Date],
           [DocName],
           [Submitted]
    FROM [dbo].[InsuranceClaims]
) AS SourceTable PIVOT(AVG([Submitted]) FOR [DocName] IN([Doc A],
                                                         [Doc B],
                                                         [Doc C],
                                                         [Doc D],
                                                         [Doc E])) AS PivotTable;

The results of executing Script 1 are shown in Figure 1, as it can be seen, the output is exactly similar to that of Table 2.

执行脚本1的结果如图1所示,可以看到,其输出与表2完全相似。

Figure 1

图1

Furthermore, as we add more policy numbers in our dataset (i.e. Pol003), we are able to automatically retrieve them without making any changes to Script 1.

此外,随着我​​们在数据集中添加更多策略编号(即Pol003 ),我们能够自动检索它们而无需对脚本1进行任何更改。

Figure 2

图2

Although, we don’t have to alter the script to show additional policies, we unfortunately have to update it if we need to return more columns. This is because the Pivot function works with only a predefined list of possible fields. Thus, in order to return [Doc F] column, we would firstly need to update the FOR clause in Script 1 to include [Doc F] and only then would the output reflect [Doc F] as shown in Figure 3.

尽管我们不必更改脚本来显示其他策略,但是不幸的是,如果需要返回更多列,则必须对其进行更新。 这是因为数据透视功能仅适用于可能字段的预定义列表。 因此,为了返回[Doc F]列,我们首先需要将脚本1中的FOR子句更新为包含[Doc F] ,然后才将输出反映为[Doc F]如图3所示。

Figure 3

图3

However, imagine if business later decides to add 100 more documents that are required to process a claim? It would mean that you need to update your Pivot script and manually add those 100 fields. Thus, although transposing rows using Pivot operator may seem simple, it may later be difficult to maintain.

但是,想像一下企业以后是否决定添加更多100个处理索赔所需的文档? 这意味着您需要更新数据透视脚本并手动添加这100个字段。 因此,尽管使用Pivot运算符转置行似乎很简单,但以后可能很难维护。

Performance Breakdown

绩效细分

The actual estimated plan depicted in Figure 4, indicates that only a single scan was made against the base table with a majority of the cost (at 77.6%) used for sorting data.

图4中描绘的实际估计计划表明,仅对基本表进行了一次扫描,大部分费用(77.6%)用于排序数据。

Figure 4

图4

In terms of operational tree, the highest increased in I/O was recorded during the Sort operation at 0.01 milliseconds.

就操作树而言,在排序操作期间以0.01毫秒记录了I / O的最高增长。

Figure 5

图5

选项2:游标 (Option #2: CURSOR)

Although the general consensus in the professional community is to stay away from SQL Server Cursors, there are still instances whereby the use of cursors is recommended. I suppose if they were totally useless, Microsoft would have deprecated their usage long ago, right? Anyway, Cursors present us with another option to transpose rows into columns. Script 2 displays a T-SQL code that can be used to transpose rows into columns using the Cursor function.

尽管专业社区的普遍共识是远离SQL Server游标,但是仍然有一些实例建议使用游标。 我想如果它们完全没用的话,微软早就应该弃用它们,对吧? 无论如何,Cursors为我们提供了另一种选择,可以将行转置为列。 脚本2显示了一个T-SQL代码,可使用Cursor函数将行转置为列。

DECLARE @PolNumber NVARCHAR(255), @PolNumber5 NVARCHAR(255), @PolType VARCHAR(255), @DocName 
NVARCHAR(255), @Submitted INT, @Eff DATE, @message_T NVARCHAR(MAX);
SET @message_T = '';
SET @PolNumber5 = '';
DECLARE policyDocs_csr CURSOR
FOR
    SELECT [PolNumber],
           [PolType],
           [Effective Date],
           [DocName],
           [Submitted]
    FROM [dbo].[InsuranceClaims]
    ORDER BY [PolNumber];
OPEN policyDocs_csr;
FETCH NEXT FROM policyDocs_csr INTO @PolNumber, @PolType, @Eff, @DocName, @Submitted;
WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @PolNumber5 <> @PolNumber
            SET @message_T = @message_T+CHAR(13)+@PolNumber+' | '+@PolType+' | '+CONVERT(VARCHAR, 
@eff)+' | '+@DocName+' ( '+CONVERT(VARCHAR, isnull(@submitted, ''))+' ) | ';
            ELSE
        IF @PolNumber5 = @PolNumber
            SET @message_T = @message_T+@DocName+' ( '+CONVERT(VARCHAR, isnull(@submitted, ''))+' ) | 
';
        SET @PolNumber5 = @PolNumber;
        FETCH NEXT FROM policyDocs_csr INTO @PolNumber, @PolType, @Eff, @DocName, @Submitted;
    END;
IF @@FETCH_STATUS <> 0
    PRINT @message_T;
CLOSE policyDocs_csr;
DEALLOCATE policyDocs_csr;

Execution of Script 2 lead to the result set displayed in Figure 6 yet, the Cursor option uses far more lines of code than its T-SQL Pivot counterpart.

执行脚本2导致结果集如图6所示 ,但Cursor选项使用的代码行比T-SQL Pivot对应的多得多。

Figure 6

图6

Similar to the Pivot function, the T-SQL Cursor has the dynamic capability to return more rows as additional policies (i.e. Pol003) are added into the dataset, as shown in Figure 7:

类似于Pivot函数,T-SQL游标具有动态功能,可以在向数据集中添加其他策略(即Pol003 )时返回更多行, 如图7所示:

Figure 7

图7

However, unlike the Pivot function, the T-SQL Cursor is able to expand to include newly added fields (i.e. [Doc F]) without having to make changes to the original script.

但是,与Pivot函数不同,T-SQL光标可以扩展为包括新添加的字段(即[Doc F] ),而无需更改原始脚本。

Figure 9

图9

Performance Breakdown

绩效细分

The major limitation of transposing rows into columns using T-SQL Cursor is a limitation that is linked to cursors in general – they rely on temporary objects, consume memory resources and processes row one at a time which could all result into significant performance costs. Thus, unlike in the Pivot function wherein the majority of the cost was spent sorting the dataset, the majority of cost in the Cursor option is split between the Sort operation (at 46.2%) as well as the temporary TempDB object (at 40.5%).

使用T-SQL游标将行转换为列的主要限制是通常与游标链接的限制–它们依赖临时对象,消耗内存资源并一次处理第一行,所有这些都可能导致巨大的性能成本。 因此,与Pivot函数(其中大部分成本用于对数据集进行排序)不同,Cursor选项中的大部分成本在Sort操作(占46.2%)和临时TempDB对象(占40.5%)之间分配。 。

Figure 10

图10

Similar to the operational tree of the Pivot function, the operator with the higher percentages in the execution plan of the Cursor function are likely to consume more I/O resources than other operators. In this case, both the Sort and temporary TempDB objects recorded the most I/O usage cost at 0.01 milliseconds each.

与Pivot函数的操作树类似,在Cursor函数的执行计划中具有较高百分比的操作员可能比其他操作员消耗更多的I / O资源。 在这种情况下,排序和临时TempDB对象都记录了最多的I / O使用成本,每个成本为0.01毫秒。

Figure 11

图11

选项#3:XML (Option #3: XML)

The XML option to transposing rows into columns is basically an optimal version of the PIVOT in that it addresses the dynamic column limitation. The XML version of the script addresses this limitation by using a combination of XML Path, dynamic T-SQL and some built-in T-SQL functions such as STUFF and QUOTENAME. The version of the script that uses XML function to transpose rows into columns is shown in Script 3.

将行转换为列的XML选项基本上是PIVOT的最佳版本,因为它解决了动态列限制。 脚本的XML版本通过结合使用XML Path,动态T-SQL和一些内置的T-SQL函数(例如STUFFQUOTENAME)来解决此限制。 脚本3中显示了使用XML函数将行转置为列的脚本版本

DECLARE @cols NVARCHAR(MAX), @query NVARCHAR(MAX);
SET @cols = STUFF(
                 (
                     SELECT DISTINCT
                            ','+QUOTENAME(c.[DocName])
                     FROM [dbo].[InsuranceClaims] c FOR XML PATH(''), TYPE
                 ).value('.', 'nvarchar(max)'), 1, 1, '');
SET @query = 'SELECT [PolNumber], '+@cols+'from (SELECT [PolNumber],
           [PolType],
           [submitted] AS [amount],
           [DocName] AS [category]
    FROM [dbo].[InsuranceClaims]
    )x pivot (max(amount) for category in ('+@cols+')) p';
EXECUTE (@query);

The output of Script 3 execution is shown in Figure 12.

脚本3执行的输出如图12所示。

Figure 12

图12

Similar to T-SQL Pivot and Cursor options, newly added policies (i.e. Pol003) are retrievable in the XML option without having to update the original script. Furthermore, the XML option is also able to cater for dynamic field names (i.e. [Doc F]) as shown in Figure 13.

与T-SQL Pivot和Cursor选项类似,可以在XML选项中检索新添加的策略(即Pol003 ),而不必更新原始脚本。 此外,XML选项还能够满足动态字段名称(即[Doc F] ), 如图13所示。

Figure 13

图13

绩效细分 (Performance Breakdown)

The execution plan of Script 3 is almost similar to that of the Pivot function script in that majority of the cost is taken up by the Sort operator with the Table scan being the second most costly operation.

脚本3的执行计划与Pivot函数脚本的执行计划几乎类似,因为大部分费用由Sort运算符承担,而Table扫描是第二昂贵的操作。

Figure 14

图14

In terms of I/O cost, the Sort operation used the longest time at 0.01 milliseconds.

在I / O成本方面,排序操作使用的时间最长,为0.01毫秒。

Figure 15

图15

选项4:动态SQL (Option #4: Dynamic SQL)

Another alternative to the optimal XML option is to transpose rows into columns using purely dynamic SQL – without XML functions. This option utilises the same built-in T-SQL functions that are used in the XML option version of the script as shown in Script 4.

最佳XML选项的另一种替代方法是使用纯动态SQL(没有XML函数)将行转置为列。 此选项利用脚本的XML选项版本中使用的相同的内置T-SQL函数,如脚本4所示。

DECLARE @columns NVARCHAR(MAX), @sql NVARCHAR(MAX);
SET @columns = N'';
SELECT @columns+=N', p.'+QUOTENAME([Name])
FROM
(
    SELECT [DocName] AS [Name]
    FROM [dbo].[InsuranceClaims] AS p
    GROUP BY [DocName]
) AS x;
SET @sql = N'
SELECT [PolNumber], '+STUFF(@columns, 1, 2, '')+' FROM (
SELECT [PolNumber], [Submitted] AS [Quantity], [DocName] as [Name] 
    FROM [dbo].[InsuranceClaims]) AS j PIVOT (SUM(Quantity) FOR [Name] in 
	   ('+STUFF(REPLACE(@columns, ', p.[', ',['), 1, 1, '')+')) AS p;';
EXEC sp_executesql

Script 4: Transpose data using Dynamic SQL function

脚本4:使用动态SQL函数转置数据

Again, like all the other options, the script using Dynamic SQL returns data in a correctly transposed layout. Similar to T-SQL Cursor and XML options, Dynamic SQL is able to cater for newly added rows and columns without any prior updates to the script.

同样,与所有其他选项一样,使用Dynamic SQL的脚本以正确转置的布局返回数据。 与T-SQL Cursor和XML选项类似,Dynamic SQL能够满足新添加的行和列,而无需对该脚本进行任何先前的更新。

Figure 16

图16

Performance Breakdown

绩效细分

Except for using XML functions, the Dynamic SQL option is very similar to the XML option. It is not surprising then that its execution plan and operations tree will look almost similar to that of the XML option.

除了使用XML函数外,“动态SQL”选项与XML选项非常相似。 毫不奇怪,它的执行计划和操作树看起来几乎类似于XML选项。

Figure 17

图17

Figure 18

图18

结论 (Conclusion)

In this article, we’ve had a look at available T-SQL options for transposing rows into columns. The Pivot option was shown to be the simplest option yet its inability to cater for dynamic columns made it the least optimal option. The T-SQL Cursor option addressed some of the limitations of the Pivot option though at a significant cost of resources and SQL Server performance. Finally, the XML and the Dynamic SQL options proved to be the best optimal options in terms of transposing rows into columns with favorable performance results and effective handling dynamic rows and columns.

在本文中,我们研究了将行转换为列的可用T-SQL选项。 枢轴选项被证明是最简单的选项,但是它无法满足动态列的需求,使其成为最不理想的选项。 尽管花费了大量资源和SQL Server性能,但T-SQL Cursor选项解决了Pivot选项的一些限制。 最后,就将行转换为具有良好性能结果并有效处理动态行和列的列而言,XML和Dynamic SQL选项被证明是最佳的最佳选择。

资料下载 (Downloads)

参考资料 (References)

翻译自: https://www.sqlshack.com/multiple-options-to-transposing-rows-into-columns/

java选项a转换成数字1

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值