percent sql_使用SQL Server中的PERCENT_RANK函数计算SQL百分位数

percent sql

This article explores the SQL Server PERCENT_RANK analytical function to calculate SQL Percentile and its usage with various examples.

本文通过各种示例探讨了SQL Server PERCENT_RANK分析函数,以计算SQL Percentile及其用法。

PERCENT_RANK()函数概述 (Overview of PERCENT_RANK() function)

In a relational database, we store the data in SQL tables. Many times, we fetch the data without any data manipulations. Let’s say you want to generate the SSRS report that contains the top 5% sales for a product in a region. In this case, you need to use SQL Server Analytical functions to perform the data manipulations such as SQL Percentile.

在关系数据库中,我们将数据存储在SQL表中。 很多时候,我们无需任何数据操作即可获取数据。 假设您要生成SSRS报告,其中包含该地区某产品的销售额最高的5%。 在这种情况下,您需要使用SQL Server Analytical函数来执行数据操作,例如SQL Percentile。

We have the following analytical functions in SQL Server.

在SQL Server中,我们具有以下分析功能。

  • CUME_DIST()

    CUME_DIST()
  • FIRST_VALUE()

    FIRST_VALUE()
  • LAST_VALUE()

    LAST_VALUE()
  • LEAD()

    铅()
  • LAG()

    落后()
  • PERCENTILE_COUNT()

    PERCENTILE_COUNT()
  • PERCENTILE_DISC()

    PERCENTILE_DISC()
  • PERCENT_RANK()

    PERCENT_RANK()

You can also direct to SQL Rank functions to understand how you can specify rank to each row of your result set over a partition using these window functions.

您还可以定向到SQL Rank函数,以了解如何使用这些窗口函数在分区上的结果集的每一行指定排名。

PERCENT_RANK() (PERCENT_RANK())

The PERCENT_RANK function in SQL Server calculates the relative rank SQL Percentile of each row. It always returns values greater than 0, and the highest value is 1. It does not count any NULL values. This function is nondeterministic.

SQL Server中的PERCENT_RANK函数计算每行的相对排名SQL百分位数。 它始终返回大于0的值,并且最大值为1。它不计算任何NULL值。 此功能是不确定的。

The syntax of the PERCENT_RANK() function is as below:

PERCENT_RANK()函数的语法如下:

PERCENT_RANK() OVER (
      [PARTITION BY partition_expression, ... ]
      ORDER BY sort_expression [ASC | DESC], ...
  )

In this syntax, we use the following parameters.

在此语法中,我们使用以下参数。

  • PARTITION BY: By default, SQL Server treats the whole data set as a single set. We can specify the PARTITION By clause to divide data into multiple sets. On each set, the Percent_Rank function performs the analytical calculations. It is an optional parameter

    PARTITION BY:默认情况下,SQL Server将整个数据集视为单个集合。 我们可以指定PARTITION By子句将数据分为多个集合。 在每个集合上,Percent_Rank函数都会执行分析计算。 它是一个可选参数
  • Order By: We use Order by clause to sort the data in either ascending or descending order. It is a compulsory parameter

    Order By:我们使用Order by子句对数据进行升序或降序排序。 这是一个强制性参数

Let’s look at the examples to understand it.

让我们看一些例子来理解它。

示例1:PERCENT_RANK函数,用于计算学生成绩SQL百分位数 (Example 1: PERCENT_RANK function for calculating the SQL Percentile of student marks)

In this example, let’s use a CTP ( common table expression) to hold the student marks, and we use PERCENT_RANK function from this data for marks in ascending order.

在此示例中,让我们使用CTP(公用表表达式)来保存学生成绩,然后使用该数据中的PERCENT_RANK函数以升序排列。

WITH Student
       AS (SELECT Marks
           FROM(VALUES(75), (59), (69), (88), (72), (86), (60), (67), (49), (81), (93)) Val(Marks)),
       Percentiles
       AS (SELECT Marks, 
                  PERCENT_RANK() OVER(
                  ORDER BY Marks) AS Percent_Rank
           FROM Student)
       SELECT *
       FROM Percentiles;

In the output, we can see following outcomes

在输出中,我们可以看到以下结果

  • A lowest mark 49 gets Percent_Rank value zero

    最低分数49的Percent_Rank值为零
  • The highest marks 93 gets Percent_Rank value one

    最高分93获得Percent_Rank值1

SQL Percentile function

示例2:用于计算具有NULL值SQL Percentile的PERCENT_RANK函数 (Example 2: PERCENT_RANK function with to calculate SQL Percentile having NULL values)

Let’s add a NULL value in the student marks and rerun the query from example 1.

让我们在学生标记中添加一个NULL值,然后重新运行示例1中的查询。

WITH Student
       AS (SELECT Marks
           FROM(VALUES(75), (59), (69), (88), (72), (86), (60), (67), (49), (81), (93),(NULL)) Val(Marks)),
       Percentiles
       AS (SELECT Marks, 
                  PERCENT_RANK() OVER(
                  ORDER BY Marks) AS Percent_Rank
           FROM Student)
       SELECT *
       FROM Percentiles;

As shown in the following screenshot, you always get zero for the NULL values.

如以下屏幕截图所示,对于NULL值,您总会得到零。

PERCENT_RANK function with NULL values

示例3:PERCENT_RANK函数来计算具有重复值SQL Percentile (Example 3: PERCENT_RANK function to calculate SQL Percentile having duplicate values)

Suppose multiple students are having similar marks, and we want to calculate the percent rank. It is important to know that the function will return the same or different rank for a student with similar marks.

假设多个学生的分数相似,并且我们要计算排名百分比。 重要的是要知道,该功能将为具有相似分数的学生返回相同或不同的等级。

	WITH Student
	     AS (SELECT Marks
	         FROM(VALUES(75), (59), (69), (88), (72), (86), (60), (67), (49), (81),(81), (93)) Val(Marks)),
	     Percentiles
	     AS (SELECT Marks, 
	                PERCENT_RANK() OVER(
	                ORDER BY Marks) AS Percent_Rank
	         FROM Student)
	     SELECT *
	     FROM Percentiles;

We have two students with marks 81. In the following example, we can see that percent_rank function returns the same value.

我们有两个标记为81的学生。在下面的示例中,我们可以看到percent_rank函数返回相同的值。

PERCENT_RANK function with Duplicate values

Here is a catch, Let’s say two students have got the highest marks 93 and we want to calculate the percent_rank for these.

这里有个问题,假设有两个学生的最高分是93,我们要计算这些分数的percent_rank。

WITH Student
       AS (SELECT Marks
           FROM(VALUES(75), (59), (69), (88), (72), (86), (60), (67), (49), (81),(93), (93)) Val(Marks)),
       Percentiles
       AS (SELECT Marks, 
                  PERCENT_RANK() OVER(
                  ORDER BY Marks) AS Percent_Rank
           FROM Student)
       SELECT *
       FROM Percentiles;

In this example, we have the highest marks 93, and two students have the same marks. Previously, we get percent rank 1 for the highest rank. Now, we do not get the rank 1, and it calculates the percent rank according to the similar highest marks.

在此示例中,我们的最高分是93,而两个学生的分是相同的。 以前,我们获得最高排名的百分比排名1。 现在,我们没有获得等级1,而是根据相似的最高分数计算百分比等级。

PERCENT_RANK function with Duplicate values

示例4:PERCENT_RANK函数以降序计算带有标记SQL Percentile (Example 4: PERCENT_RANK function to calculate SQL Percentile with marks in descending order)

In previous examples, we did not specify ascending or descending order for the marks. By default, it sorts the data in ascending order. We can specify the DESC clause to sort the data in descending order.

在前面的示例中,我们没有指定标记的升序或降序。 默认情况下,它将按升序对数据进行排序。 我们可以指定DESC子句对数据进行降序排序。

WITH Student
       AS (SELECT Marks
           FROM(VALUES(75), (59), (69), (88), (72), (86), (60), (67), (49), (81),(93)) Val(Marks)),
       Percentiles
       AS (SELECT Marks, 
                  PERCENT_RANK() OVER(
                  ORDER BY Marks desc) AS Percent_Rank
           FROM Student)
       SELECT *
       FROM Percentiles;

In the descending order, we can see that the highest marks get 0 ranks and the lowest marks get rank 1.

以降序排列,我们可以看到最高的标记获得0等级,最低的标记获得1等级。

PERCENT_RANK function to calculate SQL Percentile with marks in descending order

In the descending order, for the student with similar highest marks always get zero rank.

对于降序排列,得分最高的学生总是获得零分。

WITH Student
       AS (SELECT Marks
           FROM(VALUES(75), (59), (69), (88), (72), (86), (93),(60), (67), (49), (81),(93)) Val(Marks)),
       Percentiles
       AS (SELECT Marks, 
                  PERCENT_RANK() OVER(
                  ORDER BY Marks desc) AS Percent_Rank
           FROM Student)
       SELECT *
       FROM Percentiles;

PERCENT_RANK function to calculate SQL Percentile with marks in descending order

Similar to example 3, two students with the lowest marks do not get percent_rank 1.

与示例3相似,得分最低的两个学生也没有获得percent_rank 1。

PERCENT_RANK function with duplicate lowest and highest values

示例4:带有PARTITION BY子句SQL Percentile的PERCENT_RANK函数 (Example 4: PERCENT_RANK function for SQL Percentile with PARTITION BY clause)

In the previous example, we did not use the PARTITION by clause to divide the data into multiple small data sets.

在前面的示例中,我们没有使用PARTITION by子句将数据分为多个小数据集。

Suppose we want to check the Percent_Rank of an employee pay rate per the department name from the AdventureWorks sample database. In the following query, we use PARTITION BY Department, ORDER BY Rate to define partition on the department as per the rates in the ascending order.

假设我们要从AdventureWorks示例数据库中按部门名称检查员工工资率的Percent_Rank。 在以下查询中,我们使用PARTITION BY Department,ORDER BY Rate来按照升序中的比率在部门上定义分区。

SELECT Department, 
         Firstname, 
         Rate, 
         PERCENT_RANK() OVER(PARTITION BY Department
         ORDER BY Rate) AS PctRank
  FROM HumanResources.vEmployeeDepartmentHistory AS edh
       INNER JOIN HumanResources.EmployeePayHistory AS e ON e.BusinessEntityID = edh.BusinessEntityID
  WHERE Department IN(N'Engineering', N'Finance')
  ORDER BY Department, 
           Rate DESC;

In the following image, we have two departments group, Engineering and Finance.

在下图中,我们有两个部门组,即工程部门和财务部门。

  • For the Engineering group, Terri has the highest pay rate, and its percent_rank is 1

    对于工程组,Terri的薪水最高,其percent_rank为1
  • Rob has the lowest pay rate, and it gets percent_rank zero in the engineering group

    Rob的薪水最低,在工程组中它的percent_rank为零
  • In another group, Finance, Laura gets percent_rank 1 and David gets percent_rank zero according to their pay rates

    在另一组财务中,劳拉根据他们的薪水比率获得percent_rank 1,而David获得percent_rank 0

PERCENT_RANK function with  PARTITION BY clause

If we do not want to have partitions, remove the PARTITION BY clause and it treats the input as single data set.

如果我们不希望有分区,请删除PARTITION BY子句,它将输入视为单个数据集。

SELECT Department, 
         Firstname, 
         Rate, 
         PERCENT_RANK() OVER(
         ORDER BY Rate) AS PctRank
  FROM HumanResources.vEmployeeDepartmentHistory AS edh
       INNER JOIN HumanResources.EmployeePayHistory AS e ON e.BusinessEntityID = edh.BusinessEntityID
  WHERE Department IN(N'Engineering', N'Finance')
  ORDER BY Department, 
           Rate DESC;

PERCENT_RANK function without PARTITION BY clause

结论 (Conclusion)

In this article, we learned the SQL PERCENT_RANK() function for calculating SQL Percentile. It calculates the relative rank of a row within a group or subset of data. Here is a quick summary of what we learned about SQL Server PERCENT_RANK() function in this article:

在本文中,我们学习了用于计算SQL PercentileSQL PERCENT_RANK()函数。 它计算一组数据或子集中的一行的相对排名。 这是我们在本文中了解到的有关SQL Server PERCENT_RANK()函数的简要概述:

  • PERCENT_RANK calculate the relative rank of a row within a group of rows

    PERCENT_RANK计算一组行中一行的相对排名
  • We use the sort data in ascending or descending order using order by clause

    我们使用order by子句以升序或降序使用排序数据
  • We can partition data using the PARTITION BY clause, and it divides data into specific small data sets

    我们可以使用PARTITION BY子句对数据进行分区,并将其分为特定的小数据集
  • A NULL value also gets percent_rank value zero in the output

    NULL值也会在输出中使percent_rank值变为零

You should learn this SQL Server Analytical function and be familiar with this.

您应该学习并熟悉此SQL Server分析功能。

翻译自: https://www.sqlshack.com/calculate-sql-percentile-using-the-sql-server-percent_rank-function/

percent sql

  • 1
    点赞
  • 0
    评论
  • 4
    收藏
  • 一键三连
    一键三连
  • 扫一扫,分享海报

表情包
插入表情
评论将由博主筛选后显示,对所有人可见 | 还能输入1000个字符
©️2021 CSDN 皮肤主题: 编程工作室 设计师:CSDN官方博客 返回首页
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、C币套餐、付费专栏及课程。

余额充值