Find Nth maximum value in SQL Server

http://www.sqlteam.com/article/find-nth-maximum-value-in-sql-server

By Guest Authors on 11 April 2004 | 13 Comments | Tags: SELECT

 


 

This aritlce is written by Hariharan Velayuthan. He writes "There are several methods to find out the Nth maximum/minimum value using SQL. This article discusses on such method to find Nth maximum value from a desired table. This article is aimed at users who are in the beginner or intermediate level in SQL Server."

[Note: This article assumes that the reader is familiar with the T-SQL, joins and queries]

I have taken utmost effort to make this article easy to understand, but incase you are not clear with the concept, please raise up your concern and I’ll be more than happy to attend your doubts. All the examples discussed in this article uses Employee table. If you do not have this table, please use the following script to create it.

  
  
Use  Pubs
Go

Create   table  Employee
(
Eid 
int ,
Name 
varchar ( 10 ),
Salary 
money
)
Go

Insert   into  Employee  values  ( 1 , ' harry ' , 3500 )
Insert   into  Employee  values  ( 2 , ' jack ' , 2500 )
Insert   into  Employee  values  ( 3 , ' john ' , 2500 )
Insert   into  Employee  values  ( 4 , ' xavier ' , 5500
Insert   into  Employee  values  ( 5 , ' steven ' , 7500
Insert   into  Employee  values  ( 6 , ' susana ' , 2400
Go

A simple query that can find the employee with the maximum salary, would be:

  
  
Select   *   from  Employee  where  salary  =  ( Select   max (Salary)  from  Employee)

How does this query work?

The SQL Engine evaluates the inner most query and then moves to the next level (outer query). So, in the above example inner query i.e. Select max(Salary) from Employee is evaluated first. This query will return a value of 7500 (based on the sample data shown as above). This value is substituted in the outer query and it is evaluated as:

Select   *   from  Employee  where  salary  =  ( 7500 )
/*
Returns: 

Eid    Name    Salary
5    steven    7500
*/

 

If the same syntax is applied to find out the 2nd or 3rd or 4th level of salary, the query would become bit complex to understand. See the example below:

 

Select   *   from  Employee  where  salary  =  
     (
Select   max (Salary)  from  Employee  where  salary 
    
<  ( Select   max (Salary)  from  Employee  where  
        Salary 
<  ( Select   max (Salary)  from  Employee  where  
               Salary 
< …………………………………………… N

The above query would go on and on, depending on the level of salary that is to be determined. As mentioned earlier, the SQL Engine evaluates the inner most query first and moves the next outer level. One wouldn’t want to write such a big query just to find out this simple information.

The same result can be achieved with a simple syntax and easily understandable logic, by using a CORRELATED SUBQUERY. This article doesn’t explain about correlated sub-query as it is out of scope of this article. (You may want to take a quick look on CORRELATED SUBQUERY.) As a "Rule of Thumb" keep these points in mind, when you use a correlated sub-query

  1. Correlated sub-query is a performance overhead to the database server and so, you have to use it only if it is required
  2. Avoid using Correlated subquery on large tables, as the inner query is evaluated for each row of the outer query

Having said that, let’s look at the query that captures the Nth maximum value:

  
  
Select   *   From  Employee E1  Where
    (N
- 1 =  ( Select   Count ( Distinct (E2.Salary))  From  Employee E2  Where
        E2.Salary 
>  E1.Salary)

(Where N is the level of Salary to be determined)

In the above example, the inner query uses a value of the outer query in its filter condition meaning; the inner query cannot be evaluated before evaluating the outer query. So each row in the outer query is evaluated first and the inner query is run for that row. Let’s look into the background process of this query, by substituting a value for N i.e. 4,(Idea is to find the 4th maximum salary):

  
  
Select   *   From  Employee E1  Where
    (
4 - 1 =  ( Select   Count ( Distinct (E2.Salary))  From  Employee E2  Where
        E2.Salary 
>  E1.Salary)

Since the outer query’s value is referred in the inner query, the operation is done row-by-row. Based on the sample data as shown above, the process starts with the following record:

  
  
/*
Employee E1
----------------------------------
Eid    Name    Salary
1    harry    3500
*/

The salary of this record is substituted in the inner query and evaluated as:

  
  
Select   Count ( Distinct (E2.Salary))  From  Employee E2 
    
Where  E2.Salary  >   3500

Above query returns 2 (as there are only 2 salaries greater than 3500). This value is substituted in the outer query and will be evaluated as:

  
  
Select   *   From  Employee E1  Where  ( 4 - 1 =  ( 2 )

The "where" condition evaluates to FALSE and so, this record is NOT fetched in the result.

Next the SQL Engine processes the 2nd record which is:

  
  
/*
Employee E1
----------------------------------
Eid    Name    Salary
2    jack    2500
*/

Now the inner query is evaluated as:

  
  
Select   Count ( Distinct (E2.Salary))  From  Employee E2 
    
Where  E2.Salary  >   2500

This query returns a value of 3 (as there are 3 salaries greater than 2500). The value is substituted in the outer query and evaluated as:

  
  
Select   *   From  Employee E1  Where  ( 4 - 1 =  ( 3 )

The "where" condition evaluates to TRUE and so, this record IS fetched in the result. This operation continues for all the remaining records. Finally the result shows these 2 records:

  
  
/*
Eid    Name    Salary
2    jack    2500
3    john     2500
*/

The above query works in the same manner in Oracle and Sybase as well. Applying the same logic, to find out the first maximum salary the query would be:

  
  
Select   *   From  Employee E1  Where
    (
1 - 1 =  ( Select   Count ( Distinct (E2.Salary))  From  Employee E2  Where
        E2.Salary 
>  E1.Salary)

If you are able to understand this functionality, you can workout various other queries in the same manner. The bottom line is, the query should be efficient and NOT resource hungry.

Conclusion

This example is the simplest representation of Correlated sub-query. In the real-time database manipulation, correlated sub-queries will much more extensive. 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 2
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值