Introduction :
In this post we will see, how one can get top N records category wise. I have made the title of this post as subject wise top N marks secured by students, because I found this is the most asked questions in various forums and blogs for this kind of questions and thus I have decided to show the demonstration using example for the same scenario.
Objective :
The objective of this post is to use Common Table Expression (CTE) to resolve our problem i.e, SELECT TOP N records category wise . Then we will analyse it line-by-line.
Table Structure :
To demonstrate the example, we need a table with following stucture
Column Name |
Data Type |
MarksId | int |
ExamId | int |
StudentId | int |
SubjectId | int |
MarksSecured | float |
For a quick try, you can run the following script to create a table with the above structure.
CREATE TABLE [dbo].[Marks]( [MarksId] [int] NOT NULL, [ExamId] [int] NULL, [StudentId] [int] NULL, [SubjectId] [int] NULL, [MarksSecured] [float] NULL, CONSTRAINT [PK_ExamMarks] PRIMARY KEY CLUSTERED ( [MarksId] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO
Sample Data :
Yes, We need some sample data to check whether our query is working or not ! You can manually insert some data to the table but if you are in hurry then you can execute following script.
/*For subject 1*/ INSERT INTO dbo.Marks VALUES(1,1,1,1,80) INSERT INTO dbo.Marks VALUES(2,1,2,1,70) INSERT INTO dbo.Marks VALUES(3,1,3,1,90) INSERT INTO dbo.Marks VALUES(4,1,4,1,82) INSERT INTO dbo.Marks VALUES(5,1,5,1,60) /*For subject 2*/ INSERT INTO dbo.Marks VALUES(6,1,1,2,40) INSERT INTO dbo.Marks VALUES(7,1,2,2,70) INSERT INTO dbo.Marks VALUES(8,1,3,2,50) INSERT INTO dbo.Marks VALUES(9,1,4,2,90) INSERT INTO dbo.Marks VALUES(10,1,5,2,85) /*For subject 3*/ INSERT INTO dbo.Marks VALUES(11,1,1,3,92) INSERT INTO dbo.Marks VALUES(12,1,2,3,70) INSERT INTO dbo.Marks VALUES(13,1,3,3,30) INSERT INTO dbo.Marks VALUES(14,1,4,3,60) INSERT INTO dbo.Marks VALUES(15,1,5,3,44)
Demonstration :
Interesting !!!
Our demonstration section consists only a single code/query snippet which is a CTE.
WITH CTETopMarks AS ( SELECT ROW_NUMBER() OVER ( PARTITION BY SubjectId ORDER BY MarksSecured DESC ) AS Sl_No, MarksId, StudentId, SubjectId, MarksSecured FROM dbo.Marks ) SELECT SubjectId, StudentId, MarksSecured FROM CTETopMarks WHERE Sl_No <= 2
The expected result of this query is top 2 marks secured by students for each subject.
Result :
SubjectId |
StudentId |
MarksSecured |
1 |
3 |
90 |
1 |
4 |
82 |
2 |
4 |
90 |
2 |
5 |
85 |
3 |
1 |
92 |
3 |
2 |
70 |
This is the result which we were anticipating.
Analysis :
Let try to understand the solution script.
To understand it, we need to know
->Common Table Expression(CTE)
->Row_Number()
You can read these in details in MSDN.
Common Table Expression(CTE) :
A CTE consists of an expression name for the CTE, a query defining the CTE and a SELECT statement to run the CTE.
Syntax:
WITH expression_name
AS
( CTE_query_definition )
SELECT <column_list>
FROM expression_name;
ROW_NUMBER() :
We use this to generate row no. for each row based on a ORDER BY clause for each category defined by PARTITION BY.
Syntax:
ROW_NUMBER ( ) OVER ( [ <partition_by_clause> ] <order_by_clause> )
Let execute only the query definition of the CTE and see the result-
SELECT ROW_NUMBER() OVER ( PARTITION BY SubjectId ORDER BY MarksSecured DESC ) AS Sl_No, MarksId, StudentId, SubjectId, MarksSecured FROM dbo.Marks
Sl_No | MarksId | StudentId | SubjectId | MarksSecured |
1 |
3 |
3 |
1 |
90 |
2 |
4 |
4 |
1 |
82 |
3 |
1 |
1 |
1 |
80 |
4 |
2 |
2 |
1 |
70 |
5 |
5 |
5 |
1 |
60 |
1 |
9 |
4 |
2 |
90 |
2 |
10 |
5 |
2 |
85 |
3 |
7 |
2 |
2 |
70 |
4 |
8 |
3 |
2 |
50 |
5 |
6 |
1 |
2 |
40 |
1 |
11 |
1 |
3 |
92 |
2 |
12 |
2 |
3 |
70 |
3 |
14 |
4 |
3 |
60 |
4 |
15 |
5 |
3 |
44 |
5 |
13 |
3 |
3 |
30 |
Here, records are partitioned by SubjectId and the records under each partition/SubjectId is sorted by descending order, generating a sequential no i.e, Sl_No.
Now, we only need to filter the result with a condition to show only top 2 marks from each SubjectId. To do this, we have used CTE with a WHERE condition which will filter Sl_No those are less or equal to 2.
Conclusion :
We understood basic syntax of CTE and ROW_NUMBER() in SQL Server. Now we are able to resolve a problem where we need to show N numbers of records for each category. Someone can ask that why there is a field in the table, named ExamId although we are not using it anywhere ?? You can insert more data to the table with different ExamIds and change the CTE accordingly to see result for different examinations.
I hope this article will be useful to you.
Thanks for reading.
Please put your valuable comments and suggestions.