Wednesday, December 8, 2010

SQL Server : Situation to use GROUP BY ALL

An interesting question was raised by one of the attendee in my “SQL Server – Performance Tuning” session.

(i.e) Where GROUP BY ALL can be used in efficient manner ?

First of all, what is GROUP BY ALL?

GROUP BY ALL is similar to GROUP BY, but which includes all groups and result sets, even those that do not have any rows that meet the search condition specified in the WHERE clause.

Let us assume, we have two tables with the following records.

Table Creation Script :

CREATE TABLE [DBO].[City]
(
            [CityID]                INT                             IDENTITY(1,1)  NOT NULL,
            [CityName]          VARCHAR(100)                                    NOT NULL,
            CONSTRAINT       [PK_City] PRIMARY KEY ([CityID]),
)
GO

CREATE TABLE [DBO].[Employee]
(
             [EmpID]                INT                               IDENTITY(1,1)   NOT NULL,
             [EmpName]          VARCHAR(100)                                       NOT NULL,
             [Sex]                    BIT                                                         NOT NULL, --1 Male 0 Female
             [CityID]                INT                                                         NOT NULL,
             CONSTRAINT        [PK_Employee]   PRIMARY KEY ([EmpID]),
             CONSTRAINT        [FK_Employee_CityID] FOREIGN KEY ([CityID]) REFERENCES [City] ([CityID])
)
GO

Sample records for City:

INSERT INTO [City] ([CityName]) VALUES ('Chennai');
INSERT INTO [City] ([CityName]) VALUES ('Bangalore');
INSERT INTO [City] ([CityName]) VALUES ('Hyderabad');
INSERT INTO [City] ([CityName]) VALUES ('Pune');
INSERT INTO [City] ([CityName]) VALUES ('Mumbai');
INSERT INTO [City] ([CityName]) VALUES ('Delhi');
INSERT INTO [City] ([CityName]) VALUES ('Calcutta');

Sample records for Employee:

INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Siva',1,1);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Parvathi',0,1);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Swetha',0,4);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Swamy',1,4);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Joseph',1,6);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Gayatri',0,3);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Vivek',1,6);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Krishna',1,4);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Lakshmi',0,4);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Satheesh',1,5);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Kiran',1,3);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Suman',1,3);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Srinivas',1,5);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Priya',0,2);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Karthika',0,2);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Murali',1,5);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Vishal',1,5);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Sravya',0,3);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Shilpa',0,4);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Vishnu',1,5);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Ganapathy',1,1);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Matt',1,7);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Jonathan',1,7);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Tim',1,7);
INSERT INTO [DBO].[Employee] ([EmpName],[Sex],[CityID]) VALUES ('Arvind',1,7);

I hope, you have executed the above scripts without any errors.

Now, execute the below statement to list all the cities from [City] table.

SELECT [CityName] FROM [DBO].[City] WITH (NOLOCK);

Output would be :













Let us say, we need to check how many FEMALE employees are there in each city. For this we normally use GROUP BY clause.

SELECT
              C.[CityName] AS 'City Name',
              COUNT(E.EmpID) AS 'Female Count'
FROM
              [DBO].[Employee] AS E WITH (NOLOCK) INNER JOIN [DBO].[City] AS C WITH (NOLOCK) ON E.[CityID] = C.[CityID]
WHERE
              E.[Sex] = 0
GROUP BY
              C.[CityName]
ORDER BY
              C.[CityName]
GO

Output would be :









When you execute the above statement will give us only the city names which are having female employee. It’s because we are filtering only FEMALE employee in the WHERE CLAUSE.

But, we need to list all the city names even FEMALE employees are not existing. In this situation GROUP BY ALL is very useful.

SELECT
             C.[CityName] AS 'City Name',
             COUNT(E.EmpID) AS 'Female Count'
FROM
             [DBO].[Employee] AS E WITH (NOLOCK) INNER JOIN [DBO].[City] AS C WITH (NOLOCK) ON E.[CityID] = C.[CityID]
WHERE
             E.[Sex] = 0
GROUP BY ALL
            C.[CityName]
ORDER BY
            C.[CityName]
GO

Output would be :














From the above result set, we can understand how GROUP BY ALL is used efficiently.

NOTE :

1. We cannot specify ALL with the CUBE or ROLLUP operators.
2. GROUP BY ALL is not supported in queries that access remote tables if there is also a WHERE clause in the query.

3 comments:

  1. Hi Siva,
    Very Good Article. And useful as well in many cases.

    Thanks
    Joseph

    ReplyDelete
  2. Thanks for the post Siva. It helped us to know clearly the difference between Group By and Group By All Clauses.

    ReplyDelete
  3. Good to learn.
    Exciting to implement..
    Eager to get output...

    Totally very interesting to check additional features of sql server.

    Thanks Siva. Keep updating our brains......

    ReplyDelete