GROUP BY clause
The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual rows.
When you specify a GROUP BY clause, SQL divides the selected rows into groups such that the rows of each group have matching values in one or more columns or expressions. Next, SQL processes each group to produce a single-row result for the group. You can specify one or more columns or expressions in the GROUP BY clause to group the rows. The items you specify in the SELECT statement are properties of each group of rows, not properties of individual rows in a table or view.
Without a GROUP BY clause, the application of SQL aggregate functions returns one row. When GROUP BY is used, the function is applied to each group, thereby returning as many rows as there are groups.
For example, the CORPDATA.EMPLOYEE table has several sets of rows, and each set consists of rows describing members of a specific department. To find the average salary of people in each department, you can issue:
SELECT WORKDEPT, DECIMAL (AVG(SALARY),5,0)
FROM CORPDATA.EMPLOYEE
GROUP BY WORKDEPT
The result is several rows, one for each department.
WORKDEPT | AVG-SALARY |
---|---|
A00 | 40850 |
B01 | 41250 |
C01 | 29722 |
D11 | 25147 |
D21 | 25668 |
E01 | 40175 |
E11 | 21020 |
E21 | 24086 |
- Grouping the rows does not mean ordering them. Grouping puts each selected row in a group, which SQL then processes to derive characteristics of the group. Ordering the rows puts all the rows in the results table in ascending or descending collating sequence. Depending on the implementation selected by the database manager, the resulting groups might appear to be ordered.
- If there are null values in the column you specify in the GROUP BY clause, a single-row result is produced for the data in the rows with null values.
- If the grouping occurs over character, or UCS-2 or UTF-16 graphic columns, the sort sequence in effect when the query is run is applied to the grouping.
When you use GROUP BY, you list the columns or expressions you want SQL to use to group the rows. For example, suppose that you want a list of the number of people working on each major project described in the CORPDATA.PROJECT table. You can issue:
SELECT SUM(PRSTAFF), MAJPROJ
FROM CORPDATA.PROJECT
GROUP BY MAJPROJ
The result is a list of the company's current major projects and the number of people working on each project.
SUM(PRSTAFF) | MAJPROJ |
---|---|
6 | AD3100 |
5 | AD3110 |
10 | MA2100 |
8 | MA2110 |
5 | OP1000 |
4 | OP2000 |
3 | OP2010 |
32.5 | ? |
You can also specify that you want the rows grouped by more than one column or expression. For example, you can issue a select statement to find the average salary for men and women in each department, using the CORPDATA.EMPLOYEE table. To do this, you can issue:
SELECT WORKDEPT, SEX, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES
FROM CORPDATA.EMPLOYEE
GROUP BY WORKDEPT, SEX
The result follows.
WORKDEPT | SEX | AVG_WAGES |
---|---|---|
A00 | F | 49625 |
A00 | M | 35000 |
B01 | M | 41250 |
C01 | F | 29722 |
D11 | F | 25817 |
D11 | M | 24764 |
D21 | F | 26933 |
D21 | M | 24720 |
E01 | M | 40175 |
E11 | F | 22810 |
E11 | M | 16545 |
E21 | F | 25370 |
E21 | M | 23830 |
Because you did not include a WHERE clause in this example, SQL examines and processes all rows in the CORPDATA.EMPLOYEE table. The rows are grouped first by department number and next (within each department) by sex before SQL derives the average SALARY value for each group.