Question & Answer
Question
SUBSELECT with MAX versus MIN built-in function return different number of rows. The result can appear to be incorrect output, but it is not. The result is due to the data-dependency of the MAX and MIN built-in functions.
Answer
Example 1
Take the following simplified example:
You have a table: T1( C1 INT, C2 INT, C3 CHAR(1) )
with the following data:
C1 C2 C3
1, 1, A
2, 1, A
2, 2, B
- Query 1:
SELECT COUNT(DISTINCT C1)
FROM T1
WHERE C3='A'
which will return 2 rows - Query 2:
SELECT COUNT(*)
FROM T1 A
WHERE C3 = 'A' AND
C2 = ( SELECT MAX( X.C2 )
FROM T1 X WHERE A.C1 = X.C1 )
which will return 0 (zero) rows because only rows #1 and #2 will qualify C3 = 'A', and for the row #2, C2 is not equal to MAX(C2) (which is 2). - Query 3:
SELECT COUNT(*)
FROM T1 A
WHERE C3 = 'A' AND
C2 = ( SELECT MIN( X.C2 )
FROM T1 X WHERE A.C1 = X.C1 )
which will return 2 rows
Example 2
The following example is a user's example that we do not have the complete data to:
SELECT COUNT(*)
FROM TBP11111 A
WHERE A.DME =(SELECT MAX(X.DME) FROM TBP28414 X
WHERE A.SYS=X.SYS)
The output is:
15113
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
SELECT COUNT(*)
FROM ANNA.TBP A
WHERE A.DME =(SELECT MAX(X.DME) FROM DWADM.FLTTEM6 X
WHERE A.SYS=X.SYS)
And obtains 15111 rows
Tables TBP11111 and ANNA.TBP are the same table, and is the table created from table ANNA.FLTTEM6, selecting only two columns (SYS and DME).
The difference in the query is that customer in the subquery uses the original table ANNA.FLTTEM6.
SELECT COUNT(*)
FROM ANNA.FLTTEM6 A
WHERE STR IN ('N' , 'D' , 'A' , 'T') AND
A.DME = (SELECT MAX(X.DME)
FROM ANNA.FLTTEM6 X WHERE A.SYS=X.SYS)
---------+---------+---------+---------+
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
If in the subquery you change MAX with MIN the query returns a bigger number of rows:
SELECT COUNT(*)
FROM ANNA.FLTTEM6 A
WHERE STR IN ('N' , 'D' , 'A' , 'T') AND
A.DME = (SELECT MIN(X.DME)
FROM ANNA.FLTTEM6 X WHERE A.SYS=X.SYS)
---------+---------+---------+---------+
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100
As you can see, the number of rows returned is different, but the user expects to have the same number of rows. To understand which of the above output is correct, the user supplied the following query:
SELECT COUNT(DISTINCT SYS)
FROM ANNA.FLTTEM6
WHERE STR IN ('N' , 'D' , 'A' , 'T')
---------+---------+---------+---------+
Looking at this, you see that the query with MIN in the subquery returns the correct number of row, while the query with MAX returns an incorrect output.
To make the query run, use the following command:
RUNSTATS TABLESPACE DBGERRI.TSP11111 TABLE(ANNA.FLTTEM6)
COLUMN(ALL) INDEX(ALL)
If you look at the data and use different queries you will see that the failing case is exactly the same one. The user has some rows with the same SYS, but only some of them are qualified with 'IN' predicate and the ones that do or do not qualify the MAX(C2) predicate.
Depending on the nature of the application and what you want to produce, you could use the same predicate in the subquery, or just expect that the MAX and MIN function results will be different depending on the data.
The example provided below, can help solve this query problem:
SELECT COUNT(*)
FROM ANNA.FLTTEM6 A
WHERE STR IN ('N' , 'D' , 'A' , 'T') AND
A.DME = (SELECT MAX(X.DME)
FROM ANNA.FLTTEM6 X
WHERE A.SYS = X.SYS AND
STR IN ('N' , 'D' , 'A' , 'T') )
Was this topic helpful?
Document Information
Modified date:
16 June 2018
UID
swg21106580