IBM Support

MAX and MIN built-in functions

Question & Answer


Question

SUBSELECT with MAX versus MIN built-in function return different number of rows. The result can appear to be incorrect output, but it is not. The result is due to the data-dependency of the MAX and MIN built-in functions.

Answer

Example 1

Take the following simplified example:
You have a table: T1( C1 INT, C2 INT, C3 CHAR(1) )
with the following data:

                                                
C1    C2    C3
1,    1,    A
2,    1,    A
2,    2,    B
  • Query 1:
    SELECT COUNT(DISTINCT C1)                                     
       FROM T1                                          
       WHERE  C3='A'

    which will return 2 rows
  • Query 2:
    SELECT  COUNT(*)                                              
       FROM   T1 A                                          
       WHERE  C3 = 'A'   AND                                
       C2 =  ( SELECT MAX( X.C2 )                            
       FROM T1 X  WHERE  A.C1 = X.C1 )  
       
    which will return 0 (zero) rows because only rows #1 and #2 will qualify C3 = 'A', and for the row #2, C2 is not equal to MAX(C2) (which is 2).
  • Query 3:
    SELECT COUNT(*)                                           
       FROM   T1 A                                          
       WHERE  C3 = 'A'   AND                                
       C2 =  ( SELECT MIN( X.C2 )                            
       FROM T1 X  WHERE  A.C1 = X.C1 )

    which will return 2 rows

Example 2

The following example is a user's example that we do not have the complete data to:


    SELECT COUNT(*)  
       FROM TBP11111 A  
       WHERE A.DME =(SELECT MAX(X.DME) FROM TBP28414 X                    
         WHERE A.SYS=X.SYS)  


The output is:

15113    
DSNE610I NUMBER OF ROWS DISPLAYED IS 1                                  
 SELECT COUNT(*)
 FROM ANNA.
TBP A                                                  
 WHERE A.DME =(SELECT MAX(X.DME) FROM DWADM.FLTTEM6 X              
 WHERE A.SYS=X.SYS)                                          
 And obtains 15111 rows
                                       

Tables TBP11111 and ANNA.TBP are the same table, and is the table created from table ANNA.FLTTEM6, selecting only two columns (SYS and DME).


The difference in the query is that customer in the subquery uses the original table ANNA.FLTTEM6.



SELECT COUNT(*)                                                                
FROM ANNA.FLTTEM6 A                                                      
WHERE STR IN ('N' , 'D' , 'A' , 'T') AND                                      
A.DME = (SELECT MAX(X.DME)                                                    
FROM ANNA.FLTTEM6 X WHERE A.SYS=X.SYS)                                  
---------+---------+---------+---------+
DSNE610I NUMBER OF ROWS DISPLAYED IS 1  


If in the subquery you change MAX with MIN the query returns a bigger number of rows:

                                          
SELECT COUNT(*)                                                  
FROM ANNA.FLTTEM6 A                                                      
WHERE STR IN ('N' , 'D' , 'A' , 'T') AND                                      
A.DME = (SELECT MIN(X.DME)                                                    
FROM ANNA.FLTTEM6 X WHERE A.SYS=X.SYS)              

---------+---------+---------+---------+
DSNE610I NUMBER OF ROWS DISPLAYED IS 1                                        
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100

As you can see, the number of rows returned is different, but the user expects to have the same number of rows. To understand which of the above output is correct, the user supplied the following query:

SELECT COUNT(DISTINCT SYS)                                           
FROM ANNA.FLTTEM6                                                        
WHERE STR IN ('N' , 'D' , 'A' , 'T')                                          
---------+---------+---------+---------+

     
Looking at this, you see that the query with MIN in the subquery returns the correct number of row, while the query with MAX returns an incorrect output.

To make the query run, use the following command:

RUNSTATS TABLESPACE DBGERRI.TSP11111 TABLE(ANNA.FLTTEM6)                
COLUMN(ALL) INDEX(ALL)


If you look at the data and use different queries you will see that the failing case is exactly the same one. The user has some rows with the same SYS, but only some of them are qualified with 'IN' predicate and the ones that do or do not qualify the MAX(C2) predicate.

Depending on the nature of the application and what you want to produce, you could use the same predicate in the subquery, or just expect that the MAX and MIN function results will be different depending on the data.

The example provided below, can help solve this query problem:

                                                            
SELECT COUNT(*)                                            
    FROM ANNA.FLTTEM6 A                                      
    WHERE STR IN ('N' , 'D' , 'A' , 'T')  AND                      
    A.DME = (SELECT MAX(X.DME)                                      
    FROM ANNA.FLTTEM6 X                              
    WHERE A.SYS = X.SYS AND                                
    STR IN ('N' , 'D' , 'A' , 'T')  )

[{"Product":{"code":"SSEPEK","label":"Db2 for z\/OS"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"RDS","Platform":[{"code":"PF035","label":"z\/OS"}],"Version":"7.0;8.0","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg21106580