Details
Description
I'm getting some strange results from a specific pattern of queries related to finding a count of columns grouped in two dimensions.
The following query works as I would expect:
select e.PART_NAME as ENGINE, b.PART_NAME as BODY, count(distinct m.ID) as NUM_MODELS from CARS.MODELS m
left outer join CARS.PART_NAMES e on m.ENGINE_ID=e.ID
left outer join CARS.PART_NAMES b on m.BODY_ID=b.ID
group by e.PART_NAME, b.PART_NAME
This returns something like:
ENGINE BODY NUM_MODELS
electric compact 1
gas compact 2
gas sedan 1
gas truck 2
hybrid compact 1
hybrid sedan 2
So this fictitious car company sells 2 different hybrid sedans, one gas sedan etc.
If I add a filter to the query that should not actually change the output, I see output that doesn't make sense.
This query filters out any car whose top speed is less than 50 (and all cars have a top speed higher than this):
select e.PART_NAME as ENGINE, b.PART_NAME as BODY, count(distinct m.ID) as NUM_MODELS from CARS.MODELS m
left outer join CARS.PART_NAMES e on m.ENGINE_ID=e.ID
left outer join CARS.PART_NAMES b on m.BODY_ID=b.ID
left outer join CARS.TOP_SPEED s on m.ID=s.ID
where s.SPEED>50
group by e.PART_NAME, b.PART_NAME
The results show the wrong values in column 2:
ENGINE BODY NUM_MODELS
electric electric 1
gas gas 2
gas gas 1
gas gas 2
hybrid hybrid 1
hybrid hybrid 2
I've tried the same query on DB2 with the same data and I get the results that I expect – that is, both queries return the same result that I showed on the first query here.
I'll attach a script that creates a database with the sample data used above.