[CALCITE-6451] Improve Nullability Derivation for Intersect and Minus #4897
xiedeyantu wants to merge 3 commits into apache:main
Co-authored-by: Victor Barua <victor.barua@datadoghq.com>

Related PR #3845.

you have some checker failures

Does this work around the problems in the other PR?
Are you referring to #3845? I noticed that you had approved this PR before, but there were some conflicts. Since it's been a long time, the CI status is no longer visible, and it's unclear if there were other issues back then. I think it's a good PR, so I’m trying to finish it up.

Yes, the discussion in JIRA was about causing problems with other rules.
I didn’t see any discussion in the Jira. Are you referring to the discussion in the original PR? I have resolved the rule conflicts.

yes, the original PR

According to this discussion: #3845 (comment).

@mihaibudiu I'm not sure if you agree with the current simplified processing logic. If you have time, please review this PR to see if there are any other concerns.



Jira Link
CALCITE-6451
Changes Proposed
SetOp overrides deriveRowType() and computes the output row type to be the least restrictive across all inputs. So, for example, given
Input 1: (I64, I64, I64?, I64?)
Input 2: (I64, I64?, I64, I64?)
where ? denotes nullable, the least restrictive output computes:
Output: (I64, I64?, I64?, I64?)
For UNION operations, these nullabilities are accurate.
However for MINUS and INTERSECT there is room for improvement.
MINUS only returns rows from the first input, so its output nullability should always match that of the first input:
Output: (I64, I64, I64?, I64?)
INTERSECT only returns rows that match across all inputs. If a column is non-nullable in at least one of the inputs, then it is non-nullable in the output, because a row in which that column is null cannot appear in every input:
Output: (I64, I64, I64, I64?)
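
To make the three rules concrete, here is a minimal, self-contained sketch in plain Java. It is not the code in this PR and does not use Calcite's type API: the class SetOpNullability and its methods are hypothetical, and each column is modelled only by a boolean flag (true means nullable), assuming all inputs have the same number of columns.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not Calcite API: a column is modelled only by its nullability flag. */
final class SetOpNullability {
  private SetOpNullability() {}

  /** UNION (least restrictive): a column is nullable if it is nullable in any input. */
  static boolean[] union(List<boolean[]> inputs) {
    boolean[] out = new boolean[inputs.get(0).length];
    for (boolean[] input : inputs) {
      for (int i = 0; i < out.length; i++) {
        out[i] |= input[i];
      }
    }
    return out;
  }

  /** MINUS: rows come only from the first input, so its nullability is copied. */
  static boolean[] minus(List<boolean[]> inputs) {
    return inputs.get(0).clone();
  }

  /** INTERSECT: a column is nullable only if it is nullable in every input. */
  static boolean[] intersect(List<boolean[]> inputs) {
    boolean[] out = new boolean[inputs.get(0).length];
    Arrays.fill(out, true);
    for (boolean[] input : inputs) {
      for (int i = 0; i < out.length; i++) {
        out[i] &= input[i];
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // The two inputs from the description above; true stands for "I64?".
    boolean[] input1 = {false, false, true, true};
    boolean[] input2 = {false, true, false, true};
    List<boolean[]> inputs = Arrays.asList(input1, input2);

    System.out.println("UNION:     " + Arrays.toString(union(inputs)));     // [false, true, true, true]
    System.out.println("MINUS:     " + Arrays.toString(minus(inputs)));     // [false, false, true, true]
    System.out.println("INTERSECT: " + Arrays.toString(intersect(inputs))); // [false, false, false, true]
  }
}
```

The printed flags line up with the three Output rows above: UNION takes a column-wise OR of the nullability flags, MINUS copies the first input, and INTERSECT takes a column-wise AND.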