@miland-db

What changes were proposed in this pull request?

Added a null-safe findPivotIndex lookup method that returns -1 for null keys on the TreeMap path, since null can never be a valid TreeMap key. The HashMap path (atomic types) is unchanged, as it handles null keys safely and allows null as a valid pivot value.
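
A minimal sketch of the lookup described above, not the actual PivotFirst code. The key type (Seq[Int]), the ordering, the map contents, and the object name are hypothetical stand-ins; only the name findPivotIndex and the -1 sentinel come from this description.

```scala
import scala.collection.immutable.{HashMap, TreeMap}

object PivotLookupSketch {
  // Hypothetical comparison-based ordering for a non-atomic key type.
  // It dereferences both arguments, so it cannot accept null.
  val seqOrdering: Ordering[Seq[Int]] = new Ordering[Seq[Int]] {
    def compare(a: Seq[Int], b: Seq[Int]): Int = {
      val firstDiff = a.zip(b).map { case (x, y) => Integer.compare(x, y) }.find(_ != 0)
      firstDiff.getOrElse(Integer.compare(a.length, b.length))
    }
  }

  // TreeMap path: pivot value -> output index (illustrative values).
  val treeIndex: TreeMap[Seq[Int], Int] =
    TreeMap(Seq(1, 2) -> 0, Seq(3) -> 1)(seqOrdering)

  // Null can never be a key here, so answer -1 without ever calling compare.
  def findPivotIndex(key: Seq[Int]): Int =
    if (key == null) -1 else treeIndex.getOrElse(key, -1)

  // HashMap path: a null key is legal and may be a real pivot value.
  val hashIndex: HashMap[Any, Int] = HashMap[Any, Int]("a" -> 0, (null, 1))
}
```

In this sketch, findPivotIndex(null) short-circuits before the ordering is consulted, while the HashMap lookup needs no guard because hashing and equality on a null key are well defined.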

Why are the changes needed?

When a PIVOT query uses a non-atomic pivot column (such as a struct or array), PivotFirst stores the pivot values in a TreeMap with a comparison-based ordering. If the pivot column contains null values (e.g., from a null group produced by GROUP BY), the TreeMap.getOrElse lookup calls compare(null, existingKey), which throws a NullPointerException.
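
The failure mode can be reproduced outside Spark with a plain Scala TreeMap. This is an illustrative sketch under assumptions: the String keys, the ordering, and the object name are hypothetical and stand in for the real pivot values and their ordering.

```scala
import scala.collection.immutable.TreeMap
import scala.util.Try

object TreeMapNullLookupDemo {
  // Hypothetical ordering that calls a method on its first argument,
  // as a comparison-based ordering for complex values typically would.
  val ordering: Ordering[String] = new Ordering[String] {
    def compare(a: String, b: String): Int = a.compareTo(b)
  }

  val index: TreeMap[String, Int] = TreeMap("x" -> 0, "y" -> 1)(ordering)

  // The lookup walks the tree and compares the probe key with stored keys,
  // so a null probe reaches compare and fails there.
  val nullLookup: Try[Int] = Try(index.getOrElse(null, -1))

  def main(args: Array[String]): Unit =
    println(nullLookup) // a Failure wrapping a NullPointerException
}
```

The default value passed to getOrElse is never reached, because the exception is thrown during the key comparison rather than after a failed lookup.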

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added a unit test in DataFramePivotSuite.scala.

Was this patch authored or co-authored using generative AI tooling?

No.

@miland-db changed the title from "Init commit" to "[SPARK-55483] Fix NPE in PivotFirst when pivot column is a non-atomic type with null values" on Feb 11, 2026