[SPARK-55461][SQL] Improve AQE Coalesce Grouping warning messages when numOfPartitions of ShuffleStages in the same coalesce group are not equal #54242
What changes were proposed in this pull request?
Spark's Adaptive Query Execution (AQE) framework coalesces eligible shuffle partitions (e.g. empty or small shuffle partitions) during query execution. This optimization requires creating coalesce groups for related shuffle stages (e.g. `SortMergeJoin` can have one `ShuffleQueryStage` per join leg) to guarantee that both SMJ legs have the same number of partitions. To create coalesce groups for related `ShuffleStages`, the Spark plan tree needs to be traversed to find `ShuffleQueryStages`. SPARK-46590 fixed an incorrect coalesce grouping problem by adding `BinaryExecNode` support to the SparkPlan tree traversal. This PR introduces the following complementary improvements on top of SPARK-46590:
1- Adding a warning log message to `ShufflePartitionsUtil.coalescePartitionsWithoutSkew()` when numOfPartitions of ShuffleStages in the same coalesce group are not equal. This is required for consistency, because `ShufflePartitionsUtil.coalescePartitionsWithSkew()` logs a warning message for the same case.
2- Adding the problematic shuffleStageIds to the warning messages when numOfPartitions of ShuffleStages in the same coalesce group are not equal. This info can help with troubleshooting.
3- Aligning the warning logs for both the `ShufflePartitionsUtil.coalescePartitionsWithoutSkew()` and `coalescePartitionsWithSkew()` cases.
4- 2 new UT cases are being added:
Current UT cases cover the following use cases:
This PR also adds the following new UT cases:
4.1- skewed SortMergeJoin under Union under BroadcastHashJoin,
4.2- non-skewed SortMergeJoin under Union under BroadcastHashJoin
5- `private def coalescePartitions()` needs to be renamed because Scala does not allow default values in more than one overloaded method. This causes the following Scala compile-time problem:
Why are the changes needed?
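The overload restriction behind the rename in item 5 can be reproduced in isolation. The sketch below is only an illustration: `planWithoutSkew`, `planWithSkew`, and their parameters are hypothetical stand-ins, not the signatures used in `ShufflePartitionsUtil`. It shows the shape that the Scala 2 compiler rejects (kept in a comment so the block compiles) and the renamed shape that it accepts.

```scala
object DefaultArgsSketch {
  // Hypothetical stand-ins; these are NOT Spark's real method signatures.
  //
  // If both methods shared one name, scalac would reject the pair, because
  // only one overloaded alternative of a method may define default arguments:
  //
  //   def plan(sizes: Seq[Long], minParts: Int = 1): Int = ...
  //   def plan(sizes: Seq[Long], skewed: Boolean, minParts: Int = 1): Int = ...
  //
  // Giving the methods distinct names removes the overload, so each one
  // can keep its own default value.
  def planWithoutSkew(sizes: Seq[Long], minParts: Int = 1): Int =
    math.max(minParts, sizes.count(_ > 0))

  def planWithSkew(sizes: Seq[Long], skewed: Boolean, minParts: Int = 1): Int =
    if (skewed) sizes.length else planWithoutSkew(sizes, minParts)
}
```

Callers can then rely on the default in either method, for example `DefaultArgsSketch.planWithoutSkew(Seq(0L, 5L, 0L))`.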
When `numOfPartitions` of `ShuffleStages` in the same coalesce group are not equal, `ShuffleStageIds` can also help with troubleshooting.
Does this PR introduce any user-facing change?
Yes, it adds a new warning message when `numOfPartitions` of `ShuffleStages` in the same coalesce group are not equal.
How was this patch tested?
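To picture the condition behind the new warning, here is a minimal standalone sketch. `StageInfo`, `mismatchWarning`, and the message wording are assumptions made for illustration; they are not Spark's actual code or log text. The idea is simply that a coalesce group is consistent only when every stage reports the same partition count, and that listing the stage IDs makes a mismatch easier to trace.

```scala
object CoalesceGroupCheckSketch {
  // Hypothetical model of one shuffle stage inside a coalesce group.
  final case class StageInfo(shuffleStageId: Int, numPartitions: Int)

  // Returns Some(warning text) when the stages disagree on partition count,
  // and None when the group is consistent (or has at most one stage).
  def mismatchWarning(group: Seq[StageInfo]): Option[String] = {
    val distinctCounts = group.map(_.numPartitions).distinct
    if (distinctCounts.size <= 1) {
      None
    } else {
      // Include every stage ID with its count so the offending stage is visible.
      val details = group
        .map(s => s"shuffleStageId=${s.shuffleStageId} numPartitions=${s.numPartitions}")
        .mkString(", ")
      // Illustrative wording only, not the message this PR emits.
      Some(s"Partition counts differ within one coalesce group: $details")
    }
  }
}
```

A group such as `Seq(StageInfo(0, 200), StageInfo(1, 100))` yields a warning that names both stages, while two stages with 200 partitions each yield `None`.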
Added 2 new UT cases for existing use-cases to test coalesce grouping logic such as:
Was this patch authored or co-authored using generative AI tooling?
No