
Conversation

@andygrove
Member

Summary

  • Fix CometBroadcastExchangeExec to properly use AQE's coalesced shuffle partition specs
  • When AQEShuffleReadExec wraps a CometShuffleExchangeExec, use getShuffleRDD(partitionSpecs) to get the coalesced RDD instead of bypassing AQE

Problem

When AQE (Adaptive Query Execution) coalesces shuffle partitions based on runtime statistics, CometBroadcastExchangeExec was bypassing this optimization by reading directly from the inner shuffle plan. This caused broadcast exchanges to spawn many tasks (e.g., 200) even when AQE determined that only 1 task was needed due to small data volume after filtering.

For example, in TPC-H Q18, after the filter sum(l_quantity) > 313, there are only ~900 rows to broadcast. AQE correctly identifies this and coalesces 200 shuffle partitions into 1. However, Comet's broadcast exchange was ignoring this and still reading from all 200 original partitions.
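For reference, the plan shape in question looks roughly like the fragment below. The rendering is illustrative (operator labels vary by Spark/Comet version); the point is that the AQE read node sits between the broadcast and the Comet shuffle:

```
CometBroadcastExchange
+- AQEShuffleRead coalesced        <- carries AQE's coalesced partition specs
   +- ShuffleQueryStage
      +- CometExchange ...         <- CometShuffleExchangeExec
```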

Solution

Added proper handling for AQEShuffleReadExec wrapping CometShuffleExchangeExec (see the sketch after this list):

  1. Extract partition specs from AQEShuffleReadExec
  2. Use CometShuffleExchangeExec.getShuffleRDD(partitionSpecs) to get the coalesced RDD
  3. Serialize batches from the coalesced RDD for broadcasting
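As a rough sketch (not the exact Comet code), the dispatch could look like the following. The class names and getShuffleRDD come from the description above; the helper name broadcastInputRDD, the package imports, and the fallback branch are illustrative assumptions:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.comet.execution.shuffle.CometShuffleExchangeExec
import org.apache.spark.sql.execution.SparkPlan
import org.apache.spark.sql.execution.adaptive.{AQEShuffleReadExec, ShuffleQueryStageExec}
import org.apache.spark.sql.vectorized.ColumnarBatch

// Choose the RDD that feeds the broadcast. When AQE has inserted an
// AQEShuffleReadExec over the Comet shuffle stage, ask the shuffle for an RDD
// built from the coalesced partition specs instead of reading every original
// shuffle partition.
def broadcastInputRDD(child: SparkPlan): RDD[ColumnarBatch] = child match {
  case read @ AQEShuffleReadExec(stage: ShuffleQueryStageExec, _)
      if stage.shuffle.isInstanceOf[CometShuffleExchangeExec] =>
    stage.shuffle
      .asInstanceOf[CometShuffleExchangeExec]
      .getShuffleRDD(read.partitionSpecs.toArray) // honors AQE coalescing
      .asInstanceOf[RDD[ColumnarBatch]]
  case other =>
    // No AQE read in between: use the child's columnar output as before.
    other.executeColumnar()
}
```

The batches from this RDD are then serialized for broadcasting exactly as in step 3 above.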

Test plan

  • Existing unit tests pass (CometExecSuite, CometShuffleSuite)
  • Verified with TPC-H Q18: broadcast stages now use 1 task instead of 200 when AQE coalesces partitions

🤖 Generated with Claude Code

…coalescing

When AQE (Adaptive Query Execution) coalesces shuffle partitions based on
runtime statistics, CometBroadcastExchangeExec was bypassing this optimization
by reading directly from the inner shuffle plan instead of using the coalesced
partition specs from AQEShuffleReadExec.

This caused broadcast exchanges to spawn many tasks (e.g., 200) even when AQE
determined that only 1 task was needed due to small data volume after filtering.

The fix adds proper handling for AQEShuffleReadExec wrapping CometShuffleExchangeExec:
- Extract partition specs from AQEShuffleReadExec
- Use CometShuffleExchangeExec.getShuffleRDD(partitionSpecs) to get the coalesced RDD
- Serialize batches from the coalesced RDD for broadcasting

This ensures broadcast exchanges respect AQE's partition coalescing decisions,
reducing unnecessary task scheduling overhead.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@andygrove andygrove marked this pull request as draft January 21, 2026 19:43
@andygrove andygrove marked this pull request as ready for review January 21, 2026 19:46
@codecov-commenter

codecov-commenter commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 46.15385% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.01%. Comparing base (f09f8af) to head (ed7b8e0).
⚠️ Report is 864 commits behind head on main.

Files with missing lines                                Patch %   Lines
...e/spark/sql/comet/CometBroadcastExchangeExec.scala   46.15%    4 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3235      +/-   ##
============================================
+ Coverage     56.12%   60.01%   +3.89%     
- Complexity      976     1426     +450     
============================================
  Files           119      170      +51     
  Lines         11743    15787    +4044     
  Branches       2251     2611     +360     
============================================
+ Hits           6591     9475    +2884     
- Misses         4012     4989     +977     
- Partials       1140     1323     +183     


@mbutrovich
Contributor

Should we run TPC-H benchmarks on this one?

@mbutrovich mbutrovich self-requested a review January 22, 2026 16:55
@andygrove andygrove added this to the 0.13.0 milestone Jan 22, 2026