Open
Description
Describe the bug
Using the PySpark benchmark in the repo, I am comparing logging and metrics for JVM vs native shuffle.
JVM shuffle spills 96 times:
26/01/15 13:48:46 INFO CometShuffleExternalSorter: Thread 98 spilling sort data of 512.0 MiB to disk (1 time so far)
26/01/15 13:48:49 INFO CometShuffleExternalSorter: Thread 82 spilling sort data of 512.0 MiB to disk (2 times so far)
26/01/15 13:48:49 INFO CometShuffleExternalSorter: Thread 95 spilling sort data of 512.0 MiB to disk (2 times so far)
26/01/15 13:48:49 INFO CometShuffleExternalSorter: Thread 104 spilling sort data of 512.0 MiB to disk (2 times so far)
26/01/15 13:48:49 INFO CometShuffleExternalSorter: Thread 106 spilling sort data of 512.0 MiB to disk (2 times so far)
...
Native shuffle spills 32 times:
26/01/15 15:42:36 INFO core/src/execution/shuffle/shuffle_writer.rs: ShuffleRepartitioner spilling shuffle data of 532719016 to disk while inserting (0 time(s) so far)
26/01/15 15:42:36 INFO core/src/execution/shuffle/shuffle_writer.rs: ShuffleRepartitioner spilling shuffle data of 532094760 to disk while inserting (0 time(s) so far)
26/01/15 15:42:36 INFO core/src/execution/shuffle/shuffle_writer.rs: ShuffleRepartitioner spilling shuffle data of 532772904 to disk while inserting (0 time(s) so far)
26/01/15 15:42:37 INFO core/src/execution/shuffle/shuffle_writer.rs: ShuffleRepartitioner spilling shuffle data of 532772904 to disk while inserting (0 time(s) so far)
26/01/15 15:42:37 INFO core/src/execution/shuffle/shuffle_writer.rs: ShuffleRepartitioner spilling shuffle data of 532719208 to disk while inserting (0 time(s) so far)
...
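To make the two excerpts easier to compare, here is a minimal sketch that counts spill messages per shuffle implementation and totals the spilled size. The regular expressions are derived only from the log lines quoted above, the function name `summarize_spills` is hypothetical, and treating the native size as a byte count is an assumption (the native message does not print a unit).

```python
import re

# Patterns derived from the two log formats quoted in this issue.
JVM_SPILL = re.compile(
    r"CometShuffleExternalSorter: Thread \d+ spilling sort data of ([\d.]+) MiB to disk"
)
NATIVE_SPILL = re.compile(
    r"ShuffleRepartitioner spilling shuffle data of (\d+) to disk"
)

BYTES_PER_MIB = 1024 * 1024


def summarize_spills(lines):
    """Return {"jvm": (count, total_mib), "native": (count, total_mib)}."""
    counts = {"jvm": 0, "native": 0}
    totals = {"jvm": 0.0, "native": 0.0}
    for line in lines:
        match = JVM_SPILL.search(line)
        if match:
            counts["jvm"] += 1
            totals["jvm"] += float(match.group(1))
            continue
        match = NATIVE_SPILL.search(line)
        if match:
            counts["native"] += 1
            # Assumption: the native message reports a raw byte count.
            totals["native"] += int(match.group(1)) / BYTES_PER_MIB
    return {name: (counts[name], totals[name]) for name in counts}


if __name__ == "__main__":
    import sys

    # Usage: python summarize_spills.py < executor.log
    for name, (count, total_mib) in summarize_spills(sys.stdin).items():
        print(f"{name}: {count} spills, {total_mib:.1f} MiB total")
```

Under the byte-count assumption, 532719016 is roughly 508 MiB, so each native spill is close in size to the 512.0 MiB JVM spills.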
Steps to reproduce
No response
Expected behavior
No response
Additional context
No response
Metadata
Assignees
Labels
bug (Something isn't working)