[QDP] Update benchmark_throughput to batch encoding #796

400Ping · 2026-01-05T11:23:31Z

Purpose of PR

Update the throughput benchmark to use batch encoding.

Related Issues or PRs

Closes #795

Changes Made

Breaking Changes

Yes
No

Checklist

Added or updated unit tests for all changes
Added or updated documentation for all changes
Successfully built and ran all unit tests or manual tests locally
PR title follows "MAHOUT-XXX: Brief Description" format (if related to an issue)
Code follows ASF guidelines

Signed-off-by: 400Ping <fourhundredping@gmail.com>

400Ping · 2026-01-05T12:27:04Z

Before (dev-qdp):

$ python ./qdp-python/benchmark/benchmark_throughput.py
Generating 12800 samples of 16 qubits...
  Batch size   : 64
  Vector length: 65536
  Batches      : 200
  Prefetch     : 16
  Frameworks   : pennylane, qiskit, mahout
  Generated 12800 samples
  PennyLane/Qiskit format: 6400.00 MB
  Mahout format: 6400.00 MB

======================================================================
DATALOADER THROUGHPUT BENCHMARK: 16 Qubits, 12800 Samples
======================================================================

[PennyLane] Full Pipeline (DataLoader -> GPU)...
/home/jay/work/mahout/qdp/./qdp-python/benchmark/benchmark_throughput.py:170: UserWarning: Casting complex values to real discards the imaginary part (Triggered internally at /pytorch/aten/src/ATen/native/Copy.cpp:309.)
  state_gpu = state_cpu.to("cuda", dtype=torch.float32)
  Total Time: 6.7562 s (1894.6 vectors/sec)

[Qiskit] Full Pipeline (DataLoader -> GPU)...

  Total Time: 848.2128 s (15.1 vectors/sec)

[Mahout] Full Pipeline (DataLoader -> GPU)...
  IO + Encode Time: 9.7979 s
  Total Time: 9.7979 s (1306.4 vectors/sec)

======================================================================
THROUGHPUT (Higher is Better)
Samples: 12800, Qubits: 16
======================================================================
PennyLane        1894.6 vectors/sec
Mahout           1306.4 vectors/sec
Qiskit             15.1 vectors/sec
----------------------------------------------------------------------
Speedup vs PennyLane:       0.69x
Speedup vs Qiskit:         86.57x

After:

$ python ./qdp-python/benchmark/benchmark_throughput.py
Generating 12800 samples of 16 qubits...
  Batch size   : 64
  Vector length: 65536
  Batches      : 200
  Prefetch     : 16
  Frameworks   : pennylane, qiskit, mahout
  Generated 12800 samples
  PennyLane/Qiskit format: 6400.00 MB
  Mahout format: 6400.00 MB

======================================================================
DATALOADER THROUGHPUT BENCHMARK: 16 Qubits, 12800 Samples
======================================================================

[PennyLane] Full Pipeline (DataLoader -> GPU)...
/home/jay/work/mahout/qdp/./qdp-python/benchmark/benchmark_throughput.py:169: UserWarning: Casting complex values to real discards the imaginary part (Triggered internally at /pytorch/aten/src/ATen/native/Copy.cpp:309.)
  state_gpu = state_cpu.to("cuda", dtype=torch.float32)
  Total Time: 6.7298 s (1902.0 vectors/sec)

[Qiskit] Full Pipeline (DataLoader -> GPU)...

  Total Time: 854.9839 s (15.0 vectors/sec)

[Mahout] Full Pipeline (DataLoader -> GPU)...
  IO + Encode Time: 3.6776 s
  Total Time: 3.6776 s (3480.5 vectors/sec)

======================================================================
THROUGHPUT (Higher is Better)
Samples: 12800, Qubits: 16
======================================================================
Mahout           3480.5 vectors/sec
PennyLane        1902.0 vectors/sec
Qiskit             15.0 vectors/sec
----------------------------------------------------------------------
Speedup vs PennyLane:       1.83x
Speedup vs Qiskit:        232.48x
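
The figures in the two runs above can be cross-checked with a few lines of Python. This is only a sketch under stated assumptions: it treats throughput as samples divided by total time and "speedup" as the ratio of total times (other framework over Mahout), and it assumes 8-byte elements for the size check. The names `throughput`, `speedup`, and `dataset_mib` are made up for illustration and are not taken from benchmark_throughput.py.

```python
# Cross-check of the numbers printed in the Before/After logs.
# Assumptions (not confirmed by the benchmark source):
#   - throughput = samples / total_time
#   - speedup    = other framework's total time / Mahout's total time
#   - each vector element occupies 8 bytes (e.g. float64)

SAMPLES = 12800
VECTOR_LENGTH = 65536          # 2**16 amplitudes for 16 qubits
BYTES_PER_ELEMENT = 8          # assumption, see note above

# Total times in seconds, copied from the logs.
before = {"pennylane": 6.7562, "qiskit": 848.2128, "mahout": 9.7979}
after = {"pennylane": 6.7298, "qiskit": 854.9839, "mahout": 3.6776}


def throughput(samples: int, seconds: float) -> float:
    """Vectors processed per second."""
    return samples / seconds


def speedup(times: dict, baseline: str, target: str = "mahout") -> float:
    """How many times faster `target` is than `baseline`."""
    return times[baseline] / times[target]


def dataset_mib(samples: int, length: int, bytes_per_element: int) -> float:
    """Size of the generated dataset in MiB (1 MiB = 1024 * 1024 bytes)."""
    return samples * length * bytes_per_element / (1024 * 1024)


if __name__ == "__main__":
    for label, times in (("before", before), ("after", after)):
        print(label)
        for name, seconds in times.items():
            print(f"  {name:<10} {throughput(SAMPLES, seconds):8.1f} vectors/sec")
        print(f"  vs PennyLane: {speedup(times, 'pennylane'):.2f}x")
        print(f"  vs Qiskit:    {speedup(times, 'qiskit'):.2f}x")

    # Mahout against itself: old total time over new total time.
    print(f"Mahout before/after: {before['mahout'] / after['mahout']:.2f}x")
    print(f"Dataset size: {dataset_mib(SAMPLES, VECTOR_LENGTH, BYTES_PER_ELEMENT):.2f} MiB")
```

Under these assumptions the script reproduces the logged throughputs and speedups, and Mahout's own improvement from this change comes out to roughly 2.66x (9.7979 s down to 3.6776 s). The PennyLane and Qiskit times move by about 1% or less between runs, which suggests the two runs are comparable.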

400Ping · 2026-01-05T12:27:26Z

cc @guan404ming @rich7420 @ryankert01

ryankert01

lg

* [Core] Update throughput benchmark to batch encoding
  Signed-off-by: 400Ping <fourhundredping@gmail.com>
* fix conflict
  Signed-off-by: 400Ping <fourhundredping@gmail.com>
---------
Signed-off-by: 400Ping <fourhundredping@gmail.com>

[Core] Update throughput benchmark to batch encoding

541a1ce

Signed-off-by: 400Ping <fourhundredping@gmail.com>

400Ping changed the title ~~[Core] Update throughput benchmark to batch encoding~~ [QDP] Update throughput benchmark to batch encoding Jan 5, 2026

400Ping changed the title ~~[QDP] Update throughput benchmark to batch encoding~~ [QDP] Update benchmark_throughput to batch encoding Jan 5, 2026

400Ping marked this pull request as draft January 5, 2026 11:32

400Ping added 2 commits January 5, 2026 19:36

fix conflict

2b755a3

Signed-off-by: 400Ping <fourhundredping@gmail.com>

Merge branch 'dev-qdp' into qdp/fix-throughput-benchmark

1b947da

400Ping marked this pull request as ready for review January 5, 2026 11:39

400Ping marked this pull request as draft January 5, 2026 12:10

400Ping marked this pull request as ready for review January 5, 2026 12:27

guan404ming approved these changes Jan 5, 2026

ryankert01 approved these changes Jan 5, 2026

guan404ming merged commit 581f5ee into apache:dev-qdp Jan 5, 2026
2 checks passed
