Skip to content

Conversation

@julianmi
Copy link
Contributor

@julianmi julianmi commented Dec 2, 2025

This PR adds automatic partition count derivation for ACE (Augmented Core Extraction) graph builds based on available system memory.

Previously, users had to manually calculate and specify the number of partitions based on their dataset size and available memory. This was error-prone and required understanding the internal memory requirements of the ACE algorithm. With auto-derivation, users get optimal partitioning out of the box while still having the option to override if needed.

Changes

  • When npartitions is set to 0 (new default), ACE automatically calculates the optimal number of partitions based on available host and GPU memory.
  • When npartitions is set to a positive value, the specified count is used but may be automatically increased if it would exceed memory limits.
  • Added max_host_memory_gb and max_gpu_memory_gb parameters to allow users to constrain memory usage (useful for shared systems or testing).

This builds on top of PR #1597, which should be merged first.

- Added `cuvsHnswAceParams` structure for ACE configuration.
- Implemented `cuvsHnswBuild` function to facilitate index construction using ACE.
- Updated HNSW index parameters to include ACE settings.
- Created new tests for HNSW index building and searching using ACE.
- Updated documentation to reflect the new ACE parameters and usage.
- Add heuristic to automatically derive the number of partitions based on host and device memory requirements.
- Increase the user-profided `npartitions` if it does not fit memory.
- Introduced `max_host_memory_gb` and `max_gpu_memory_gb` fields to `cuvsAceParams` and `cuvsHnswAceParams` structures for controlling memory usage during ACE builds.
- Added tests to verify that small memory limits trigger disk mode correctly for both CAGRA and HNSW index builds.
@copy-pr-bot
Copy link

copy-pr-bot bot commented Dec 2, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

- Renamed parameter `m` to `M` in HNSW structures and related functions for consistency.
- Removed `ef_construction` from `cuvsHnswAceParams` and related classes, as it is no longer needed.
- Load the HNSW index from file before search if needed.
@julianmi julianmi marked this pull request as ready for review December 3, 2025 15:37
@julianmi julianmi requested review from a team as code owners December 3, 2025 15:37
Copy link
Member

@KyleFromNVIDIA KyleFromNVIDIA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved CMake changes

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Julian for the PR, it is great to have the n_partition parameter automatically determined. I have reviewed C/C++/Python changes. I have suggestions for the formulas used for memory estimation, but in general the PR is in a good shape.

@tfeher
Copy link
Contributor

tfeher commented Jan 16, 2026

Thanks Julian for the update, I had one more comment for the workspace size estimate, otherwise the code looks good.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Julian for the updates! Although the memory size estimates could be still improved, I believe it is better to tweak them in a follow up PR. The rest of the PR look good to me.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My last review meant to approve the PR.

Copy link
Member

@benfred benfred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python/java changes look good!

npartitions=1,
npartitions=0,
ef_construction=120,
build_dir="/tmp/ace_build",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to this PR: but I don't think we should be hardcoding the build_dir like this, and instead should be using something like tempfile.TemporaryDirectory . The user might not have /tmp be a valid directory, and this could cause issues with concurrency when two programs both try to access the same directory.

@tfeher
Copy link
Contributor

tfeher commented Jan 21, 2026

/merge

@rapids-bot rapids-bot bot merged commit efe7c97 into rapidsai:main Jan 21, 2026
189 of 193 checks passed
lowener pushed a commit to lowener/cuvs that referenced this pull request Jan 23, 2026
This PR adds automatic partition count derivation for ACE (Augmented Core Extraction) graph builds based on available system memory.

Previously, users had to manually calculate and specify the number of partitions based on their dataset size and available memory. This was error-prone and required understanding the internal memory requirements of the ACE algorithm. With auto-derivation, users get optimal partitioning out of the box while still having the option to override if needed.

**Changes**
- When `npartitions` is set to `0` (new default), ACE automatically calculates the optimal number of partitions based on available host and GPU memory.
- When `npartitions` is set to a positive value, the specified count is used but may be automatically increased if it would exceed memory limits.
- Added `max_host_memory_gb` and `max_gpu_memory_gb` parameters to allow users to constrain memory usage (useful for shared systems or testing).

This builds on top of PR rapidsai#1597, which should be merged first.

Authors:
  - Julian Miller (https://github.com/julianmi)
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Ben Frederickson (https://github.com/benfred)

URL: rapidsai#1603
lowener added a commit to lowener/cuvs that referenced this pull request Jan 23, 2026
lowener added a commit to lowener/cuvs that referenced this pull request Jan 23, 2026
lowener pushed a commit to lowener/cuvs that referenced this pull request Jan 26, 2026
This PR adds automatic partition count derivation for ACE (Augmented Core Extraction) graph builds based on available system memory.

Previously, users had to manually calculate and specify the number of partitions based on their dataset size and available memory. This was error-prone and required understanding the internal memory requirements of the ACE algorithm. With auto-derivation, users get optimal partitioning out of the box while still having the option to override if needed.

**Changes**
- When `npartitions` is set to `0` (new default), ACE automatically calculates the optimal number of partitions based on available host and GPU memory.
- When `npartitions` is set to a positive value, the specified count is used but may be automatically increased if it would exceed memory limits.
- Added `max_host_memory_gb` and `max_gpu_memory_gb` parameters to allow users to constrain memory usage (useful for shared systems or testing).

This builds on top of PR rapidsai#1597, which should be merged first.

Authors:
  - Julian Miller (https://github.com/julianmi)
  - Tamas Bela Feher (https://github.com/tfeher)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Ben Frederickson (https://github.com/benfred)

URL: rapidsai#1603
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Development

Successfully merging this pull request may close these issues.

5 participants