Auto select CAGRA build algorithm for hnsw::build #1719
Conversation
Force-pushed from bb78635 to 23a0b16.
```cpp
size_t total_host =
  graph_host_mem + host_workspace_size + 2e9;  // added 2 GB extra workspace (IVF-PQ search)
size_t total_dev =
  std::max(dataset_gpu_mem, gpu_workspace_size) + 1e9;  // addet 1 GB extra workspace size
```
This is still optimistic, we need to update and test before the PR can be merged.
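For context, a minimal sketch of the selection rule these totals feed into, per the PR description (the names here are illustrative, not the actual cuVS API):

```cpp
#include <cstddef>

enum class build_algo { IN_MEMORY, ACE };

// Choose the in-memory CAGRA build only if both the host and device
// estimates fit into what is available; otherwise fall back to the
// disk-based ACE build.
inline build_algo select_build_algo(std::size_t total_host,
                                    std::size_t total_dev,
                                    std::size_t avail_host,
                                    std::size_t avail_dev)
{
  return (total_host <= avail_host && total_dev <= avail_dev)
           ? build_algo::IN_MEMORY
           : build_algo::ACE;
}
```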
```cpp
hnsw::index_params params;
params.M               = 24;
params.ef_construction = 200;
params.hierarchy       = cuvs::neighbors::hnsw::HnswHierarchy::GPU;
```
We should make this the default: #1617
```cpp
int64_t topk = 12;

// HNSW index parameters
hnsw::index_params params;
```
We need to figure out how to handle `ace.build_dir` in this setup: the user does not set `ace_params`, and the algorithm is selected automatically. But if we happen to choose ACE, we need to know which disk location to use. Do you have suggestions @julianmi?
I think this is a somewhat challenging problem. In general, I agree with @benfred's comment that hardcoding it is not a good approach.
I think the two most important properties are available disk space and speed. How about a layered approach:
1. Environment variable, e.g. `CUVS_ACE_BUILD_DIR`.
2. We could read `/sys/block` on Linux to query for fast disks (NVMe > SSD > HDD) and check for free space.
3. Fall back to `/tmp` or a generated temporary path like @benfred suggested.
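A rough sketch of tiers (1) and (3) only; `CUVS_ACE_BUILD_DIR` is the variable proposed above, everything else is illustrative:

```cpp
#include <cstdlib>
#include <filesystem>
#include <string>

// Resolve the ACE build directory: honor the proposed CUVS_ACE_BUILD_DIR
// environment variable first, then fall back to a generated path under
// the system temp directory. Tier (2), probing /sys/block for fast
// disks, is intentionally omitted here.
inline std::string resolve_ace_build_dir()
{
  if (const char* env = std::getenv("CUVS_ACE_BUILD_DIR")) { return env; }
  auto fallback = std::filesystem::temp_directory_path() / "cuvs_ace_build";
  std::filesystem::create_directories(fallback);
  return fallback.string();
}
```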
I agree with a layered approach using only (1) and (3), which are both locations the user has agreed to write to -- I don't think we can just write to a drive without user consent.
I agree that (2) is problematic. But (3) has similar problems, right? `/tmp` could be very small or on a slow disk.
Another approach would be to fail with a helpful message stating that, given the graph size, a disk will be used and that the user should provide a suitable directory.
In general, I am not sure environment variables are a good approach, since the project does not use them much. @tfeher, @cjnolet What do you think?
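A sketch of that failure mode using RAFT's existing `RAFT_EXPECTS` macro; the condition, the message wording, and the `ace_params.build_dir` check are all illustrative:

```cpp
// Illustrative only: when the disk-based ACE path is selected and no
// directory was provided, fail early with an actionable message instead
// of silently picking a location.
RAFT_EXPECTS(!ace_params.build_dir.empty(),
             "The graph does not fit in memory, so the build will use disk. "
             "Please set ace_params.build_dir to a directory with sufficient "
             "free space.");
```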
mfoerste4 left a comment
I did not go over all memory estimates in detail, but I suggest aligning the predictions with real data.
Is autotuning of the ACE params part of a different PR? Besides the open question on the file location, we might want to at least set the number of partitions dynamically.
```cpp
    raft::make_host_matrix_view<const T, int64_t>(dataset, nrow, this->dim_));
}

auto dataset_view = raft::make_host_matrix_view<const T, int64_t>(dataset, nrow, this->dim_);
```
Is the data expected to always reside in host memory?
ACE only supports host memory right now. The main reason is that we expect the data to be large and memory-mapped. Further, we do the partitioning and reordering on the host, since there is no benefit to moving the data to the GPU only to write it to disk afterwards.
Anyway, I think we can support device datasets easily, since those should not end up using ACE with this heuristic. @tfeher What do you think?
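For illustration, a memory-mapped dataset can be wrapped the same way as in the diff above. This sketch uses POSIX `mmap`, assumes row-major float32 data, and omits error handling:

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>

#include <raft/core/host_mdspan.hpp>

// Map a raw float32 dataset file and expose it as a host matrix view,
// matching the host-memory expectation described above.
inline auto map_dataset(const char* path, int64_t n_rows, int64_t dim)
{
  int fd             = ::open(path, O_RDONLY);
  std::size_t nbytes = sizeof(float) * n_rows * dim;
  auto* data         = static_cast<const float*>(
    ::mmap(nullptr, nbytes, PROT_READ, MAP_PRIVATE, fd, 0));
  return raft::make_host_matrix_view<const float, int64_t>(data, n_rows, dim);
}
```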
```cpp
namespace helpers {
/** Calculates the workspace for graph optimization
 *
 * @param[in] n_rows number of rows in the dataset (or number of points in the grapt)
```
Suggested change:
```diff
- * @param[in] n_rows number of rows in the dataset (or number of points in the grapt)
+ * @param[in] n_rows number of rows in the dataset (or number of points in the graph)
```
Also, `mst_optimize` is not documented.
```cpp
// ACE build and search example.
cagra_build_search_ace(res);
```
Maybe we want to rename this to something generic now that the selection is hidden from the user.
```cpp
// Configure ACE parameters for CAGRA
cuvs::neighbors::cagra::graph_build_params::ace_params cagra_ace_params;
cagra_ace_params.npartitions        = ace_params.npartitions;
cagra_ace_params.ef_construction    = params.ef_construction;
cagra_ace_params.build_dir          = ace_params.build_dir;
cagra_ace_params.use_disk           = ace_params.use_disk;
cagra_ace_params.max_host_memory_gb = ace_params.max_host_memory_gb;
cagra_ace_params.max_gpu_memory_gb  = ace_params.max_gpu_memory_gb;
cagra_params.graph_build_params     = cagra_ace_params;
```
Are you planning to add a heuristic for npartitions depending on dimensions here as well?
We have added heuristics in #1603.
julianmi left a comment
I did not get a chance to fully review the memory heuristics yet. I wonder how we can test them, though. Should `max_host_memory_gb` and `max_gpu_memory_gb` be optional HNSW parameters that we could use to test that the expected algorithm is chosen for a given set of memory limits?
```cpp
constexpr static uint32_t kIndexGroupSize   = 32;
constexpr static uint32_t kIndexGroupVecLen = 16;

std::cout << "pq_dim " << params.pq_dim << ", pq_bits " << params.pq_bits << ", n_lists"
```
Is there a specific reason not to use RAFT_LOG_INFO here and in the following lines?
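For reference, the equivalent with RAFT's logger, using printf-style formatting as in the `RAFT_LOG_INFO` calls elsewhere in this PR (the field types are assumed to be `uint32_t`):

```cpp
// Suggested replacement for the std::cout diagnostics above.
RAFT_LOG_INFO("pq_dim %u, pq_bits %u, n_lists %u",
              params.pq_dim, params.pq_bits, params.n_lists);
```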
```cpp
  }
}

inline std::pair<size_t, size_t> get_available_memory(
```
Should this helper be placed in cuvs::util:: like get_free_host_memory?
```diff
  }();

- RAFT_LOG_DEBUG("# Building IVF-PQ index %s", model_name.c_str());
+ RAFT_LOG_INFO("# Building IVF-PQ index %s", model_name.c_str());
```
Were this and the following logging changes intentional? Logging every 10 seconds might write a lot of output on a large run.
```cpp
size_t total_dev =
  std::max(dataset_gpu_mem, gpu_workspace_size) + 1e9;  // addet 1 GB extra workspace size

std::cout << "IVF-PQ build memory requirements\ndataset_gpu " << dataset_gpu_mem / 1e9 << " GB"
```
Similar comment about using RAFT_LOG_INFO.
```cpp
size_t total_host =
  graph_host_mem + host_workspace_size + 2e9;  // added 2 GB extra workspace (IVF-PQ search)
size_t total_dev =
  std::max(dataset_gpu_mem, gpu_workspace_size) + 1e9;  // addet 1 GB extra workspace size
```
Suggested change:
```diff
-   std::max(dataset_gpu_mem, gpu_workspace_size) + 1e9;  // addet 1 GB extra workspace size
+   std::max(dataset_gpu_mem, gpu_workspace_size) + 1e9;  // added 1 GB extra workspace size
```
```cpp
 *
 * @param[in] res raft resource
 * @param[in] dataset shape of the dataset
 * @param[in] param ivf-pq compression pramas
```
Suggested change:
```diff
- * @param[in] param ivf-pq compression pramas
+ * @param[in] param ivf-pq compression params
```
Configuring an HNSW graph build using CAGRA is complicated because CAGRA offers multiple build algorithms. This PR implements automatic algorithm selection. The goal is a simplified API where the user sets only the two parameters that control graph size and quality (`M` and `ef_construction`, respectively). This should be familiar to HNSW users and allows easier adoption of cuVS-accelerated HNSW graph building.

If we have enough memory (host and GPU) to do both the KNN graph building and the graph optimization in memory, we choose the in-memory build and let `cagra::index_params::from_hnsw_params` derive the additional configuration parameters. If the build would require more memory than is available, we choose the ACE method and derive the number of partitions using the heuristics from #1603.

For the host we query the OS for available memory; for the GPU, the whole device memory is assumed to be available.
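A usage sketch of the simplified API as described; the parameter names follow the diffs above, and the exact `hnsw::build` signature may differ in the merged version:

```cpp
#include <cuvs/neighbors/hnsw.hpp>

#include <raft/core/host_mdspan.hpp>
#include <raft/core/resources.hpp>

// The user sets only the two HNSW-familiar knobs; the build algorithm
// (in-memory CAGRA vs. disk-based ACE) is selected automatically from
// the memory estimates.
void build_hnsw_auto(raft::resources const& res,
                     const float* dataset,
                     int64_t n_rows,
                     int64_t dim)
{
  cuvs::neighbors::hnsw::index_params params;
  params.M               = 24;   // graph degree
  params.ef_construction = 200;  // build-time candidate list size
  params.hierarchy       = cuvs::neighbors::hnsw::HnswHierarchy::GPU;

  auto dataset_view =
    raft::make_host_matrix_view<const float, int64_t>(dataset, n_rows, dim);
  auto index = cuvs::neighbors::hnsw::build(res, params, dataset_view);
  (void)index;  // search etc. omitted
}
```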