@alinah-amd alinah-amd commented Sep 23, 2025

Describe your changes

Overview

Added the EstimateNPULatency pass under olive/onnx/vitis_ai/estimate_latency.py.
EstimateNPULatency uses the NPU Perf Estimator tool to predict the computational performance of a workload given a set of parameters.

This is an analysis-only pass: it does not transform the graph at all and is used purely for performance analysis.

Installation

To install (if not installed through requirements.txt), run the following:
pip install [placeholder for wheel]

Confirm that the installed Python version is >= 3.10 for compatibility.

If the perf estimator package is not installed, the following warning is shown and the pass is simply bypassed:
[screenshot: warning message when the perf estimator package is missing]
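The bypass behavior can be sketched with a standard import guard. This is a minimal illustration, not the pass's actual code; the module name `perf_estimator` and the `run_pass` helper are assumptions:

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical import guard; the module name "perf_estimator" is an assumption.
try:
    import perf_estimator  # noqa: F401
    PERF_ESTIMATOR_AVAILABLE = True
except ImportError:
    PERF_ESTIMATOR_AVAILABLE = False


def run_pass(model):
    """Run latency estimation only when the tool is installed; otherwise bypass."""
    if not PERF_ESTIMATOR_AVAILABLE:
        logger.warning(
            "perf estimator package not installed; skipping EstimateNPULatency pass"
        )
        return model  # pass is bypassed, model is returned unchanged
    # ... invoke the estimator here ...
    return model
```

Because the pass is analysis-only, bypassing it leaves the model untouched either way.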

Usage

Inputs
EstimateNPULatency takes both the model, in the form of an OnnxModelHandler object, and optional parameters, in the form of a dict of PassConfigParams (consistent with all other passes).

Optional Parameters
To pass in optional parameters, list each parameter name and value as a key-value pair in the JSON file. See example:

[screenshot: example JSON with optional parameters]
  1. target_device: Target device type. This is used to provide a default config specific to that device type. Currently only Strix is supported; other devices will be added in the future.
  • Type: str
  • Default: stx
  • Allowed Values: ["stx"]
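A pass entry with the optional parameter set might look like the following JSON fragment. This is a hedged sketch following Olive's usual pass-config shape; only `target_device` is documented above:

```json
{
  "estimate_latency": {
    "type": "EstimateNPULatency",
    "target_device": "stx"
  }
}
```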

Adding Pass to Config File
EstimateNPULatency should ideally run as the last pass, so list it last in the <model>.json file. For example:

[screenshot: example <model>.json passes section]
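The ordering could look like the sketch below. The other pass names and types here are illustrative assumptions, not a prescribed pipeline; the point is that the analysis pass comes after all transforming passes:

```json
{
  "passes": {
    "optimize": { "type": "OrtTransformersOptimization" },
    "quantize": { "type": "OnnxQuantization" },
    "estimate_latency": { "type": "EstimateNPULatency", "target_device": "stx" }
  }
}
```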

Output
Generates a concise_summary directory within the run directory with the following files:

[screenshot: files generated in the concise_summary directory]

{model_name}_concise_summary.txt displays the roofline latency, total compute ops, and the conclusion that can be drawn about the performance bottleneck (whether the workload is DDR Bandwidth bound or Compute bound):

[screenshot: example {model_name}_concise_summary.txt output]
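The bandwidth-vs-compute conclusion follows the classic roofline model: latency is bounded by the slower of compute throughput and DDR bandwidth. A small sketch, with made-up numbers that are not the tool's actual values or real Strix specs:

```python
def roofline_latency(total_ops, total_bytes, peak_ops_per_s, peak_bytes_per_s):
    """Roofline estimate: latency is the max of compute time and memory time."""
    compute_time = total_ops / peak_ops_per_s
    memory_time = total_bytes / peak_bytes_per_s
    bottleneck = "Compute bound" if compute_time >= memory_time else "DDR Bandwidth bound"
    return max(compute_time, memory_time), bottleneck


# Illustrative numbers only (assumptions, not real device figures):
latency, bottleneck = roofline_latency(
    total_ops=8e9,          # 8 GOPs of compute
    total_bytes=2e8,        # 200 MB of DDR traffic
    peak_ops_per_s=50e12,   # 50 TOPS peak compute
    peak_bytes_per_s=60e9,  # 60 GB/s DDR bandwidth
)
print(f"{latency * 1e3:.3f} ms, {bottleneck}")
```

With these numbers the memory time dominates, so the workload would be reported as DDR Bandwidth bound.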

{model_name}_concise_summary.csv displays the same information per op, with ops listed in descending order of latency:

[screenshot: example {model_name}_concise_summary.csv output]
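Such a per-op summary can be consumed with the standard library alone. The column names below are assumptions for illustration, not the tool's actual CSV schema:

```python
import csv
from io import StringIO

# Hypothetical per-op summary; column names are assumed, not the real schema.
csv_text = """op_name,op_type,latency_us
/conv1/Conv,Conv,120.5
/fc/Gemm,Gemm,40.2
/relu/Relu,Relu,3.1
"""

rows = list(csv.DictReader(StringIO(csv_text)))
# Sort ops in descending order of latency, as the summary file does.
rows.sort(key=lambda r: float(r["latency_us"]), reverse=True)
for r in rows:
    print(r["op_name"], r["latency_us"])
```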

Known Passing Tests

ResNet w/ Perf Estimator
Refer to Olive Recipes Repo

MobileNet w/ Perf Estimator
Refer to Olive Recipes Repo

Unit Test

  • Run python -m pytest test/unit_test/passes/vitis_ai/test_estimate_latency.py
  • This unit test runs a dummy model through the estimate latency pass (calling it directly instead of via the end-to-end flow) and asserts that a concise_summary directory containing dummy CSV and TXT files is generated
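The shape of such a test can be sketched as below. This is a stand-in, not the repository's actual test: `run_estimate_latency` is a hypothetical helper that mimics the pass's output layout so the assertions have something to check:

```python
import os


def run_estimate_latency(model_path, output_dir):
    """Hypothetical stand-in for invoking the pass directly on a dummy model."""
    summary_dir = os.path.join(output_dir, "concise_summary")
    os.makedirs(summary_dir, exist_ok=True)
    name = os.path.splitext(os.path.basename(model_path))[0]
    for ext in ("txt", "csv"):
        path = os.path.join(summary_dir, f"{name}_concise_summary.{ext}")
        with open(path, "w") as f:
            f.write("dummy\n")  # placeholder content
    return summary_dir


def test_estimate_latency(tmp_path):
    # Assert the concise_summary directory and its txt/csv files are generated.
    summary_dir = run_estimate_latency("dummy_model.onnx", str(tmp_path))
    assert os.path.isdir(summary_dir)
    assert os.path.exists(os.path.join(summary_dir, "dummy_model_concise_summary.txt"))
    assert os.path.exists(os.path.join(summary_dir, "dummy_model_concise_summary.csv"))
```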

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

@alinah-amd (Author)

@microsoft-github-policy-service agree company="Microsoft"

@microsoft-github-policy-service agree company="AMD"

@jambayk (Contributor) commented Oct 1, 2025

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Review comment from a Contributor on the `perf-estimator` line of the requirements file (context lines: optuna, pandas, peft, perf-estimator):

there is no package called perf-estimator on pypi

@alinah-amd (Author) replied:

Yes, we are working to push the package into PyPI. Will update once that is done.
