Merged

32 commits
b6cd281
Adds a constraints system for validating inputs and guaranteeing prop…
pranavm-nvidia Oct 22, 2025
d91bce7
Moves `wrappers.py` into the frontend, which is the only place it's used
pranavm-nvidia Oct 24, 2025
54cc680
Enables input validation based on input constraints
pranavm-nvidia Oct 25, 2025
f36f755
Improves constraint error messages
pranavm-nvidia Oct 28, 2025
0cf6dcb
Removes `Not` constraint
pranavm-nvidia Oct 28, 2025
b662350
Enables constraints to be documented in a human-readable way
pranavm-nvidia Oct 29, 2025
5576a7d
Updates automatic casting logic to use constraints system
pranavm-nvidia Oct 31, 2025
5b2b7c5
Improves source code retrieval in stack info
pranavm-nvidia Nov 3, 2025
bb8f8a7
Adds a .devcontainer configuration for VS Code
pranavm-nvidia Nov 4, 2025
48c5754
Adds automated operator constraints tests for data type constraints
pranavm-nvidia Nov 5, 2025
b36aadb
Refactors `doc_str`, adds field for additional constraint information
pranavm-nvidia Nov 6, 2025
6d39f6b
Adds a section on documentation philosophy, fixes path in devcontainer
pranavm-nvidia Nov 13, 2025
4934912
Update tripy/docs/README.md
pranavm-nvidia Nov 19, 2025
def6e0e
Ports concatenate to new constraints system
pranavm-nvidia Nov 18, 2025
0c73514
Removes non-ASCII character in README
pranavm-nvidia Nov 18, 2025
5d252bc
Adds an If constraint to gate other constraints
pranavm-nvidia Nov 19, 2025
d52222d
Ports ones to new constraint system, updates `merge_function_arguments`
pranavm-nvidia Nov 19, 2025
de02edf
Update tripy/nvtripy/utils/utils.py
pranavm-nvidia Nov 19, 2025
635729b
Migrate unary operators to new constraint system
pranavm-nvidia Nov 19, 2025
a57e59f
Migrate shape manipulation operators to new constraint system
pranavm-nvidia Nov 19, 2025
4a9c8b4
Migrate binary operators to new constraint system
pranavm-nvidia Nov 19, 2025
f1d90d3
Migrate reduction operators to new constraint system
pranavm-nvidia Dec 22, 2025
5fb7414
Migrate initializer operators to new constraint system
pranavm-nvidia Dec 22, 2025
13599c0
Step 6: Migrate type-preserving operators to new constraint system
pranavm-nvidia Dec 24, 2025
575eb1a
Step 7: Migrate operators with complex dtype relationships
pranavm-nvidia Dec 24, 2025
5de83f5
Step 8: Migrate final remaining operators
pranavm-nvidia Dec 24, 2025
dc40af0
Complete migration to new constraint system and remove legacy dtype s…
pranavm-nvidia Jan 9, 2026
33960b4
Removes extra files
pranavm-nvidia Feb 6, 2026
208b992
Removes a hallucinated parameter
pranavm-nvidia Feb 6, 2026
1f693c4
Adds HF token
pranavm-nvidia Feb 6, 2026
00a4fe3
Adds an optimizer for the constraints system, optimizes `merge_funct…
pranavm-nvidia Feb 10, 2026
0a177a6
Addresses various TODOs and fixes tests
pranavm-nvidia Feb 10, 2026
7 changes: 4 additions & 3 deletions .github/workflows/tripy-l0.yml
@@ -11,6 +11,7 @@ env:
REGISTRY: ghcr.io
DEFAULT_IMAGE: ghcr.io/nvidia/tensorrt-incubator/tripy:latest
NEW_TEST_IMAGE: test-image:latest
HF_TOKEN: ${{ secrets.HF_TOKEN }}


jobs:
@@ -58,7 +59,7 @@ jobs:
uses: addnab/docker-run-action@v3
with:
image: ${{ env.l0_image }}
options: --gpus all -v ${{ github.workspace }}/tripy:/tripy
options: --gpus all -v ${{ github.workspace }}/tripy:/tripy -e HF_TOKEN=${{ env.HF_TOKEN }}
run: |
python3 docs/generate_rsts.py
sphinx-build build/doc_sources build/docs -c docs/ -j 4 -W -n
@@ -67,15 +68,15 @@
uses: addnab/docker-run-action@v3
with:
image: ${{ env.l0_image }}
options: --gpus all -v ${{ github.workspace }}/tripy:/tripy
options: --gpus all -v ${{ github.workspace }}/tripy:/tripy -e HF_TOKEN=${{ env.HF_TOKEN }}
run: |
pytest --cov=nvtripy/ --cov-config=.coveragerc tests/ -v -m "not l1" -n 4 --durations=15 --ignore tests/performance

- name: Run performance benchmarks
uses: addnab/docker-run-action@v3
with:
image: ${{ env.l0_image }}
options: --gpus all -v ${{ github.workspace }}/tripy:/tripy
options: --gpus all -v ${{ github.workspace }}/tripy:/tripy -e HF_TOKEN=${{ env.HF_TOKEN }}
run: |
pytest tests/performance -v -m "not l1" --benchmark-warmup=on --benchmark-json benchmark.json

2 changes: 2 additions & 0 deletions .github/workflows/tripy-l1.yml
@@ -18,6 +18,8 @@ concurrency:
jobs:
l1-test:
runs-on: tripy-self-hosted
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
container:
image: ghcr.io/nvidia/tensorrt-incubator/tripy:latest
volumes:
50 changes: 50 additions & 0 deletions tripy/.devcontainer/devcontainer.json
@@ -0,0 +1,50 @@
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/python
{
"name": "Tripy",
"build": {
"context": "${localWorkspaceFolder}",
"dockerfile": "${localWorkspaceFolder}/Dockerfile",
"args": {
"username": "${localEnv:USER}"
}
},
"workspaceMount": "source=${localWorkspaceFolder}/..,target=/workspaces/TensorRT-Incubator,type=bind,consistency=cached",
"workspaceFolder": "/workspaces/TensorRT-Incubator/tripy",
"runArgs": [
"--gpus",
"all",
"-it",
"--cap-add=SYS_PTRACE"
],
"mounts": [
"source=${localEnv:HOME}${localEnv:USERPROFILE},target=/home/${localEnv:USER},type=bind,consistency=cached"
],
"remoteEnv": {
"SHELL": "${localEnv:SHELL:/bin/bash}",
"ZSH": "/home/${localEnv:USER}/.oh-my-zsh",
"PYTHONPATH": "/workspaces/TensorRT-Incubator/tripy:${localEnv:PYTHONPATH}",
"PATH": "/usr/local/bin/:${localEnv:PATH}"
},
"remoteUser": "${localEnv:USER}",
"forwardPorts": [
8080
],
"customizations": {
"vscode": {
"extensions": [
"ms-python.black-formatter",
"lextudio.restructuredtext",
"github.vscode-github-actions",
"ms-python.isort",
"ms-toolsai.jupyter",
"ms-toolsai.vscode-jupyter-cell-tags",
"ms-toolsai.jupyter-renderers",
"llvm-vs-code-extensions.vscode-mlir",
"ms-python.python",
"ms-python.vscode-pylance",
"eamodio.gitlens"
]
}
}
}
2 changes: 2 additions & 0 deletions tripy/CONTRIBUTING.md
@@ -37,6 +37,8 @@ Thanks for your interest in contributing to Tripy!
docker run --gpus all -it --cap-add=SYS_PTRACE -p 8080:8080 -v $(pwd):/tripy/ --rm tripy:latest
```

- If you are using Visual Studio Code, you can alternatively use the included `.devcontainer` configuration.

3. **[Optional]** Run a sanity check in the container:

```bash
13 changes: 7 additions & 6 deletions tripy/Dockerfile
@@ -9,16 +9,17 @@ ENTRYPOINT ["/bin/bash"]
# Setup user account
ARG uid=1000
ARG gid=1000
ARG username=trtuser
ENV DEBIAN_FRONTEND=noninteractive

RUN groupadd -r -f -g ${gid} trtuser && \
useradd -o -r -l -u ${uid} -g ${gid} -ms /bin/bash trtuser && \
usermod -aG sudo trtuser && \
echo 'trtuser:nvidia' | chpasswd && \
mkdir -p /workspace && chown trtuser /workspace
RUN groupadd -r -f -g ${gid} ${username} && \
useradd -o -r -l -u ${uid} -g ${gid} -ms /bin/bash ${username} && \
usermod -aG sudo ${username} && \
echo "${username}:nvidia" | chpasswd && \
mkdir -p /workspace && chown ${username} /workspace

RUN apt-get update && \
apt-get install -y sudo python3 python3-pip gdb git wget curl && \
apt-get install -y sudo python3 python3-pip gdb git wget curl zsh && \
apt-get clean && \
python3 -m pip install --upgrade pip

85 changes: 80 additions & 5 deletions tripy/docs/README.md
@@ -45,17 +45,23 @@ which specifies doc metadata for each API (e.g. location).
- Docstring must include *at least* **one [code example](#code-examples)**.

- If the function accepts `tp.Tensor`s, must indicate **data type constraints**
with the [`wrappers.interface`](../nvtripy/utils/wrappers.py) decorator.
with the [`wrappers.interface`](../nvtripy/frontend/wrappers.py) decorator.

**Example:**

```py
from nvtripy import export
from nvtripy.frontend import wrappers
from nvtripy.common import datatype as dt
from nvtripy.frontend.constraints import GetInput, GetReturn, OneOf

@export.public_api(document_under="operations/functions")
@wrappers.interface(
dtype_constraints={"input": "T1", wrappers.RETURN_VALUE: "T1"},
dtype_variables={
"T1": ["float32", "float16", "bfloat16", "int4", "int32", "int64", "bool", "int8"],
},
input_requirements=OneOf(
GetInput("input").dtype,
[dt.float32, dt.float16, dt.bfloat16, dt.int4, dt.int8, dt.int32, dt.int64, dt.bool],
),
output_guarantees=GetReturn(0).dtype == GetInput("input").dtype,
)
def relu(input: "nvtripy.Tensor") -> "nvtripy.Tensor":
r"""
@@ -167,3 +173,72 @@ Code blocks in docstrings/guides are **preprocessed**:
- **Include** only specific variables: `# doc: print-locals <var1> <var2> ...`
- **Exclude** *specific* variables: `# doc: no-print-locals <var1> <var2> ...`
- **Exclude** *all* variables: `# doc: no-print-locals` (with no arguments).
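
For example, a guide or docstring code block might exclude just a helper variable from the printed locals (an illustrative sketch; `scale` and `out` are made-up names, and `tp` is assumed to be `nvtripy` under its usual alias):

```py
# doc: no-print-locals scale
scale = 2.0                     # excluded from the rendered output
out = tp.ones((2, 3)) * scale   # `out` is still printed, since only `scale` was excluded
```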


## Documentation Philosophy: Write Less Documentation

> "I didn't have time to write a short letter, so I wrote a long one instead." - Mark Twain

How much documentation do you want to read? The answer is **none**!

- **Best Case:** Make docs unnecessary with an intuitive API and clear errors.

This is not always possible; sometimes, we need to write docs.

- **Problem:** We don't think enough about *what* *precisely* we want to convey.

- **Suggestions**: Write discoverable, concise, but complete documentation.

- **Highlight key points** but make it easy to find details.

- Use bullets and **bold** to break up monotony.
Paragraphs are *so* 2024.

- Leverage the medium - pictures, charts, emojis, markup.
We are not using printing presses!

- Forget the rules: use contractions, don't spell out numbers, etc.

- **Tip:** Write like we're paying for every syllable!
If it's too wordy to say, it's too wordy to write.

Writing *concisely* forces you to think about what's **signal** and what's **noise**.

Below are examples from previous versions of the Tripy documentation that have since been improved.

### Example 1

> One important point is that Tripy uses a lazy evaluation model; that is, no computation is performed until a value is actually needed.

* **Tip:** Ask: "What is this really saying?"

> Tensors are evaluated only when they're used.

### Example 2


> ### **Eager Mode: How Does It Work?**
>
> If you've used TensorRT before, you may know that it does not support an eager mode.
> In order to provide eager mode support in Tripy, we actually need to compile the graph under the hood.
>
> Although we employ several tricks to make compile times faster when using eager mode, we do still need to compile,
> and so eager mode will likely be slower than other comparable frameworks.
>
> Consequently, we suggest that you use eager mode primarily for debugging and compiled mode for deployments.

**Problem**: We must sift through filler to find key points.

**Ask**:

- **"What is the ONE most important takeaway?"**
*Eager mode is only for debugging.*

- **"What questions does this raise?" - Why?**
*Tripy always compiles since TensorRT doesn't have eager mode.*

Make the **one key point** stand out so skimmers can spot it:

> **Best Practice:** Use **eager mode** only for **debugging**; compile for deployment.
>
> **Why:** Eager mode internally compiles the graph (slow!) since TensorRT doesn't have eager execution.
2 changes: 1 addition & 1 deletion tripy/docs/post0_developer_guides/00-architecture.md
@@ -76,7 +76,7 @@ and various operations, e.g. {class}`nvtripy.resize`.
:::{admonition} Info
Most operations are decorated with:
1. [`@export.public_api`](source:/nvtripy/export.py): Enables documentation, type checking, and overloading.
2. [`@wrappers.interface`](source:/nvtripy/utils/wrappers.py): Enforces (and generates tests for) data type constraints.
2. [`@wrappers.interface`](source:/nvtripy/frontend/wrappers.py): Enforces (and generates tests for) data type constraints.
:::

Operations are **lazily evaluated**.
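
A minimal sketch of what lazy evaluation means in practice (assuming the usual `import nvtripy as tp` alias; exact output omitted):

```py
import nvtripy as tp

a = tp.ones((2, 3))  # records the op; no computation happens yet
b = a + 1            # still just building the graph
print(b)             # using the tensor triggers evaluation
```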
8 changes: 5 additions & 3 deletions tripy/docs/post0_developer_guides/01-how-to-add-new-ops.md
@@ -129,13 +129,15 @@ from typing import Tuple

from nvtripy import export
from nvtripy.trace.ops.topn import TopN
from nvtripy.utils import wrappers
from nvtripy.frontend import wrappers
from nvtripy.frontend.ops import utils as op_utils
from nvtripy.common import datatype as dt
from nvtripy.frontend.constraints import GetInput, GetReturn, OneOf

@export.public_api(document_under="operations/functions")
@wrappers.interface(
dtype_constraints={"input": "T1", wrappers.RETURN_VALUE: ["T1", "T2"]},
dtype_variables={"T1": ["float32", "float16", "bfloat16", "int32", "int64"], "T2": ["int32"]},
input_requirements=OneOf(GetInput("input").dtype, [dt.float32, dt.float16, dt.bfloat16, dt.int32, dt.int64]),
output_guarantees=(GetReturn(0).dtype == GetInput("input").dtype) & (GetReturn(1).dtype == dt.int32),
)
def topn(input: "nvtripy.Tensor", n: int, dim: int) -> Tuple["nvtripy.Tensor", "nvtripy.Tensor"]:
# See docs/README.md for more information on how to write docstrings
4 changes: 2 additions & 2 deletions tripy/docs/pre0_user_guides/02-quantization.md
@@ -17,7 +17,7 @@ explains quantization in more detail.
## Post-Training Quantization With ModelOpt

If the model was not trained with quantization-aware training (QAT), we can use
[TensorRT ModelOpt](https://nvidia.github.io/TensorRT-Model-Optimizer/index.html)
[TensorRT ModelOpt](https://nvidia.github.io/Model-Optimizer/index.html)
to do **calibration** to determine scaling factors.

:::{admonition} Info
@@ -82,7 +82,7 @@ Let's calibrate a GPT model:
```

3. Run calibration to replace linear layers with
[`QuantLinear`](https://nvidia.github.io/TensorRT-Model-Optimizer/reference/generated/modelopt.torch.quantization.nn.modules.quant_linear.html#modelopt.torch.quantization.nn.modules.quant_linear.QuantLinear),
[`QuantLinear`](https://nvidia.github.io/Model-Optimizer/reference/generated/modelopt.torch.quantization.nn.modules.quant_linear.html#modelopt.torch.quantization.nn.modules.quant_linear.QuantLinear),
which contain calibration information:

```py
2 changes: 1 addition & 1 deletion tripy/examples/nanogpt/README.md
@@ -40,7 +40,7 @@ This example implements a [NanoGPT model](https://github.com/karpathy/nanoGPT) u
### Running with Quantization

[`quantization.py`](./quantization.py) uses
[NVIDIA TensorRT Model Optimizer](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/1_overview.html)
[NVIDIA TensorRT Model Optimizer](https://nvidia.github.io/Model-Optimizer/getting_started/1_overview.html)
to quantize the pytorch model.

`load_quant_weights_from_hf` in [`weight_loader.py`](./weight_loader.py) converts the quantization
2 changes: 1 addition & 1 deletion tripy/nvtripy/backend/api/compile.py
@@ -248,7 +248,7 @@ def process_arg(name, arg):
compiled_arg_names = []

new_args = []
positional_arg_info, variadic_info = utils.utils.get_positional_arg_names(func, *args)
positional_arg_info, variadic_info = utils.utils.get_positional_args_with_names(func, *args)

varargs_name = None
varargs_index = None
8 changes: 4 additions & 4 deletions tripy/nvtripy/config.py
@@ -1,5 +1,5 @@
#
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
@@ -45,12 +45,12 @@
)(os.path.join(tempfile.gettempdir(), "tripy-cache"))
"""Path to a timing cache file that can be used to speed up compilation time."""

enable_dtype_checking: bool = export.public_api(
enable_input_validation: bool = export.public_api(
document_under="config.rst",
module=sys.modules[__name__],
symbol="enable_dtype_checking",
symbol="enable_input_validation",
)(True)
"""Whether to enable data type checking in API functions."""
"""Whether to enable input validation in API functions."""

extra_error_information: List[str] = export.public_api(
document_under="config.rst",
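
Based on this diff, callers could presumably toggle the renamed flag like so (a sketch; the flag name and default come straight from `config.py` above):

```py
from nvtripy import config

# Input validation (formerly `enable_dtype_checking`) defaults to True.
config.enable_input_validation = False  # skip constraint checks in API functions
```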
31 changes: 31 additions & 0 deletions tripy/nvtripy/frontend/constraints/__init__.py
@@ -0,0 +1,31 @@
#
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from nvtripy.frontend.constraints.base import Constraints
from nvtripy.frontend.constraints.doc_str import doc_str
from nvtripy.frontend.constraints.fetcher import Fetcher, GetDataType, GetInput, GetReturn, ValueFetcher
from nvtripy.frontend.constraints.logic import (
AlwaysFalse,
AlwaysTrue,
And,
Equal,
If,
Logic,
NotEqual,
NotOneOf,
OneOf,
Or,
)
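
Putting these exports together, a decorated operator might state its constraints roughly as follows (a sketch modeled on the `relu` and `topn` examples earlier in this PR; `my_op` and its parameters are hypothetical):

```py
from nvtripy.common import datatype as dt
from nvtripy.frontend import wrappers
from nvtripy.frontend.constraints import GetInput, GetReturn, OneOf


@wrappers.interface(
    # Both inputs must use a supported data type, and their types must match.
    input_requirements=(
        OneOf(GetInput("input").dtype, [dt.float32, dt.float16, dt.int32])
        & (GetInput("other").dtype == GetInput("input").dtype)
    ),
    # The result keeps the input's data type.
    output_guarantees=GetReturn(0).dtype == GetInput("input").dtype,
)
def my_op(input: "nvtripy.Tensor", other: "nvtripy.Tensor") -> "nvtripy.Tensor":
    ...
```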