Add: suppot trt-rtx-abi ep by haoxiz-nvidia · Pull Request #1783 · NVIDIA/Model-Optimizer

haoxiz-nvidia · 2026-06-22T04:48:14Z

What does this PR do?

Adds support for the trt-rtx-abi execution provider (EP) in ONNX quantization.

Usage

python -m modelopt.onnx.quantization --onnx_path="path\to\model.onnx" --quantize_mode=int8 --output_path="path\to\output\model.onnx" --calibration_eps=NvTensorRtRtx-abi --use_external_data_format --high_precision_dtype=fp32 --abi_ep_path="path\to\trt-rtx-abi-dll"

Testing

Tested on 4 popular LLMs with the int4, fp8, and int8 quantization methods.

Before your PR is "Ready for review"

Is this change backward compatible?: ✅
Did you write any new necessary tests?: ❌
Did you update Changelog?: ❌
Did you get Claude approval on this PR?: N/A

Summary by CodeRabbit

New Features
- Added support for NvTensorRtRtx-abi execution provider in ONNX quantization workflows
- Introduced --abi_ep_path command-line argument to specify the ABI execution provider library path
- Extended input shape profile generation to support both NvTensorRtRtx and NvTensorRtRtx-abi execution provider variants
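The CLI wiring summarized above can be sketched with `argparse`. This is a minimal illustration, not the code in `__main__.py`: the parser name, the `nargs="+"` choice, the `["cpu"]` default, and the early "required together" check are all assumptions made for the example. Only the two flag names and the `NvTensorRtRtx-abi` value come from the PR.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real CLI; only the two flags below mirror the PR.
    parser = argparse.ArgumentParser(prog="quantize-sketch")
    parser.add_argument(
        "--calibration_eps",
        type=str,
        nargs="+",
        default=["cpu"],  # assumed default, for illustration only
        help=(
            "Any subset of ['trt', 'cuda:x', 'dml:x', 'cpu', 'NvTensorRtRtx', "
            "'NvTensorRtRtx-abi'], where 'x' is the device id."
        ),
    )
    parser.add_argument(
        "--abi_ep_path",
        type=str,
        default=None,
        help="Path to the ABI execution provider library.",
    )
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed convenience check: fail at parse time instead of during calibration.
    if "NvTensorRtRtx-abi" in args.calibration_eps and not args.abi_ep_path:
        parser.error("--abi_ep_path is required when NvTensorRtRtx-abi is selected")
    return args
```

Failing in the parser is a design choice of this sketch; the PR's diff raises `FileNotFoundError` later, inside the library check.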

Signed-off-by: haoxiz <haoxiz@nvidia.com>

coderabbitai · 2026-06-22T04:48:27Z

Walkthrough

The PR extends ONNX PTQ calibration to support a new NvTensorRtRtx-abi execution provider. A new register_abi_ep function validates and registers an external shared library via ONNX Runtime's register_execution_provider_library. The quantize() API and both CLI entrypoints (__main__.py and the Windows example script) are updated to accept and thread through an --abi_ep_path argument.
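The validate-then-register flow described in the walkthrough might look roughly like the sketch below. It follows the checks visible in the PR's diff, but the function names ending in `_sketch`, the `maybe_register_abi_ep` helper, and the injected `register` callable are hypothetical. Injecting the callable keeps the example runnable without ONNX Runtime; in real use it would presumably be `onnxruntime.register_execution_provider_library`, as the diff shows.

```python
import os
from typing import Callable, Optional, Sequence

ABI_EP_NAME = "NvTensorRtRtx-abi"  # value passed in calibration_eps, per the PR
ORT_PROVIDER_NAME = "NvTensorRTRTXExecutionProvider"  # name registered in the PR's diff

RegisterFn = Callable[[str, str], None]


def register_abi_ep_sketch(ep_path: Optional[str], register: RegisterFn) -> None:
    """Validate the library path, then hand it to the registration callable."""
    if not ep_path:
        raise FileNotFoundError("Need to provide abi_ep_path to use NvTensorRtRtx-abi")
    if not os.path.isfile(ep_path):
        raise FileNotFoundError(f"NvTensorRtRtx ABI EP library not found: {ep_path}")
    register(ORT_PROVIDER_NAME, ep_path)


def maybe_register_abi_ep(
    calibration_eps: Sequence[str],
    abi_ep_path: Optional[str],
    register: RegisterFn,
) -> bool:
    """Register only when the ABI EP was requested by its exact name."""
    if ABI_EP_NAME not in calibration_eps:
        return False
    register_abi_ep_sketch(abi_ep_path, register)
    return True
```

A caller would pass the parsed `calibration_eps` list and `abi_ep_path` straight through, so other execution providers never touch the external library.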

Changes

NvTensorRtRtx-abi Execution Provider Support

Layer / File(s)	Summary
ABI EP registration and provider list selection `modelopt/onnx/quantization/ort_utils.py`, `modelopt/onnx/quantization/quantize.py`	Adds `_check_for_nv_tensorrt_rtx_abi_libs` and `register_abi_ep` to validate and register the ABI EP shared library. Extends `_prepare_ep_list` with an `elif` branch for `NvTensorRtRtx-abi`. Updates `quantize()` imports, signature (`abi_ep_path: str
CLI and example script wiring `modelopt/onnx/quantization/__main__.py`, `examples/windows/onnx_ptq/genai_llm/quantize.py`	Adds `--abi_ep_path` argument and updates `--calibration_eps` help text in `__main__.py`, forwarding the value into `quantize()`. The Windows example script receives the same treatment: imports `register_abi_ep`, broadens `parse_calibration_eps` and input-shape-profile conditions to include `NvTensorRtRtx-abi`, adds ABI EP registration in `main()`, and adds the `--abi_ep_path` CLI argument.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Security Anti-Patterns	❌ Error	The code loads user-supplied `abi_ep_path` via `ort.register_execution_provider_library()` at line 324 of ort_utils.py without an inline comment explaining why the file is safe. Per SECURITY.md, se...	Add inline comment to line 324 explaining why it's safe to load the user-supplied DLL path, or request `@NVIDIA/modelopt-setup-codeowners` review with explicit justification in the PR description.
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title contains a typo ('suppot' instead of 'support') and is extremely vague; it doesn't clearly convey that this PR adds TensorRT RTX ABI execution provider support for ONNX quantization.	Correct the typo and provide a more specific title like 'Add TensorRT RTX ABI execution provider support for ONNX quantization' to clearly communicate the main change.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


github-actions · 2026-06-22T04:51:47Z

PR Preview Action v1.8.1
🚀 View preview at https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1783/
Built to branch `gh-pages` at 2026-06-22 04:51 UTC. Preview will be ready when the GitHub Pages deployment is complete.

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@modelopt/onnx/quantization/__main__.py`:
- Around line 119-120: The help text string for the calibration_eps parameter
contains a malformed token where `dml:x` is missing quotes while all other
device options like `'trt'`, `'cuda:x'`, `'cpu'`, `'NvTensorRtRtx'`, and
`'NvTensorRtRtx-abi'` are quoted. Add quotes around `dml:x` to make it `'dml:x'`
to maintain consistent formatting in the user-facing help message.

In `@modelopt/onnx/quantization/ort_utils.py`:
- Around line 355-357: Replace the substring matching condition in the elif
statement that checks for NvTensorRtRtx-abi with an exact equality comparison.
Change the condition from using the 'in' operator to using the equality operator
(==) to ensure that only the exact provider string "NvTensorRtRtx-abi" matches,
not variations or malformed values like "NvTensorRtRtx-abi:0". This ensures
consistency with how the ABI registration is triggered in the quantize()
function through exact membership checking.
- Around line 317-325: The _check_for_nv_tensorrt_rtx_abi_libs function accepts
a user-supplied ep_path parameter and loads it as a shared library via
ort.register_execution_provider_library without establishing a trust boundary.
Add either explicit trust validation (such as signature verification) or an
inline comment documenting why the library path is safe (for example, confirming
it is internally-generated and not directly user-supplied), then request
security team review as indicated in SECURITY.md for dynamic component loading
from user-provided artifact paths.

In `@modelopt/onnx/quantization/quantize.py`:
- Around line 495-497: The abi_ep_path parameter documentation references
NvTensorRtRtx-abi as a valid value in calibration_eps, but the calibration_eps
parameter documentation in the same docstring does not list NvTensorRtRtx-abi in
its allowed values. Update the calibration_eps documentation to include
NvTensorRtRtx-abi in the list of supported execution provider values to maintain
consistency with the abi_ep_path documentation.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ebba326b-603d-4522-9b99-7b7ce1107279

📥 Commits

Reviewing files that changed from the base of the PR and between 9048d13 and c5208da.

📒 Files selected for processing (4)

examples/windows/onnx_ptq/genai_llm/quantize.py
modelopt/onnx/quantization/__main__.py
modelopt/onnx/quantization/ort_utils.py
modelopt/onnx/quantization/quantize.py

coderabbitai · 2026-06-22T04:55:11Z

+            "Any subset of ['trt', 'cuda:x', dml:x, 'cpu', 'NvTensorRtRtx', "
+            "'NvTensorRtRtx-abi'], where 'x' is the device id."


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix malformed --calibration_eps help text token.

Line 119 shows dml:x without quotes while all other choices are quoted. This looks like a typo in the user-facing guidance.

Suggested fix

- "Any subset of ['trt', 'cuda:x', dml:x, 'cpu', 'NvTensorRtRtx', "
+ "Any subset of ['trt', 'cuda:x', 'dml:x', 'cpu', 'NvTensorRtRtx', "

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@modelopt/onnx/quantization/__main__.py` around lines 119 - 120, The help text string for the calibration_eps parameter contains a malformed token where `dml:x` is missing quotes while all other device options like `'trt'`, `'cuda:x'`, `'cpu'`, `'NvTensorRtRtx'`, and `'NvTensorRtRtx-abi'` are quoted. Add quotes around `dml:x` to make it `'dml:x'` to maintain consistent formatting in the user-facing help message.

coderabbitai · 2026-06-22T04:55:11Z

+def _check_for_nv_tensorrt_rtx_abi_libs(ep_path: str):
+    logger.info("Checking for NvTensorRtRtx ABI EP library")
+    if not ep_path:
+        raise FileNotFoundError("Need to provide abi_ep_path to use NvTensorRtRtx-abi")
+    if not os.path.isfile(ep_path):
+        raise FileNotFoundError(f"NvTensorRtRtx ABI EP library not found: {ep_path}")
+
+    ort.register_execution_provider_library("NvTensorRTRTXExecutionProvider", ep_path)
+    return True


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file and the exact code at the specified lines
cd modelopt && cat -n onnx/quantization/ort_utils.py | sed -n '310,330p'

Repository: NVIDIA/Model-Optimizer

Length of output: 1205

🏁 Script executed:

# Check how ep_path is passed and where _check_for_nv_tensorrt_rtx_abi_libs is called
rg "_check_for_nv_tensorrt_rtx_abi_libs" -B 3 -A 3 modelopt/

Repository: NVIDIA/Model-Optimizer

Length of output: 1109

🏁 Script executed:

# Check the function signatures that call this function to understand the trust boundary
rg "abi_ep_path" -B 5 -A 5 modelopt/ | head -100

Repository: NVIDIA/Model-Optimizer

Length of output: 6303

🏁 Script executed:

# Check if there's any documentation or config validation for abi_ep_path
rg "register_execution_provider_library" -B 5 -A 5 modelopt/

Repository: NVIDIA/Model-Optimizer

Length of output: 965

Add explicit trust boundary for ABI EP library loading.

Line 324 loads a user-supplied shared library path into the process via register_execution_provider_library. As documented in SECURITY.md, user-provided artifact paths (including the external abi_ep_path DLL) must be treated as untrusted. Security-sensitive dynamic component loading requires an explicit trust contract—either a trusted-source gate with signature verification or an inline comment documenting why the path is safe (e.g., confirming it is internally-generated and not user-supplied)—plus security codeowners review.

Currently the code only validates file existence, with no trust boundary or safety justification. This creates an RCE vector if a malicious DLL is placed at the user-supplied path.
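One possible shape for such a trust gate is a SHA-256 allowlist checked before the library is loaded. This is a simpler substitute for the signature verification the review mentions, not an implementation of it. The helper names `sha256_of_file` and `check_trusted_library`, and the idea of a caller-supplied digest set, are assumptions made for illustration; nothing like this exists in the PR.

```python
import hashlib
import os
from typing import Set


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large libraries are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_trusted_library(ep_path: str, trusted_sha256: Set[str]) -> str:
    """Return the resolved path if its digest is allowlisted; raise otherwise."""
    resolved = os.path.realpath(ep_path)  # resolve symlinks before hashing
    if not os.path.isfile(resolved):
        raise FileNotFoundError(f"NvTensorRtRtx ABI EP library not found: {ep_path}")
    actual = sha256_of_file(resolved)
    if actual not in trusted_sha256:
        raise PermissionError(f"Untrusted ABI EP library (sha256={actual}): {resolved}")
    return resolved
```

The returned path would then be the one passed to the registration call. A digest check does not close the window between hashing and loading, so it narrows the risk rather than removing it.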

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@modelopt/onnx/quantization/ort_utils.py` around lines 317 - 325, The _check_for_nv_tensorrt_rtx_abi_libs function accepts a user-supplied ep_path parameter and loads it as a shared library via ort.register_execution_provider_library without establishing a trust boundary. Add either explicit trust validation (such as signature verification) or an inline comment documenting why the library path is safe (for example, confirming it is internally-generated and not directly user-supplied), then request security team review as indicated in SECURITY.md for dynamic component loading from user-provided artifact paths.

Source: Coding guidelines

coderabbitai · 2026-06-22T04:55:11Z

+        elif "NvTensorRtRtx-abi" in ep:
+            providers.append("NvTensorRTRTXExecutionProvider")
+            logger.debug("Added NvTensorRtRtx ABI EP")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use exact matching for NvTensorRtRtx-abi EP selection.

Using substring matching here accepts malformed values (for example, NvTensorRtRtx-abi:0), but ABI registration is triggered via exact membership in quantize(). That creates a mismatch where provider selection can proceed without registration and fail later at runtime.
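The mismatch can be reproduced in isolation. The sketch below mirrors the branch shape from the diff (substring tests, ABI branch first) next to the exact-match variant. The function names and the `"abi"` / `"builtin"` return labels are hypothetical stand-ins for the real provider-list logic.

```python
from typing import Optional, Sequence


def select_by_substring(ep: str) -> Optional[str]:
    # Same shape as the PR's branches: any string containing the name matches.
    if "NvTensorRtRtx-abi" in ep:
        return "abi"
    if "NvTensorRtRtx" in ep:
        return "builtin"
    return None


def select_by_equality(ep: str) -> Optional[str]:
    # Suggested fix: only the exact provider strings match.
    if ep == "NvTensorRtRtx-abi":
        return "abi"
    if ep == "NvTensorRtRtx":
        return "builtin"
    return None


def registration_triggered(calibration_eps: Sequence[str]) -> bool:
    # List membership compares whole elements, so it is already an exact match.
    return "NvTensorRtRtx-abi" in calibration_eps


eps = ["NvTensorRtRtx-abi:0"]  # malformed value from the review comment
print(select_by_substring(eps[0]))   # abi
print(registration_triggered(eps))   # False
print(select_by_equality(eps[0]))    # None
```

With substring matching, the malformed value selects the ABI provider even though registration never ran. With equality, selection and registration agree.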

Suggested fix

-        elif "NvTensorRtRtx-abi" in ep:
+        elif ep == "NvTensorRtRtx-abi":
             providers.append("NvTensorRTRTXExecutionProvider")
             logger.debug("Added NvTensorRtRtx ABI EP")
-        elif "NvTensorRtRtx" in ep:
+        elif ep == "NvTensorRtRtx":

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-        elif "NvTensorRtRtx-abi" in ep:
-            providers.append("NvTensorRTRTXExecutionProvider")
-            logger.debug("Added NvTensorRtRtx ABI EP")
+        elif ep == "NvTensorRtRtx-abi":
+            providers.append("NvTensorRTRTXExecutionProvider")
+            logger.debug("Added NvTensorRtRtx ABI EP")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@modelopt/onnx/quantization/ort_utils.py` around lines 355 - 357, Replace the substring matching condition in the elif statement that checks for NvTensorRtRtx-abi with an exact equality comparison. Change the condition from using the 'in' operator to using the equality operator (==) to ensure that only the exact provider string "NvTensorRtRtx-abi" matches, not variations or malformed values like "NvTensorRtRtx-abi:0". This ensures consistency with how the ABI registration is triggered in the quantize() function through exact membership checking.

coderabbitai · 2026-06-22T04:55:11Z

+        abi_ep_path:
+            Path to an external NvTensorRtRtx ABI execution-provider library. Required when
+            ``NvTensorRtRtx-abi`` is present in ``calibration_eps``.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep calibration_eps documentation aligned with the new ABI EP option.

The new abi_ep_path docs are added, but the allowed calibration_eps list in this same docstring still omits NvTensorRtRtx-abi. That makes the public API docs internally inconsistent.

As per coding guidelines, public APIs should be clearly documented and kept accurate.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@modelopt/onnx/quantization/quantize.py` around lines 495 - 497, The abi_ep_path parameter documentation references NvTensorRtRtx-abi as a valid value in calibration_eps, but the calibration_eps parameter documentation in the same docstring does not list NvTensorRtRtx-abi in its allowed values. Update the calibration_eps documentation to include NvTensorRtRtx-abi in the list of supported execution provider values to maintain consistency with the abi_ep_path documentation.

Source: Coding guidelines

codecov · 2026-06-22T04:57:47Z

Codecov Report

❌ Patch coverage is 33.33333% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.71%. Comparing base (cfc823d) to head (c5208da).
⚠️ Report is 45 commits behind head on main.

Files with missing lines	Patch %	Lines
modelopt/onnx/quantization/ort_utils.py	21.42%	11 Missing ⚠️
modelopt/onnx/quantization/quantize.py	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1783      +/-   ##
==========================================
- Coverage   77.09%   75.71%   -1.39%     
==========================================
  Files         511      511              
  Lines       56168    58257    +2089     
==========================================
+ Hits        43302    44108     +806     
- Misses      12866    14149    +1283

Flag	Coverage Δ
examples	`41.82% <16.66%> (-0.14%)`	⬇️
gpu	`57.68% <27.77%> (-0.63%)`	⬇️
unit	`54.42% <33.33%> (+0.08%)`	⬆️


Add: suppot trt-rtx-abi ep

c5208da

Signed-off-by: haoxiz <haoxiz@nvidia.com>

haoxiz-nvidia requested a review from vishalpandya1990 June 22, 2026 04:48

haoxiz-nvidia self-assigned this Jun 22, 2026

haoxiz-nvidia requested review from a team as code owners June 22, 2026 04:48

coderabbitai Bot reviewed Jun 22, 2026

haoxiz-nvidia wants to merge 1 commit into main from haoxiz/onnx-ptq-trt-abi
