
Conversation


@RayenTian RayenTian commented Jan 26, 2026

Result

[screenshot of results]

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • New Features

    • Introduced Low-Rank Adaptation (LoRA) configuration options with customizable parameters for model optimization
  • Tests

    • Expanded functional test coverage with new DPO workflow tests including LoRA-based scenarios to ensure reliability


@RayenTian RayenTian marked this pull request as ready for review January 26, 2026 06:39
@RayenTian RayenTian requested review from a team as code owners January 26, 2026 06:39
@RayenTian RayenTian added the CI:L1 Run doctests, unit tests, and functional tests label Jan 26, 2026
@RayenTian RayenTian requested a review from yuki-97 January 26, 2026 06:39
@RayenTian RayenTian requested a review from terrykong January 26, 2026 06:39
@RayenTian RayenTian removed the CI:L1 Run doctests, unit tests, and functional tests label Jan 26, 2026

coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

This PR adds LoRA configuration options to the DPO example configuration file and introduces new functional GPU tests, including a script for LoRA-based automodel DPO training that orchestrates the run end to end and validates training metrics.

Changes

LoRA Configuration (examples/configs/dpo.yaml):
Adds a new LoRA settings block under policy.dtensor_cfg with the parameters enabled, target_modules, exclude_modules, match_all_linear, dim, alpha, dropout, dropout_position, lora_A_init, and use_triton.
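The summary lists only the parameter names; as a hedged illustration, a block like the following could express them in examples/configs/dpo.yaml. The nesting under a lora_cfg key and every value shown here are assumptions for illustration, not the actual diff:

```yaml
policy:
  dtensor_cfg:
    # Key name "lora_cfg" and all values below are illustrative assumptions.
    lora_cfg:
      enabled: true
      target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]  # example modules
      exclude_modules: []
      match_all_linear: false
      dim: 8            # LoRA rank
      alpha: 16         # scaling factor
      dropout: 0.0
      dropout_position: "post"   # illustrative value
      lora_A_init: "xavier"      # illustrative value
      use_triton: false
```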
Test Registration (tests/functional/L1_Functional_Tests_GPU.sh):
Registers two new functional tests, dpo_automodel_lora.sh and dpo_megatron.sh, in the GPU test suite execution sequence.
New Test Script (tests/functional/dpo_automodel_lora.sh):
Implements end-to-end DPO LoRA test orchestration: environment setup, a DPO automodel training run with LoRA enabled, metric capture to JSON, and validation that the training loss at step 3 is below 0.8.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

CI:L1

Suggested reviewers

  • terrykong
  • yuki-97
  • joyang-nv
🚥 Pre-merge checks: ✅ 2 passed | ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)

Test Results For Major Changes ⚠️ Warning
The PR introduces major changes, but the description lacks documented test results, convergence validation, or performance information despite adding functional tests.
Resolution: update the PR description with (1) results from running the dpo_automodel_lora.sh and dpo_megatron.sh tests, including training loss and convergence; (2) confirmation of no regression in numerics or convergence; (3) performance measurements. Also fix the bash syntax error on line 11: change "set -eou pipefail" to "set -euo pipefail".

Title check ❓ Inconclusive
The title mentions adding a LoRA config for the DPO dtensor backend, which aligns with the primary change (adding LoRA configuration to dpo.yaml) but conflicts with the PR objectives, which describe an SFT (supervised fine-tuning) configuration.
Resolution: clarify whether the PR is for a LoRA or an SFT configuration; the title says "lora config" while the objectives mention "SFT config". Update the title and objectives to be consistent.
✅ Passed checks (2 passed)

Docstring Coverage ✅ Passed
No functions were found in the changed files, so the docstring coverage check was skipped.

Description Check ✅ Passed
Check skipped: CodeRabbit's high-level summary is enabled.



@RayenTian RayenTian changed the title feat: add sft config for dpo dtensor backend feat: add lora config for dpo dtensor backend Jan 26, 2026
@RayenTian RayenTian added the CI:L1 Run doctests, unit tests, and functional tests label Jan 26, 2026
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/functional/dpo_automodel_lora.sh`:
- Line 11: The script invokes `set -eou pipefail`, but the -o flag's argument must immediately follow it, so this ordering is parsed as an invalid option. Replace it with `set -euo pipefail` so that -e and -u are enabled and -o pipefail is applied.

yuki-97 previously approved these changes Jan 26, 2026
Signed-off-by: ruit <ruit@nvidia.com>
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 28, 2026
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 29, 2026

Labels

CI:L1 Run doctests, unit tests, and functional tests


3 participants