Fix hf_ptq mllama path not honoring --calib_seq flag#1329

Closed
cjluo-nv wants to merge 1 commit into main from chenjiel/fix_seq

Conversation


@cjluo-nv cjluo-nv commented Apr 23, 2026

What does this PR do?

Type of change: Bug fix

examples/llm_ptq/hf_ptq.py accepts a --calib_seq argument to control the max calibration sample length, but the mllama branch of make_calib_dataloader never forwarded it to get_vlm_dataset_dataloader. As a result, mllama VLM calibration silently used the default max_length regardless of what the user passed on the command line.

This complements #1311, which landed the analogous fix for the LLM (get_dataset_dataloader) branch. After this PR, every branch of make_calib_dataloader that has a length concept honors --calib_seq:

| Branch | How `calib_seq` flows |
| --- | --- |
| `specdec_offline_dataset` | `EagleOfflineDataCollator(train_len=args.calib_seq)` (already correct) |
| `calib_with_images` (VLM) | `get_vlm_dataset_dataloader(..., max_length=args.calib_seq)` (already correct) |
| `mllama` (VLM) | `get_vlm_dataset_dataloader(..., max_length=args.calib_seq)` (this PR) |
| `whisper` (speech) | N/A: `get_speech_dataset_dataloader` has no length parameter; audio is fixed-shape 80×3000 mel features |
| `default` (LLM) | `get_dataset_dataloader(..., max_sample_length=args.calib_seq)` (fixed by #1311) |
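The shape of the fix can be sketched as follows. This is an illustrative stand-in, not the actual `hf_ptq.py` code: the factory here simply reports the length it received, so the forwarding (or its absence) is observable.

```python
# Stand-in for get_vlm_dataset_dataloader: the real factory builds a
# dataloader; this sketch only records the max_length it was given.
def get_vlm_dataset_dataloader(max_length=512, **kwargs):
    return {"max_length": max_length}

def make_calib_dataloader(model_type, calib_seq):
    # Simplified stand-in for the branch dispatch in hf_ptq.py.
    if model_type == "mllama":
        # Before this PR the call omitted max_length, so the factory
        # default (512 here) was silently used; after the PR the user's
        # --calib_seq value is forwarded.
        return get_vlm_dataset_dataloader(max_length=calib_seq)
    raise NotImplementedError(model_type)

loader = make_calib_dataloader("mllama", calib_seq=2048)
print(loader["max_length"])  # 2048
```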

Usage

```shell
python examples/llm_ptq/hf_ptq.py \
    --pyt_ckpt_path <mllama-model> \
    --qformat fp8 \
    --calib_seq 2048
```

Testing

Validated on Qwen3-8B using a two-part script:

  1. Library contract: directly called get_dataset_dataloader with max_sample_length ∈ {64, 128, 512} and confirmed every batch had input_ids.shape[1] ≤ max_sample_length.
  2. hf_ptq.py plumbing: invoked make_calib_dataloader with args.calib_seq ∈ {128, 1024} and a mocked inner dataloader — confirmed the spy received max_sample_length=args.calib_seq (LLM path) and max_length=args.calib_seq (VLM path) verbatim.
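The second check above can be sketched with a spy, replacing the inner dataloader factory and asserting the flag arrives verbatim. All names here are simplified stand-ins for the real `hf_ptq.py` plumbing, not the test script itself.

```python
from unittest.mock import MagicMock

def make_calib_dataloader(args, vlm_factory):
    # Simplified stand-in for the mllama branch after this PR:
    # it must forward args.calib_seq as max_length.
    return vlm_factory(max_length=args.calib_seq)

class Args:
    calib_seq = 1024

# The spy records how it was called in place of the real factory.
spy = MagicMock()
make_calib_dataloader(Args(), spy)

# Fails if the flag was dropped or renamed along the way.
spy.assert_called_once_with(max_length=1024)
print("calib_seq forwarded verbatim")
```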

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: ❌ (single-line fix to an example script; no unit test harness for this path)
  • Did you update Changelog?: N/A

Additional Information

Single-line fix in examples/llm_ptq/hf_ptq.py::make_calib_dataloader (mllama branch). Builds on #1311.

Summary by CodeRabbit

  • Bug Fixes
    • Calibration dataloader now correctly respects the configured sequence length during calibration runs, ensuring input sequences are truncated/padded to the specified length. This improves consistency and reliability of calibration and quantization results for applicable model types.
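Honoring the configured length downstream amounts to a truncate-and-pad step per batch, as in this minimal sketch (the collate function and pad value are illustration only, not the library's collator):

```python
def collate(batch, max_length, pad_id=0):
    # Force every sequence in the batch to exactly max_length tokens.
    out = []
    for ids in batch:
        ids = ids[:max_length]                          # truncate long samples
        ids = ids + [pad_id] * (max_length - len(ids))  # pad short samples
        out.append(ids)
    return out

batch = collate([[5, 6, 7, 8, 9], [1, 2]], max_length=4)
assert all(len(ids) == 4 for ids in batch)  # shape[1] <= max_length holds
print(batch)  # [[5, 6, 7, 8], [1, 2, 0, 0]]
```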

@cjluo-nv cjluo-nv requested a review from a team as a code owner April 23, 2026 06:05
@cjluo-nv cjluo-nv requested review from meenchen and sugunav14 April 23, 2026 06:05

coderabbitai Bot commented Apr 23, 2026

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 9a8b59a1-524b-458e-beb7-135dacb2be94

📥 Commits

Reviewing files that changed from the base of the PR and between ef33b9c and 48489e8.

📒 Files selected for processing (1)
  • examples/llm_ptq/hf_ptq.py

📝 Walkthrough

Walkthrough

The mllama calibration branch in examples/llm_ptq/hf_ptq.py now passes max_length=args.calib_seq to get_vlm_dataset_dataloader, explicitly constraining text sequence/input length for the mllama calibration dataloader.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| mllama calibration dataloader (`examples/llm_ptq/hf_ptq.py`) | Added `max_length=args.calib_seq` argument to the `get_vlm_dataset_dataloader` call in the mllama branch to enforce calibration sequence length limiting. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and accurately summarizes the main fix: the mllama path in hf_ptq now honors the --calib_seq flag by passing max_length=args.calib_seq to the dataloader. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Anti-Patterns | ✅ Passed | The PR adds a benign max_length parameter to pass args.calib_seq, with no security anti-patterns or unsafe patterns introduced. |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



@cjluo-nv cjluo-nv changed the title Fix the bug that hf_ptq not honoring calib_seq flag Fix hf_ptq not honoring --calib_seq flag Apr 23, 2026

codecov Bot commented Apr 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.19%. Comparing base (e4e3508) to head (48489e8).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #1329      +/-   ##
==========================================
+ Coverage   74.67%   76.19%   +1.52%
==========================================
  Files         468      468
  Lines       50374    50374
==========================================
+ Hits        37615    38383     +768
+ Misses      12759    11991     -768
```

| Flag | Coverage Δ |
| --- | --- |
| examples | 41.58% <ø> (+5.97%) ⬆️ |
| unit | 52.53% <ø> (ø) |


☔ View full report in Codecov by Sentry.


@sugunav14 sugunav14 left a comment


@cjluo-nv Do we also want to update the speech and vlm cases?

@kevalmorabia97

Duplicate of #1311

@kevalmorabia97 kevalmorabia97 marked this as a duplicate of #1311 Apr 23, 2026
@kevalmorabia97 kevalmorabia97 added the cherry-pick-0.44.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Apr 23, 2026
The mllama branch of make_calib_dataloader in hf_ptq.py was not passing
max_length=args.calib_seq to get_vlm_dataset_dataloader, so mllama
calibration silently ignored the --calib_seq flag. This matches the
companion fix in #1311 for the LLM path.

Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
@cjluo-nv cjluo-nv changed the title Fix hf_ptq not honoring --calib_seq flag Fix hf_ptq mllama path not honoring --calib_seq flag Apr 23, 2026
@cjluo-nv cjluo-nv closed this Apr 23, 2026
@github-actions
Contributor

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-23 16:10 UTC

@kevalmorabia97 kevalmorabia97 removed the cherry-pick-0.44.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Apr 23, 2026
