Fix hf_ptq mllama path not honoring --calib_seq flag#1329

Closed
cjluo-nv wants to merge 1 commit into main from chenjiel/fix_seq

Conversation


@cjluo-nv cjluo-nv commented Apr 23, 2026

What does this PR do?

Type of change: Bug fix

examples/llm_ptq/hf_ptq.py accepts a --calib_seq argument to control the max calibration sample length, but the mllama branch of make_calib_dataloader never forwarded it to get_vlm_dataset_dataloader. As a result, mllama VLM calibration silently used the default max_length regardless of what the user passed on the command line.

This complements #1311, which landed the analogous fix for the LLM (get_dataset_dataloader) branch. After this PR, every branch of make_calib_dataloader that has a length concept honors --calib_seq:

| Branch | How `calib_seq` flows |
| --- | --- |
| `specdec_offline_dataset` | `EagleOfflineDataCollator(train_len=args.calib_seq)` (already correct) |
| `calib_with_images` (VLM) | `get_vlm_dataset_dataloader(..., max_length=args.calib_seq)` (already correct) |
| `mllama` (VLM) | `get_vlm_dataset_dataloader(..., max_length=args.calib_seq)` (this PR) |
| `whisper` (speech) | N/A: `get_speech_dataset_dataloader` has no length parameter; audio is fixed-shape 80×3000 mel features |
| `default` (LLM) | `get_dataset_dataloader(..., max_sample_length=args.calib_seq)` (fixed by #1311) |
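The shape of the fix can be sketched as follows. This is an illustrative stand-in, not the actual `hf_ptq.py` code: the factory here simply reports the length it received, so the forwarding (or its absence) is observable.

```python
# Stand-in for get_vlm_dataset_dataloader: the real factory builds a
# dataloader; this sketch only records the max_length it was given.
def get_vlm_dataset_dataloader(max_length=512, **kwargs):
    return {"max_length": max_length}

def make_calib_dataloader(model_type, calib_seq):
    # Simplified stand-in for the branch dispatch in hf_ptq.py.
    if model_type == "mllama":
        # Before this PR the call omitted max_length, so the factory
        # default (512 here) was silently used; after the PR the user's
        # --calib_seq value is forwarded.
        return get_vlm_dataset_dataloader(max_length=calib_seq)
    raise NotImplementedError(model_type)

loader = make_calib_dataloader("mllama", calib_seq=2048)
print(loader["max_length"])  # 2048
```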

Usage

```shell
python examples/llm_ptq/hf_ptq.py \
    --pyt_ckpt_path <mllama-model> \
    --qformat fp8 \
    --calib_seq 2048
```

Testing

Validated on Qwen3-8B using a two-part script:

  1. Library contract: directly called get_dataset_dataloader with max_sample_length ∈ {64, 128, 512} and confirmed every batch had input_ids.shape[1] ≤ max_sample_length.
  2. hf_ptq.py plumbing: invoked make_calib_dataloader with args.calib_seq ∈ {128, 1024} and a mocked inner dataloader — confirmed the spy received max_sample_length=args.calib_seq (LLM path) and max_length=args.calib_seq (VLM path) verbatim.
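The second check above can be sketched with a spy, replacing the inner dataloader factory and asserting the flag arrives verbatim. All names here are simplified stand-ins for the real `hf_ptq.py` plumbing, not the test script itself.

```python
from unittest.mock import MagicMock

def make_calib_dataloader(args, vlm_factory):
    # Simplified stand-in for the mllama branch after this PR:
    # it must forward args.calib_seq as max_length.
    return vlm_factory(max_length=args.calib_seq)

class Args:
    calib_seq = 1024

# The spy records how it was called in place of the real factory.
spy = MagicMock()
make_calib_dataloader(Args(), spy)

# Fails if the flag was dropped or renamed along the way.
spy.assert_called_once_with(max_length=1024)
print("calib_seq forwarded verbatim")
```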

Before your PR is "Ready for review"

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: ❌ (single-line fix to an example script; no unit test harness for this path)
  • Did you update Changelog?: N/A

Additional Information

Single-line fix in examples/llm_ptq/hf_ptq.py::make_calib_dataloader (mllama branch). Builds on #1311.

Summary by CodeRabbit

  • Bug Fixes
    • Calibration dataloader now correctly respects the configured sequence length during calibration runs, ensuring input sequences are truncated/padded to the specified length. This improves consistency and reliability of calibration and quantization results for applicable model types.
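Honoring the configured length downstream amounts to a truncate-and-pad step per batch, as in this minimal sketch (the collate function and pad value are illustration only, not the library's collator):

```python
def collate(batch, max_length, pad_id=0):
    # Force every sequence in the batch to exactly max_length tokens.
    out = []
    for ids in batch:
        ids = ids[:max_length]                          # truncate long samples
        ids = ids + [pad_id] * (max_length - len(ids))  # pad short samples
        out.append(ids)
    return out

batch = collate([[5, 6, 7, 8, 9], [1, 2]], max_length=4)
assert all(len(ids) == 4 for ids in batch)  # shape[1] <= max_length holds
print(batch)  # [[5, 6, 7, 8], [1, 2, 0, 0]]
```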

@cjluo-nv cjluo-nv requested a review from a team as a code owner April 23, 2026 06:05
@cjluo-nv cjluo-nv requested review from meenchen and sugunav14 April 23, 2026 06:05

coderabbitai Bot commented Apr 23, 2026

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 9a8b59a1-524b-458e-beb7-135dacb2be94

📥 Commits

Reviewing files that changed from the base of the PR and between ef33b9c and 48489e8.

📒 Files selected for processing (1)
  • examples/llm_ptq/hf_ptq.py

📝 Walkthrough

Walkthrough

The mllama calibration branch in examples/llm_ptq/hf_ptq.py now passes max_length=args.calib_seq to get_vlm_dataset_dataloader, explicitly constraining text sequence/input length for the mllama calibration dataloader.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| mllama calibration dataloader (`examples/llm_ptq/hf_ptq.py`) | Added `max_length=args.calib_seq` argument to the `get_vlm_dataset_dataloader` call in the mllama branch to enforce calibration sequence length limiting. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and accurately summarizes the main fix: the mllama path in hf_ptq now honors the --calib_seq flag by passing max_length=args.calib_seq to the dataloader. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Anti-Patterns | ✅ Passed | The PR adds a benign max_length parameter to pass args.calib_seq, with no security anti-patterns or unsafe patterns introduced. |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



@cjluo-nv cjluo-nv changed the title Fix the bug that hf_ptq not honoring calib_seq flag Fix hf_ptq not honoring --calib_seq flag Apr 23, 2026

codecov Bot commented Apr 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.19%. Comparing base (e4e3508) to head (48489e8).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #1329      +/-   ##
==========================================
+ Coverage   74.67%   76.19%   +1.52%
==========================================
  Files         468      468
  Lines       50374    50374
==========================================
+ Hits        37615    38383     +768
+ Misses      12759    11991     -768
```

| Flag | Coverage Δ |
| --- | --- |
| examples | 41.58% <ø> (+5.97%) ⬆️ |
| unit | 52.53% <ø> (ø) |


☔ View full report in Codecov by Sentry.


@sugunav14 sugunav14 left a comment


@cjluo-nv Do we also want to update the speech and vlm cases?

@kevalmorabia97

Duplicate of #1311

@kevalmorabia97 kevalmorabia97 marked this as a duplicate of #1311 Apr 23, 2026
@kevalmorabia97 kevalmorabia97 added the cherry-pick-0.44.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Apr 23, 2026
The mllama branch of make_calib_dataloader in hf_ptq.py was not passing
max_length=args.calib_seq to get_vlm_dataset_dataloader, so mllama
calibration silently ignored the --calib_seq flag. This matches the
companion fix in #1311 for the LLM path.

Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
@cjluo-nv cjluo-nv changed the title Fix hf_ptq not honoring --calib_seq flag Fix hf_ptq mllama path not honoring --calib_seq flag Apr 23, 2026
@cjluo-nv cjluo-nv closed this Apr 23, 2026
@github-actions
Contributor

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-23 16:10 UTC

@kevalmorabia97 kevalmorabia97 removed the cherry-pick-0.44.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Apr 23, 2026
