# Fix hf_ptq mllama path not honoring --calib_seq flag (#1329)
**Codecov Report** — ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##             main    #1329    +/-  ##
=========================================
+ Coverage   74.67%   76.19%   +1.52%
=========================================
  Files         468      468
  Lines       50374    50374
=========================================
+ Hits        37615    38383     +768
+ Misses      12759    11991     -768
```
Duplicate of #1311
The mllama branch of make_calib_dataloader in hf_ptq.py was not passing max_length=args.calib_seq to get_vlm_dataset_dataloader, so mllama calibration silently ignored the --calib_seq flag. This matches the companion fix in #1311 for the LLM path. Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
## What does this PR do?

**Type of change:** Bug fix
`examples/llm_ptq/hf_ptq.py` accepts a `--calib_seq` argument to control the max calibration sample length, but the `mllama` branch of `make_calib_dataloader` never forwarded it to `get_vlm_dataset_dataloader`. As a result, mllama VLM calibration silently used the default `max_length` regardless of what the user passed on the command line.

This complements #1311, which landed the analogous fix for the LLM (`get_dataset_dataloader`) branch. After this PR, every branch of `make_calib_dataloader` that has a length concept honors `--calib_seq`:

| Branch | How `calib_seq` flows |
| --- | --- |
| `specdec_offline_dataset` | `EagleOfflineDataCollator(train_len=args.calib_seq)` (already correct) |
| `calib_with_images` (VLM) | `get_vlm_dataset_dataloader(..., max_length=args.calib_seq)` (already correct) |
| `mllama` (VLM) | `get_vlm_dataset_dataloader(..., max_length=args.calib_seq)` ← this PR |
| `whisper` (speech) | `get_speech_dataset_dataloader` has no length parameter; audio is fixed-shape 80×3000 mel features |
| LLM (default) | `get_dataset_dataloader(..., max_sample_length=args.calib_seq)` (fixed by #1311) |
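A minimal sketch of the shape of the fix, not the actual `hf_ptq.py` source: the surrounding branching and the stub dataloader builder below are simplified stand-ins, and the `Args` values are made up for illustration.

```python
# Illustrative sketch only: the real make_calib_dataloader in
# examples/llm_ptq/hf_ptq.py has more branches and real dataloaders.

def get_vlm_dataset_dataloader(dataset_name, max_length=512, batch_size=1):
    """Stub standing in for the real builder; records the kwargs it received."""
    return {"dataset": dataset_name, "max_length": max_length, "batch_size": batch_size}

def make_calib_dataloader(args):
    if args.model_type == "mllama":
        # Before the fix, max_length was not passed here, so the default was
        # used and --calib_seq was silently ignored. The fix forwards it:
        return get_vlm_dataset_dataloader(
            args.dataset,
            max_length=args.calib_seq,  # <- the one-line fix
        )
    # ... other branches elided ...
    raise NotImplementedError(args.model_type)

class Args:  # hypothetical values for the sketch
    model_type = "mllama"
    dataset = "some_calib_dataset"
    calib_seq = 2048

loader = make_calib_dataloader(Args())
print(loader["max_length"])  # 2048: the flag now reaches the dataloader
```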
## Usage

```sh
python examples/llm_ptq/hf_ptq.py \
    --pyt_ckpt_path <mllama-model> \
    --qformat fp8 \
    --calib_seq 2048
```
## Testing

Validated on Qwen3-8B using a two-part script:

1. Exercised `get_dataset_dataloader` with `max_sample_length ∈ {64, 128, 512}` and confirmed every batch had `input_ids.shape[1] ≤ max_sample_length`.
2. Exercised the `hf_ptq.py` plumbing: invoked `make_calib_dataloader` with `args.calib_seq ∈ {128, 1024}` and a mocked inner dataloader, and confirmed the spy received `max_sample_length=args.calib_seq` (LLM path) and `max_length=args.calib_seq` (VLM path) verbatim.
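The spy-style check in step 2 can be sketched as below. This is a self-contained approximation, assuming a toy `make_calib_dataloader` that takes the dataloader builder as a parameter; the real script patches the builders inside `hf_ptq.py` instead.

```python
from unittest import mock

def make_calib_dataloader(args, get_vlm_dataset_dataloader):
    # Toy stand-in for the mllama branch: it must forward --calib_seq
    # to the inner dataloader builder as max_length.
    return get_vlm_dataset_dataloader("calib_dataset", max_length=args.calib_seq)

class Args:  # hypothetical namespace mirroring argparse output
    calib_seq = 1024

# MagicMock acts as the spy: it records every call and its kwargs.
spy = mock.MagicMock(name="get_vlm_dataset_dataloader")
make_calib_dataloader(Args(), spy)

# Confirm the flag reached the inner dataloader verbatim.
spy.assert_called_once()
assert spy.call_args.kwargs["max_length"] == 1024
print("max_length forwarded verbatim")
```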
## Before your PR is "Ready for review"

- CONTRIBUTING.md: N/A
## Additional Information

Single-line fix in `examples/llm_ptq/hf_ptq.py::make_calib_dataloader` (mllama branch). Builds on #1311.