fix: Remove nested async block causing Stacked Borrows violation in PushDecoderStreamState#21663
Merged
mbutrovich merged 2 commits intoapache:mainfrom Apr 16, 2026
Merged
fix: Remove nested async block causing Stacked Borrows violation in PushDecoderStreamState#21663mbutrovich merged 2 commits intoapache:mainfrom
mbutrovich merged 2 commits intoapache:mainfrom
Conversation
Dandandan
approved these changes
Apr 16, 2026
Contributor
Dandandan
left a comment
There was a problem hiding this comment.
Code looks fine before / after but agree that it would be annoying having to ignore the issue downstream.
martin-g
approved these changes
Apr 16, 2026
This was referenced Apr 16, 2026
Merged
alamb
reviewed
Apr 16, 2026
Contributor
Author
There's actually a followup: #21680 :( |
github-merge-queue bot
pushed a commit
that referenced
this pull request
Apr 16, 2026
## Which issue does this PR close? - Closes #21662. ## Rationale for this change #21663 removed the nested async block in `PushDecoderStreamState::transition` to fix a Stacked Borrows violation under miri. However, miri still flags the same violation because `&mut self` is still a single opaque borrow — inlining the async block alone doesn't let miri split the disjoint field borrows across the `.await` yield point. Comet CI reproduces this reliably: https://github.com/apache/datafusion-comet/actions/runs/24518967597/job/71671004017?pr=3916 ## What changes are included in this PR? Change `PushDecoderStreamState::transition` to take `self` by value instead of `&mut self`. With `&mut self`, the generated future stores a mutable reference, and when `unfold` pins and polls it, miri sees the `&mut self` as a single opaque borrow that conflicts across the `.await` yield point. With owned `self`, the future owns the state directly — no reference means no Stacked Borrows conflict. The struct fields are all heap-allocated or reference-counted, so the move is just pointer-sized copies, not a deep copy. ## Are these changes tested? Existing tests cover this code path. The fix is validated by miri passing in CI (the same test that currently fails: `test_nested_types_extract_missing_struct_names_missing_field`). We'll run Comet CI against this branch first to confirm the miri violation is resolved before merging. ## Are there any user-facing changes? No.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
PushDecoderStreamState::transition#21662.Rationale for this change
Miri detects a Stacked Borrows violation in
PushDecoderStreamState::transition. A nestedasyncblock captures&mut selfas a single opaque mutable reference. At the.awaitonget_byte_ranges, the future yields, and theUniquetag on the borrow stack is invalidated by aSharedReadOnlyretag. When the future resumes,push_rangesattempts a two-phase retag through the now-invalidated tag.This was found by Apache DataFusion Comet, which runs Miri in CI.
What changes are included in this PR?
Remove the nested
asyncblock in theNeedsDataarm ofPushDecoderStreamState::transitionand inline the IO (get_byte_ranges) and CPU (push_ranges) operations as separate statements. Sincetransitionis already anasync fn, the.awaitworks directly in the loop body. Without the nested block, the compiler can split the borrows ofself.readerandself.decoderinto disjoint field borrows, keeping the borrow stack valid across the yield point.Also removes the now-unused
parquet::errors::ParquetErrorimport.Are these changes tested?
Covered by existing parquet reader tests. The original violation was caught by Miri, which DataFusion does not currently run in CI.
Are there any user-facing changes?
No.