Skip to content

gh-148285: Allow recording uops after specializing uops#148373

Open
adityakrmishra wants to merge 4 commits intopython:mainfrom
adityakrmishra:fix-uop-recording-v2
Open

gh-148285: Allow recording uops after specializing uops#148373
adityakrmishra wants to merge 4 commits intopython:mainfrom
adityakrmishra:fix-uop-recording-v2

Conversation

@adityakrmishra
Copy link
Copy Markdown

@adityakrmishra adityakrmishra commented Apr 11, 2026

The Issue:
Currently, analyzer.py forces any uop with records_value == True to be at index 0. This prevents Tier 2 recording uops from safely trailing Tier 1 specializing uops.

The Fix (V2):
Following feedback from @Sacul0457 on the previous PR, simply checking if the preceding uop was tier == 1 was too permissive (it allowed recording uops to follow any Tier 1 uop, not just specializing ones).

This updated patch uses a strict positional uop_index tracker in add_macro().
A recording uop is now only permitted if:

  1. It is at uop_index == 0 (the very first uop).
  2. OR it is at uop_index == 1 AND the preceding uop's name explicitly starts with _SPECIALIZE_.

CacheEffect (e.g., unused/1) and flush parts do not increment the uop_index, ensuring they remain completely transparent.

This strictly enforces the architecture while allowing structures like:
macro(X) = _SPECIALIZE_X + _RECORD_TOS_TYPE + unused/1 + _X;

@bedevere-app
Copy link
Copy Markdown

bedevere-app bot commented Apr 11, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

# runtime, so this ordering is safe.)
preceding_is_specializing = (
uop_index == 1
and isinstance(parts[-1], Uop)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think parts[-1] is not equal to the previous UOP: When there is an unused/N between the specialized UOP and the recorded UOP, parts[-1] represents the skip rather than the uop.

Comment on lines +1135 to +1139
# Counts only real OpName entries (not CacheEffect/flush) so we
# know the exact position of each concrete uop inside the macro.
# CacheEffect → becomes Skip; flush → becomes Flush.
# Neither increments uop_index because neither is a "real" uop.
uop_index = 0
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Counts only real OpName entries (not CacheEffect/flush) so we
# know the exact position of each concrete uop inside the macro.
# CacheEffect → becomes Skip; flush → becomes Flush.
# Neither increments uop_index because neither is a "real" uop.
uop_index = 0
prev_uop: Uop | None = None

Comment on lines +1152 to +1172
if uop.properties.records_value:
# A recording uop is legal in exactly two positions:
# 1. It is the very first real uop (uop_index == 0).
# 2. It is at index 1 AND the immediately preceding
# real uop is a specializing uop, identified by
# the "_SPECIALIZE_" name prefix.
# (Specializing uops are Tier-1-only; recording
# uops are Tier-2-only — they are orthogonal at
# runtime, so this ordering is safe.)
preceding_is_specializing = (
uop_index == 1
and isinstance(parts[-1], Uop)
and parts[-1].name.startswith("_SPECIALIZE_")
)
if uop_index != 0 and not preceding_is_specializing:
raise analysis_error(
f"Recording uop {part.name} must be first in macro "
f"or immediately follow a specializing uop",
macro.tokens[0])
parts.append(uop)
first = False
uop_index += 1
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if uop.properties.records_value:
# A recording uop is legal in exactly two positions:
# 1. It is the very first real uop (uop_index == 0).
# 2. It is at index 1 AND the immediately preceding
# real uop is a specializing uop, identified by
# the "_SPECIALIZE_" name prefix.
# (Specializing uops are Tier-1-only; recording
# uops are Tier-2-only — they are orthogonal at
# runtime, so this ordering is safe.)
preceding_is_specializing = (
uop_index == 1
and isinstance(parts[-1], Uop)
and parts[-1].name.startswith("_SPECIALIZE_")
)
if uop_index != 0 and not preceding_is_specializing:
raise analysis_error(
f"Recording uop {part.name} must be first in macro "
f"or immediately follow a specializing uop",
macro.tokens[0])
parts.append(uop)
first = False
uop_index += 1
if (uop.properties.records_value
and prev_uop is not None
and "specializing" not in prev_uop.annotations):
raise analysis_error(
f"Recording uop {part.name} must be first in macro "
f"or immediately follow a specializing uop",
macro.tokens[0])
parts.append(uop)
prev_uop = uop

@cocolato
Copy link
Copy Markdown
Member

And we need to add some tests for this pr.

@bedevere-app
Copy link
Copy Markdown

bedevere-app bot commented Apr 11, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@adityakrmishra
Copy link
Copy Markdown
Author

@cocolato Thanks for catching that edge case with the cache skips! I've updated the logic to use a prev_uop tracker as you suggested, which correctly ignores CacheEffect and flush.

I also added unit tests in Tools/cases_generator/test_analyzer.py to cover both valid positions and invalid ones (like a recording uop following a non-specializing Tier 1 uop). Finally, I regenerated the C files locally to ensure the CI check passes. Thanks for the guidance

@bedevere-app
Copy link
Copy Markdown

bedevere-app bot commented Apr 11, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@bedevere-app
Copy link
Copy Markdown

bedevere-app bot commented Apr 11, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@adityakrmishra
Copy link
Copy Markdown
Author

@cocolato It looks like regenerating the .c.h files locally on my Windows machine caused some CRLF line-ending or JIT build failures in the CI. I'll leave the C-file regeneration to the automated bots/maintainers from here to avoid messing up the Windows CI runners!

@Fidget-Spinner
Copy link
Copy Markdown
Member

@adityakrmishra there's no need to regen. Also please fix the CLRF stuff on your local machine. You can fix it using pre-commit or git https://stackoverflow.com/questions/2517190/how-do-i-force-git-to-use-lf-instead-of-crlf-under-windows

@bedevere-app
Copy link
Copy Markdown

bedevere-app bot commented Apr 11, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@adityakrmishra
Copy link
Copy Markdown
Author

@Fidget-Spinner Ah, understood! I've updated my global core.autocrlf setting to prevent Windows from messing with the line endings in the future. I also reverted the two generated C files back to the upstream/main state so they are pristine. Thanks for the tip!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow recording uops after specializing uops

3 participants