feat(extensions): neutral behavior vocabulary with per-agent translation, fix LLM invocability #2103
mbachorik wants to merge 18 commits into github:main
…md files

Extension command bodies reference files using paths relative to the extension root (e.g. `agents/control/commander.md`, `knowledge-base/scores.yaml`). After install these files live at `.specify/extensions/<id>/...`, but the generated SKILL.md files were emitting bare relative paths that AI agents could not resolve from the workspace root.

Add `CommandRegistrar.rewrite_extension_paths()` which discovers the subdirectories that exist in the installed extension directory and rewrites matching body references to `.specify/extensions/<id>/<subdir>/...`. The rewrite runs before `resolve_skill_placeholders()` so that extension-local `scripts/` and `templates/` subdirectories are not incorrectly redirected to the project-level `.specify/scripts/` and `.specify/templates/` paths.

The method is called from `render_skill_command()` when `source_dir` is provided, which `register_commands()` now passes through for all agents.

Affected agents: any using the `/SKILL.md` extension format (currently kimi and codex). Aliases receive the same rewriting.

Closes github#2101

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
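As a rough illustration of the rewriting described in this commit message, here is a minimal sketch. It is not the PR's implementation: the standalone function, its signature, and the regex-based matching are assumptions made for the example. Only the idea (discover existing subdirectories, prefix matching references with `.specify/extensions/<id>/`) comes from the text above.

```python
import re
from pathlib import Path


def rewrite_extension_paths(body: str, extension_id: str, extension_dir: Path) -> str:
    """Hypothetical stand-in for CommandRegistrar.rewrite_extension_paths().

    Prefixes references to the extension's own subdirectories with the
    installed location under .specify/extensions/<id>/.
    """
    # Only subdirectories that actually exist in the installed extension
    # are candidates, so unrelated relative paths are left untouched.
    subdirs = sorted(p.name for p in extension_dir.iterdir() if p.is_dir())
    if not subdirs:
        return body  # conservative: nothing to rewrite

    prefix = f".specify/extensions/{extension_id}/"
    names = "|".join(re.escape(name) for name in subdirs)
    # Match "<subdir>/" only at the start of a path token. A preceding word
    # character, "/", "." or "-" means the path is already qualified or the
    # name is part of a longer token, so it is skipped.
    pattern = re.compile(rf"(?<![\w./-])(?:{names})/")
    return pattern.sub(lambda m: prefix + m.group(0), body)
```

Because already-prefixed paths are preceded by `/`, running the sketch twice over the same body does not double the prefix.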
Pull request overview
This PR fixes extension command bodies that reference extension-root-relative files (e.g., agents/..., knowledge-base/...) by rewriting those references to the installed location under .specify/extensions/<id>/... when generating SKILL.md files for skill-format agents (notably codex and kimi).
Changes:
- Add `CommandRegistrar.rewrite_extension_paths()` and invoke it during SKILL.md rendering (before placeholder resolution).
- Thread `source_dir` through SKILL.md generation for primary commands and aliases so the renderer has enough context to rewrite paths.
- Add unit tests (extensions) and an opt-in integration test (real extension install) to validate end-to-end behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/specify_cli/agents.py` | Adds extension-relative path rewriting and applies it during SKILL.md rendering; threads `source_dir` into SKILL.md generation paths. |
| `tests/test_extensions.py` | Adds unit tests covering extension path rewriting for codex/kimi, aliases, and conservative “no subdirs” behavior. |
| `tests/test_integration_extension_skill_paths.py` | Adds opt-in integration tests that install a real extension and validate rewritten paths across generated skill files. |
Fixed the path-rewriting scope issue raised in the inline review. Change: Gated Test added:
Pull request overview
This PR fixes extension command SKILL.md generation so that references to extension-relative files (e.g., agents/..., templates/...) are rewritten to their installed location under .specify/extensions/<id>/..., making those files resolvable by skill-format agents (Codex/Kimi). It addresses issue #2101 where generated SKILL.md files contained bare relative paths that don’t exist from the workspace root.
Changes:
- Add extension-relative path rewriting during SKILL.md rendering and thread `source_dir` through skill rendering for primary commands and aliases.
- Add unit tests covering rewriting across multiple subdirs, Kimi support, aliases, and “no subdirs” conservative behavior.
- Add an opt-in integration test (env-var gated) that installs a real extension and validates end-to-end SKILL.md outputs.
Show a summary per file
| File | Description |
|---|---|
| `src/specify_cli/agents.py` | Implements extension-relative path rewriting and applies it during SKILL.md rendering (with `source_dir` plumbing). |
| `tests/test_extensions.py` | Adds unit tests ensuring SKILL.md content rewrites extension-relative paths for Codex/Kimi and aliases. |
| `tests/test_integration_extension_skill_paths.py` | Adds env-gated integration tests to verify rewriting against a real installed extension. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 0 new
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ommands Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…during rendering Wire translate_behavior() and strip_behavior_keys() into render_skill_command() so that behavior: and agents: blocks are stripped from output and translated into agent-specific fields (context, model, effort, allowed-tools, etc.) during skill rendering. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
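To make the strip-and-translate step concrete, here is a minimal sketch of how a `behavior:` block might be turned into agent-specific fields. It is an assumption-laden illustration, not the code from `behavior.py`: only the `invocation` and `execution: isolated` mappings for Claude are taken from this PR's description, the `render_frontmatter` helper is hypothetical, and the precedence of `agents:` overrides over translated fields is a guess.

```python
from __future__ import annotations

from typing import Any

# Keys that express neutral intent and must not leak into rendered output.
NEUTRAL_KEYS = ("behavior", "agents")


def strip_behavior_keys(frontmatter: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the frontmatter without the neutral blocks."""
    return {k: v for k, v in frontmatter.items() if k not in NEUTRAL_KEYS}


def translate_behavior(
    agent_name: str, behavior: dict[str, Any], agents_overrides: dict[str, Any]
) -> dict[str, Any]:
    """Translate neutral keys to per-agent fields (illustrative subset only)."""
    fields: dict[str, Any] = {}
    if agent_name == "claude":
        invocation = behavior.get("invocation")
        if invocation in ("automatic", "explicit"):
            # explicit -> only humans may invoke; automatic -> the model may too.
            fields["disable-model-invocation"] = invocation == "explicit"
        if behavior.get("execution") == "isolated":
            fields["context"] = "fork"
    # Escape hatch: agent-specific keys with no neutral equivalent.
    # Assumption: these win over translated fields on conflict.
    overrides = agents_overrides.get(agent_name) or {}
    if isinstance(overrides, dict):
        fields.update(overrides)
    return fields


def render_frontmatter(agent_name: str, frontmatter: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: strip neutral blocks, then add translated fields."""
    behavior = frontmatter.get("behavior") or {}
    agents_overrides = frontmatter.get("agents") or {}
    rendered = strip_behavior_keys(frontmatter)
    rendered.update(
        translate_behavior(
            agent_name,
            behavior if isinstance(behavior, dict) else {},
            agents_overrides if isinstance(agents_overrides, dict) else {},
        )
    )
    return rendered
```

With no `behavior:` or `agents:` block the sketch returns the frontmatter unchanged, which matches the no-op behavior the PR describes for existing commands.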
…ead of .claude/skills/ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t commands Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add note below per-agent translation table explaining that allowed-tools is remapped to tools for Claude agent definitions (execution: agent)
- Add tools: full and visibility: both rows to the table (both are no-ops)
- Add comment in TestEndToEnd explaining why ai_skills is omitted from init-options.json (skill routing uses CommandRegistrar, not ai_skills path)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Just to make sure: the non-rendered commands are installed in the `.specify/extensions/` directory, and then, when the extension is enabled or disabled, the SKILL.md files are put into the agent-specific location using the skill rendering?
mnriem left a comment

Can you clarify whether the comment above is correct?

Yup, I have the same question: is this specifically to support Codex? My original PR did work for Gemini and AGY when I tested it.
…eral string in addition to presets

- behavior.py: add `color` key to BEHAVIOR_KEYS; passthrough to Claude agent frontmatter (red|blue|green|yellow|purple|orange|pink|cyan)
- behavior.py: add `write` preset (Read Write Edit Grep Glob, no Bash)
- behavior.py: add custom tool list support: YAML list or unrecognised string passed verbatim as allowed-tools
- agents.py: add `color` to _SKILL_PASSTHROUGH_KEYS["claude"]
- tests: add 4 new behavior translator tests (write, color passthrough, all color values, color ignored for non-Claude agents)
- RFC: update RFC-EXTENSION-BEHAVIOR-DEPLOYMENT.md to document write preset, custom tool values, and color field with examples

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
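The preset and passthrough rules in this commit can be sketched roughly as follows. This is a hedged illustration, not the contents of `behavior.py`: the function names are hypothetical, only the `write` preset's tool list is stated in the commit message (the `none` and `read-only` lists are not given, so the sketch refuses them rather than guessing), and dropping unknown color values is an assumption.

```python
from __future__ import annotations

from typing import Any

# Only the `write` preset's contents appear in the commit message.
TOOL_PRESETS = {"write": "Read Write Edit Grep Glob"}
# Presets whose tool lists are not stated in the source; not guessed here.
UNSPECIFIED_PRESETS = {"none", "read-only"}
CLAUDE_COLORS = {"red", "blue", "green", "yellow", "purple", "orange", "pink", "cyan"}


def resolve_allowed_tools(value: Any) -> Any:
    """Map a neutral `tools` value to an allowed-tools value (hypothetical)."""
    if value is None or value == "full":
        return None  # `full` is described as a no-op: emit no restriction
    if isinstance(value, list):
        return value  # custom YAML list: passed through verbatim
    text = str(value)
    if text in UNSPECIFIED_PRESETS:
        raise NotImplementedError(f"preset {text!r} is not covered by this sketch")
    # Known preset, or an unrecognised string passed through verbatim.
    return TOOL_PRESETS.get(text, text)


def color_field(agent_name: str, color: Any) -> dict[str, Any]:
    """Pass `color` through for Claude only; other agents ignore it."""
    if agent_name == "claude" and color in CLAUDE_COLORS:
        return {"color": color}
    return {}
```

For example, `resolve_allowed_tools("Read Grep")` returns the string unchanged because it does not match a preset name.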
Add table of addenda at the bottom of RFC-EXTENSION-SYSTEM.md and a see-also cross-reference in the conversion appendix, so readers are directed to RFC-EXTENSION-BEHAVIOR-DEPLOYMENT.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… loop

_register_extension_skills() now checks behavior.execution before creating a SKILL.md directory. Commands routed to .claude/agents/ (execution: agent) are skipped, because they were already deployed by register_commands_for_all_agents(). Manifest-level behavior is merged when the source file has no behavior: block, matching the same merge logic used by the agent deployment path.

Also adds a preset regression test: preset source dirs (no extension.yml) must not have paths rewritten to .specify/extensions/... prefixes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
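A self-contained sketch of the skip logic described here, under stated assumptions: the `commands_needing_skill` helper and the shape of `cmd_info` (a dict carrying `name`, parsed `frontmatter`, and an optional manifest-level `behavior`) are invented for illustration, and `get_deployment_type` is a simplified stand-in that only distinguishes `agent` from everything else.

```python
from __future__ import annotations

from typing import Any


def get_deployment_type(frontmatter: dict[str, Any]) -> str:
    """Simplified stand-in: 'agent' when behavior.execution is 'agent'."""
    behavior = frontmatter.get("behavior")
    if isinstance(behavior, dict) and behavior.get("execution") == "agent":
        return "agent"
    return "command"


def commands_needing_skill(commands: list[dict[str, Any]]) -> list[str]:
    """Return the names of commands that should still get a SKILL.md."""
    names = []
    for cmd_info in commands:
        source_fm = dict(cmd_info.get("frontmatter") or {})
        # Manifest-level behavior applies only when the source file has none.
        if "behavior" not in source_fm and "behavior" in cmd_info:
            source_fm["behavior"] = cmd_info["behavior"]
        if get_deployment_type(source_fm) == "agent":
            continue  # already deployed as an agent definition elsewhere
        names.append(cmd_info["name"])
    return names
```

Note the precedence this implies: a `behavior:` block in the source file shadows the manifest-level one entirely rather than being merged key by key.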
All built-in command templates (analyze, checklist, clarify, constitution, implement, plan, specify, tasks, taskstoissues) now declare behavior.invocation: automatic, which translates to disable-model-invocation: false for Claude. This allows the model to invoke these commands autonomously, matching their design intent as workflow commands within an automated pipeline. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add .venv312, .specify, .claude, and specs/ to .gitignore to prevent local development artefacts from appearing as untracked files. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Content superseded by extensions/RFC-EXTENSION-BEHAVIOR-DEPLOYMENT.md, which is the canonical addendum to RFC-EXTENSION-SYSTEM.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mnriem @dhilipkumars Sorry for the radio silence; I did not have time to come back to this. I hit several issues with recent changes to spec-kit (originally I thought the fix was minimal: just rewrite the relative paths). It has evolved into something bigger. I'll update the PR with a proper explanation today or over the weekend.
Pull request overview
Copilot reviewed 20 out of 21 changed files in this pull request and generated 4 comments.
_source_fm, _ = registrar.parse_frontmatter(
    source_file.read_text(encoding="utf-8")
)
# Merge manifest-level behavior when source file has none
if "behavior" not in _source_fm and "behavior" in cmd_info:
    _source_fm["behavior"] = cmd_info["behavior"]
if get_deployment_type(_source_fm) == "agent":
    continue

behavior = frontmatter.get("behavior") or {}
agents_overrides = frontmatter.get("agents") or {}
extra_fields = translate_behavior(
    agent_name, behavior,
    agents_overrides if isinstance(agents_overrides, dict) else {}
)
copilot_tools = get_copilot_tools(behavior if isinstance(behavior, dict) else {})

if cmd_type == "agent" and agent_name == "claude":
    output = self.render_agent_definition(
        agent_name, output_name, frontmatter, body,
        source_id, cmd_file, project_root, source_dir=source_dir,
    )

"""Neutral behavior vocabulary for extension commands.

Extension command source files can declare a ``behavior:`` block in their
frontmatter to express agent-neutral intent (isolation, capability, tools,
etc.). This module translates that vocabulary to concrete per-agent
frontmatter fields during rendering.

Extension authors can also declare an ``agents:`` escape-hatch block for
agent-specific fields that have no neutral equivalent::

    behavior:
      execution: isolated
Good call — the PR title and description have been updated to reflect the full scope: neutral behavior vocabulary, per-agent translation, LLM invocability fix, agent deployment routing, and the original path rewriting fix.
@mbachorik Please address Copilot feedback
…ude post-processing

SkillsIntegration.setup() now runs translate_behavior() on each template's behavior: block and emits the resulting fields (e.g. disable-model-invocation, model) into the SKILL.md frontmatter before ClaudeIntegration.setup() runs its post-processing pass. _inject_frontmatter_flag() already skips keys that are already present, so behavior: invocation: automatic now correctly produces disable-model-invocation: false instead of being overwritten with true.

Regression covered by TestSkillsIntegrationBehaviorTranslation (5 tests).

Also adds:

- TestManifestBehaviorMerge: manifest-level behavior: fields cascade into rendered skills and agent definitions via register_commands()
- TestExtensionSkillAgentRoutingSkip: _register_extension_skills() correctly skips execution:agent commands (both source and manifest-declared)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
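The ordering argument in this commit can be illustrated with a small sketch. It assumes a dict-based frontmatter for simplicity; the real `_inject_frontmatter_flag()` may well operate on rendered text, and `render_builtin_skill` is a hypothetical helper that stands in for the two setup passes.

```python
from __future__ import annotations

from typing import Any


def inject_frontmatter_flag(frontmatter: dict[str, Any], key: str, value: Any) -> dict[str, Any]:
    """Add `key` only when it is absent (dict-based stand-in)."""
    if key in frontmatter:
        return dict(frontmatter)  # an earlier pass already decided this key
    return {**frontmatter, key: value}


def render_builtin_skill(template_frontmatter: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical two-pass render: translate first, then apply defaults."""
    behavior = template_frontmatter.get("behavior") or {}
    rendered = {k: v for k, v in template_frontmatter.items() if k != "behavior"}

    # Pass 1: behavior translation emits the flag explicitly when declared.
    invocation = behavior.get("invocation")
    if invocation in ("automatic", "explicit"):
        rendered["disable-model-invocation"] = invocation == "explicit"

    # Pass 2: post-processing default, which must not clobber pass 1.
    return inject_frontmatter_flag(rendered, "disable-model-invocation", True)
```

Templates without a `behavior:` block still end up with `disable-model-invocation: true` in this sketch, since only the second pass touches them.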
Fixes #2101

Context

When spec-kit commands were moved to the SKILLS format, the SKILL.md renderer hard-coded `disable-model-invocation: true` for every built-in command. This silently broke the ability of LLMs and orchestrating agents (Claude, Copilot, etc.) to invoke spec-kit commands: only humans could trigger them via slash commands.

At the same time, extension authors had no portable way to control agent behavior (model, tools, execution mode) without writing agent-specific frontmatter, so the same extension could not be written once and deployed to Claude Code, GitHub Copilot, and Codex with consistent behavior.

This PR addresses both issues together, as they share the same root: the need for a neutral behavior vocabulary that translates to correct per-agent frontmatter at render time.

What

- Fix `disable-model-invocation` regression: built-in spec-kit commands now declare `behavior: invocation: automatic` in their source frontmatter, which translates to `disable-model-invocation: false` in Claude SKILL.md files, restoring LLM invocability.
- Neutral `behavior:` vocabulary: extension authors declare intent once using neutral keys; the renderer translates them to agent-specific frontmatter:

  | Neutral key | Values | Translates to |
  |---|---|---|
  | `invocation` | `explicit` / `automatic` | `disable-model-invocation: true` / `false` |
  | `execution` | `command` / `isolated` / `agent` | `context: fork` (isolated) |
  | `capability` | `fast` / `balanced` / `strong` | `model: claude-haiku…` / `claude-sonnet…` / `claude-opus…` |
  | `effort` | `low` / `medium` / `high` / `max` | `thinking:` budget |
  | `tools` | `none` / `read-only` / `write` / `full` / custom list | `tools: […]` |
  | `visibility` | `user` / `model` / `both` | `type: prompt` vs `type: background` |
  | `color` | `red` / `blue` / … | `color:` (Claude Code UI) |

- Agent-specific escape hatch: `agents: claude: …` / `agents: copilot: …` blocks in source frontmatter pass through agent-specific keys that have no neutral equivalent (e.g., `argument-hint`, `handoffs`).
- `execution: agent` routing: commands with `behavior: execution: agent` are deployed to `.claude/agents/` (Claude sub-agents) and Copilot `.agent.md` files, with mode and tools injected automatically.
- Extension-relative path rewriting (original fix from #2101, "fix: Extension SKILL.md files contain unresolvable relative paths to extension subdirectories"): bare relative paths in extension command bodies (e.g., `agents/control/commander.md`) are rewritten to `.specify/extensions/<id>/…` in generated SKILL.md files, so AI agents can resolve them from the workspace root.

Why it is backwards compatible

- Commands without a `behavior:` block continue to work exactly as before: the translator is a no-op when the key is absent.
- Agent-specific frontmatter keys (`disable-model-invocation`, `context`, `model`, etc.) are passed through unchanged unless a `behavior:` block also exists.
- Path rewriting applies only to source directories that contain an `extension.yml`; preset directories are skipped.

Testing

Tested primarily in GitHub Copilot and Claude Code with a real extension install.

Unit tests added:

- `tests/test_behavior_translator.py`: all neutral → agent-specific translations
- `tests/test_agent_deployment.py`: `execution: agent` routing to `.claude/agents/` and `.agent.md`
- `tests/test_extension_skills.py`: agent-specific passthrough, behavior merging, backwards compat
- `tests/test_extensions.py`: extension path rewriting (core fix, kimi, aliases, no-subdirs)
- `tests/integrations/test_integration_claude.py`: end-to-end Claude integration

Opt-in integration test:

- `tests/test_integration_extension_skill_paths.py`: installs a real extension and validates all generated skill files end-to-end (requires `SPECKIT_TEST_EXT_DIR` env var)

🤖 Generated with Claude Code
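For readers unfamiliar with env-gated tests, the opt-in pattern can look roughly like this. It is a sketch using the standard library's `unittest`, not the contents of `tests/test_integration_extension_skill_paths.py`. Only the `SPECKIT_TEST_EXT_DIR` variable name comes from the description above; the helper, class, and test names are hypothetical.

```python
from __future__ import annotations

import os
import unittest
from pathlib import Path
from typing import Mapping, Optional

ENV_VAR = "SPECKIT_TEST_EXT_DIR"


def ext_dir_from_env(environ: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the extension directory named by the env var, or None if unset."""
    value = environ.get(ENV_VAR, "").strip()
    return Path(value) if value else None


# The whole class is skipped unless the variable is set, so the default
# test run never needs a real extension checkout.
@unittest.skipUnless(ext_dir_from_env(), f"set {ENV_VAR} to run this opt-in test")
class ExtensionSkillPathsTest(unittest.TestCase):
    def test_extension_dir_exists(self) -> None:
        ext_dir = ext_dir_from_env()
        self.assertIsNotNone(ext_dir)
        self.assertTrue(ext_dir.is_dir())
```

Skipping, rather than silently passing, keeps the opt-in test visible in the test report when the variable is unset.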