Skip to content

Add JS Asset Auditor plugin with Playwright CLI#633

Open
ChristianPavilonis wants to merge 14 commits intomainfrom
feature/js-asset-auditor
Open

Add JS Asset Auditor plugin with Playwright CLI#633
ChristianPavilonis wants to merge 14 commits intomainfrom
feature/js-asset-auditor

Conversation

@ChristianPavilonis
Copy link
Copy Markdown
Collaborator

@ChristianPavilonis ChristianPavilonis commented Apr 13, 2026

Summary

  • Add JS Asset Auditor as a Claude Code plugin at packages/js-asset-auditor/ with a standalone Playwright CLI that sweeps publisher pages for third-party JS assets
  • Auto-detect integrations (GPT, GTM, Didomi, DataDome, Lockr, Permutive, Prebid, APS) from swept URLs and generate trusted-server.toml config with --config flag
  • Expanded heuristic filters for Google ad rendering, ad verification, ad fraud detection, and reCAPTCHA to reduce noise
  • Auto-include target URL host as first-party, make trusted-server.toml optional with --domain flag for portability
  • Browser launches headed by default to avoid bot detection (captchas, DataDome, etc.)

Try it out

1. Check out the branch

git fetch origin
git checkout feature/js-asset-auditor

2. Install dependencies

cd packages/js-asset-auditor && npm install && npx playwright install chromium
cd ../..

3. Run the CLI directly

# Basic audit — opens a browser, sweeps the page, writes js-assets.toml
node packages/js-asset-auditor/lib/audit.mjs https://www.publisher.com

# Also generate trusted-server.toml with detected integrations
node packages/js-asset-auditor/lib/audit.mjs https://www.publisher.com --config

# Custom output paths
node packages/js-asset-auditor/lib/audit.mjs https://www.publisher.com \
  --output /tmp/assets.toml --config /tmp/config.toml

# Headless mode (for CI, may be blocked by bot protection)
node packages/js-asset-auditor/lib/audit.mjs https://www.publisher.com --headless

4. Use as a Claude Code plugin

claude --plugin-dir packages/js-asset-auditor

Then in Claude Code:

/js-asset-auditor:audit-js-assets https://www.publisher.com
/js-asset-auditor:audit-js-assets https://www.publisher.com --config

CLI flags

Flag Description
--diff Compare sweep against existing js-assets.toml
--settle <ms> Settle window after page load (default: 6000)
--first-party <hosts> Additional first-party hosts (comma-separated)
--domain <host> Publisher domain (falls back to trusted-server.toml or URL)
--no-filter Bypass heuristic filtering
--headless Run browser without UI (default is headed)
--output <path> Output file for js-assets.toml (default: js-assets.toml)
--config [path] Generate trusted-server.toml with detected integrations
--force Overwrite existing config file

Test plan

  • Run against a publisher page and verify js-assets.toml output
  • Run with --config and verify detected integrations in generated trusted-server.toml
  • Run with --diff against an existing js-assets.toml and verify confirmed/new/missing
  • Run without trusted-server.toml (e.g., from /tmp) and verify domain inference
  • Load as Claude Code plugin and test the skill
  • Verify --config without --force errors when file already exists
  • Verify bot-protected sites work in headed mode (e.g., autoblog.com)

Closes #631

jevansnyc and others added 12 commits April 1, 2026 13:17
Engineering spec for the /audit-js-assets .
Covers sweep protocol, Chrome DevTools MCP tooling, heuristic filtering,
slug generation, init and diff modes.

Closes #606
Fix incorrect MCP tool name prefix, replace misused wait_for with
evaluate_script setTimeout, correct list_network_requests filtering to
use resourceTypes, resolve path derivation contradiction with consistent
/js-assets/{prefix}/{stem}.js formula, pin slug separator and base62
charset, add URL Processing section with normalization rules and
first-party boundary definition, tighten wildcard regex to require mixed
character classes, and move skill location to .claude/commands/.
Implement the /audit-js-assets command that sweeps a publisher page via
Chrome DevTools MCP, detects third-party JS assets, and generates
js-assets.toml entries. Includes a shared slug generation script
(SHA-256 + base62) and adds MCP permission grants for navigate_page,
list_network_requests, and close_page.
Move URL normalization, filtering, wildcard detection, slug generation,
and TOML formatting into scripts/audit-js-assets.mjs. The skill now
collects raw browser data and delegates processing to the script,
replacing fragile LLM-side URL manipulation.

Expand heuristic filter with Google ad rendering, ad fraud detection,
ad verification, and reCAPTCHA categories. Auto-include target URL host
as first-party. Add --no-filter flag. Fix semver regex to match alpha
suffixes like 1.19.8-hcskhn.
Replace MCP-driven browser automation with a standalone Playwright CLI
at tools/js-asset-auditor/audit.mjs. One command sweeps a publisher
page, collects script URLs, processes them through the shared pipeline,
and writes js-assets.toml.

Refactor scripts/audit-js-assets.mjs to export processAssets() so both
the stdin-based pipeline and the Playwright CLI share the same processing
logic. Simplify the Claude skill from 115 to 59 lines — it now calls the
CLI and formats the JSON summary.
Rewrite sweep protocol, implementation, and verification sections to
describe the three-component architecture: Playwright CLI, processing
library, and Claude Code skill wrapper. Add direct CLI invocation
examples, --headed flag, first-party auto-detection verification, and
ad-rendering filter verification steps.
Restructure into packages/js-asset-auditor/ as a self-contained Claude
Code plugin with .claude-plugin/plugin.json manifest, skills/ directory,
bin/ executable, and lib/ processing modules. The plugin provides the
audit-js-assets skill and CLI automatically when enabled.

Remove tools/js-asset-auditor/, scripts/audit-js-assets.mjs, and
.claude/commands/audit-js-assets.md — all replaced by the plugin.
Enables installing the JS Asset Auditor plugin from this repo via
/plugin marketplace add <org>/trusted-server followed by
/plugin install js-asset-auditor.
Add --domain flag and fall back to inferring from the target URL when
trusted-server.toml is not present. Enables using the plugin in any
project without project-specific config.
Reflect the plugin layout at packages/js-asset-auditor/, update all file
paths, document the domain resolution fallback chain (--domain flag >
trusted-server.toml > infer from URL), and update skill invocation to
use the namespaced /js-asset-auditor:audit-js-assets format.
New --config [path] flag auto-detects integrations (GPT, GTM, Didomi,
DataDome, Lockr, Permutive, Prebid, APS) from swept script URLs and
generates a trusted-server.toml with appropriate [integrations.*]
sections. Auto-extracts fields like GTM container_id from query params
and Permutive org/workspace IDs from URL paths. Fields needing manual
input are marked with TODO comments.
Switch from headless-by-default to headed-by-default. Sites with bot
protection (DataDome, Cloudflare, etc.) block headless browsers. The
--headed flag becomes --headless for CI/automation use cases.
Copy link
Copy Markdown
Collaborator

@aram356 aram356 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Additive Claude Code plugin (packages/js-asset-auditor/) with a Playwright-based CLI for sweeping publisher pages, a processing library, and integration auto-detection. No Rust changes. Review uncovered three blocking issues: the format-docs CI gate is red, the generated trusted-server.toml emits an invalid bidders = "" for Prebid, and the CLI arg parser silently consumes the next flag when a value is omitted. A handful of non-blocking refactor/hardening suggestions follow.

Blocking

🔧 wrench

  • format-docs CI failing: prettier wants table column realignment in docs/superpowers/specs/2026-04-01-js-asset-auditor-design.md (heuristic filter table and detection patterns table). Fix: cd docs && npx prettier --write superpowers/specs/2026-04-01-js-asset-auditor-design.md. (inline at line 100)
  • Generated bidders = "" is invalid for Rust Prebid config: PrebidConfig::bidders is Vec<String> (crates/trusted-server-core/src/integrations/prebid.rs:69). Users who flip enabled = true on a generated [integrations.prebid] block get a deserialization failure. (inline at detect.mjs:266)
  • --first-party / --settle / --output / --domain consume the next flag: passing a value-taking flag without a value silently swallows the following arg. Reproduced. (inline at audit.mjs:84)

Non-blocking

♻️ refactor

  • Slug algorithm duplicated in scripts/js-asset-slug.mjs — make it import from packages/js-asset-auditor/lib/process.mjs. Silent drift here breaks KV lookups against the Rust proxy. (inline at scripts/js-asset-slug.mjs:78)
  • Stale help block in audit.mjs — references --headed (not implemented; actual flag is --headless) and omits --config, --force, --domain. (inline at audit.mjs:18)

🤔 thinking

  • readPublisherDomain conflates parse errors with missing file — a malformed but present trusted-server.toml silently falls through to URL-inferred domain, producing wrong slugs (publisher prefix depends on domain). Distinguish ENOENT from parse failure. (inline at audit.mjs:142)
  • --config existence check uses readFileSync — use fs.existsSync. (inline at audit.mjs:238)
  • No unit tests for process.mjs / detect.mjs — both have nontrivial logic (wildcard regex, first-party boundary, heuristic filter with two match shapes, integration field extraction). Vitest already runs in crates/js/lib; adding fixtures there (or alongside the plugin) would catch drift, especially for the slug hash that must match the Rust proxy.

⛏ nitpick

  • GTM match uses pathname.includes("/gtm.js").endsWith("/gtm.js") removes theoretical ambiguity. (inline at detect.mjs:37)
  • formatTomlValue doesn't escape " / \ — fine today since inputs are static, but JSON.stringify on the string branch is free escape handling for future patterns. (inline at detect.mjs:226)

🌱 seedling

  • Cross-language slug fixture test — the spec claims the JS slug must produce identical output to the Rust proxy's KV key derivation, but the Rust proxy lives on a separate branch. Once it lands, a shared-fixture test (same {domain, url} → same slug in both JS and Rust) is the only reliable guard against silent drift. Worth a follow-up issue.

CI Status

  • cargo fmt: PASS
  • cargo clippy: PASS
  • rust tests: PASS
  • vitest (crates/js/lib): PASS
  • format-typescript: PASS
  • format-docs: FAIL (see 🔧 above)

Comment thread packages/js-asset-auditor/lib/detect.mjs Outdated
Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated
Comment thread docs/superpowers/specs/2026-04-01-js-asset-auditor-design.md Outdated
Comment thread scripts/js-asset-slug.mjs Outdated
Comment thread packages/js-asset-auditor/lib/audit.mjs
Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated
Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated
Comment thread packages/js-asset-auditor/lib/detect.mjs Outdated
Comment thread packages/js-asset-auditor/lib/detect.mjs Outdated
Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated
Copy link
Copy Markdown
Collaborator

@prk-Jr prk-Jr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

This plugin is close, but there are three blocking behavior issues in the CLI/config pipeline that can mislead operators or produce unusable output. CI is green, but these cases need to be fixed before the tool is safe to rely on for onboarding and diff workflows.

Blocking

🔧 wrench

Comment thread packages/js-asset-auditor/lib/detect.mjs Outdated
Comment thread packages/js-asset-auditor/lib/process.mjs
Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated
Copy link
Copy Markdown
Collaborator

@aram356 aram356 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Re-review against main after 66951b9. That commit cleanly addresses every finding from my prior review (format-docs, bidders = "", arg-parser flag-swallowing, slug duplication, stale help block, existsSync, GTM endsWith, TOML escaping, ENOENT-vs-parse distinction), with focused regression tests for each.

However, the 2026-04-22 review from @prk-Jr is not addressed by the latest push — I confirmed all three of their blockers in the worktree — and I found two additional blocking-class issues while re-reading the 18 changed files.

Blocking

🔧 wrench

  • enabled = true for partial / detect_only integrations — generated config boot-fails on missing required fields (server_url, pub_id, app_id). packages/js-asset-auditor/lib/detect.mjs:260 (inline).
  • Diff mode re-appends the same NEW assets on repeated runs — reproduced: same asset appears 6× after 3 runs because parseExistingToml ignores commented suggestions. packages/js-asset-auditor/lib/process.mjs:214 (inline).
  • --config collision exits 0 with success summary — violates spec, breaks automation. packages/js-asset-auditor/lib/audit.mjs:269 (inline).
  • --config default path collides with live trusted-server.toml — with --force, silently overwrites the real publisher config with a TODO skeleton. packages/js-asset-auditor/lib/audit.mjs:136 (inline).
  • Plugin unit tests are not wired into CI.github/workflows/test.yml only runs vitest in crates/js/lib. The 8 regression tests added in 66951b9 (the sole guard against drift on slug algorithm, integration detection, arg parsing) run locally only, so regressions will land on main unnoticed. Add a step that runs cd packages/js-asset-auditor && npm ci && npm test on PRs touching this directory, or unconditionally.

Non-blocking

🤔 thinking

  • No diff-idempotence testprocessAssets(..., { diff: true }) run twice against its own output should yield summary.new.length === 0. A fixture test in packages/js-asset-auditor/test/process.test.mjs would have caught the re-append regression and will keep catching it.
  • SKILL.md still says "headless browser"packages/js-asset-auditor/skills/audit-js-assets/SKILL.md:24. The CLI has defaulted to headed since e0c7e0c. Misleading for sandbox users.
  • data: / blob: script URLs produce "null" origin in slugsnew URL("data:…").origin === "null" so applyWildcards yields null/<pathname>. Rare for <script src>, but one protocol guard is cheap.

♻️ refactor

  • readPublisherDomain hand-rolls a TOML scan (lib/audit.mjs:37-61) — rejects single-quoted strings and any non-^domain = "…" shape. A small TOML lib would be more robust. Non-urgent.

⛏ nitpick

  • APS pattern uses pathname.includes("/apstag") (lib/detect.mjs:152) — same class of ambiguity as the GTM case tightened in 66951b9. Consider anchoring.

CI Status

  • cargo fmt: PASS
  • cargo clippy / Analyze (rust): PASS
  • cargo test: PASS
  • vitest (crates/js/lib): PASS
  • format-typescript: PASS
  • format-docs: PASS
  • browser integration tests: PASS
  • CodeQL: PASS
  • plugin tests (packages/js-asset-auditor/test): NOT WIRED INTO CI — see blocking #5

Comment thread packages/js-asset-auditor/lib/detect.mjs Outdated
Comment thread packages/js-asset-auditor/lib/process.mjs
Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated
Comment thread packages/js-asset-auditor/lib/audit.mjs Outdated
Copy link
Copy Markdown
Collaborator

@aram356 aram356 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 Prints test.publisher.com in place of actual domain because it reads from toml file

🔧 Please add guide documentation

@ChristianPavilonis
Copy link
Copy Markdown
Collaborator Author

Addressed the remaining review feedback in 00ce787 and replied/resolved the inline threads:

  • incomplete integrations are no longer auto-enabled
  • diff mode is now idempotent across repeated runs
  • --config collisions now fail non-zero before the success summary
  • no-arg --config now writes to trusted-server.generated.toml
  • plugin tests are wired into CI
  • added docs/guide/js-asset-auditor.md
  • clarified domain precedence and source selection in the CLI/skill/docs

Validation run after the changes:

  • cd packages/js-asset-auditor && npm test
  • cd docs && npm run format
  • cargo test --workspace
  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings

Re-requesting review from @aram356 and @prk-Jr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement JS Auditor - claude code

4 participants