Add JS Asset Auditor plugin with Playwright CLI#633
Add JS Asset Auditor plugin with Playwright CLI#633ChristianPavilonis wants to merge 14 commits intomainfrom
Conversation
Engineering spec for the /audit-js-assets . Covers sweep protocol, Chrome DevTools MCP tooling, heuristic filtering, slug generation, init and diff modes. Closes #606
Fix incorrect MCP tool name prefix, replace misused wait_for with
evaluate_script setTimeout, correct list_network_requests filtering to
use resourceTypes, resolve path derivation contradiction with consistent
/js-assets/{prefix}/{stem}.js formula, pin slug separator and base62
charset, add URL Processing section with normalization rules and
first-party boundary definition, tighten wildcard regex to require mixed
character classes, and move skill location to .claude/commands/.
Implement the /audit-js-assets command that sweeps a publisher page via Chrome DevTools MCP, detects third-party JS assets, and generates js-assets.toml entries. Includes a shared slug generation script (SHA-256 + base62) and adds MCP permission grants for navigate_page, list_network_requests, and close_page.
Move URL normalization, filtering, wildcard detection, slug generation, and TOML formatting into scripts/audit-js-assets.mjs. The skill now collects raw browser data and delegates processing to the script, replacing fragile LLM-side URL manipulation. Expand heuristic filter with Google ad rendering, ad fraud detection, ad verification, and reCAPTCHA categories. Auto-include target URL host as first-party. Add --no-filter flag. Fix semver regex to match alpha suffixes like 1.19.8-hcskhn.
Replace MCP-driven browser automation with a standalone Playwright CLI at tools/js-asset-auditor/audit.mjs. One command sweeps a publisher page, collects script URLs, processes them through the shared pipeline, and writes js-assets.toml. Refactor scripts/audit-js-assets.mjs to export processAssets() so both the stdin-based pipeline and the Playwright CLI share the same processing logic. Simplify the Claude skill from 115 to 59 lines — it now calls the CLI and formats the JSON summary.
Rewrite sweep protocol, implementation, and verification sections to describe the three-component architecture: Playwright CLI, processing library, and Claude Code skill wrapper. Add direct CLI invocation examples, --headed flag, first-party auto-detection verification, and ad-rendering filter verification steps.
Restructure into packages/js-asset-auditor/ as a self-contained Claude Code plugin with .claude-plugin/plugin.json manifest, skills/ directory, bin/ executable, and lib/ processing modules. The plugin provides the audit-js-assets skill and CLI automatically when enabled. Remove tools/js-asset-auditor/, scripts/audit-js-assets.mjs, and .claude/commands/audit-js-assets.md — all replaced by the plugin.
Enables installing the JS Asset Auditor plugin from this repo via /plugin marketplace add <org>/trusted-server followed by /plugin install js-asset-auditor.
Add --domain flag and fall back to inferring from the target URL when trusted-server.toml is not present. Enables using the plugin in any project without project-specific config.
Reflect the plugin layout at packages/js-asset-auditor/, update all file paths, document the domain resolution fallback chain (--domain flag > trusted-server.toml > infer from URL), and update skill invocation to use the namespaced /js-asset-auditor:audit-js-assets format.
New --config [path] flag auto-detects integrations (GPT, GTM, Didomi, DataDome, Lockr, Permutive, Prebid, APS) from swept script URLs and generates a trusted-server.toml with appropriate [integrations.*] sections. Auto-extracts fields like GTM container_id from query params and Permutive org/workspace IDs from URL paths. Fields needing manual input are marked with TODO comments.
Switch from headless-by-default to headed-by-default. Sites with bot protection (DataDome, Cloudflare, etc.) block headless browsers. The --headed flag becomes --headless for CI/automation use cases.
aram356
left a comment
There was a problem hiding this comment.
Summary
Additive Claude Code plugin (packages/js-asset-auditor/) with a Playwright-based CLI for sweeping publisher pages, a processing library, and integration auto-detection. No Rust changes. Review uncovered three blocking issues: the format-docs CI gate is red, the generated trusted-server.toml emits an invalid bidders = "" for Prebid, and the CLI arg parser silently consumes the next flag when a value is omitted. A handful of non-blocking refactor/hardening suggestions follow.
Blocking
🔧 wrench
- format-docs CI failing: prettier wants table column realignment in
docs/superpowers/specs/2026-04-01-js-asset-auditor-design.md(heuristic filter table and detection patterns table). Fix:cd docs && npx prettier --write superpowers/specs/2026-04-01-js-asset-auditor-design.md. (inline at line 100) - Generated
bidders = ""is invalid for Rust Prebid config:PrebidConfig::biddersisVec<String>(crates/trusted-server-core/src/integrations/prebid.rs:69). Users who flipenabled = trueon a generated[integrations.prebid]block get a deserialization failure. (inline atdetect.mjs:266) --first-party/--settle/--output/--domainconsume the next flag: passing a value-taking flag without a value silently swallows the following arg. Reproduced. (inline ataudit.mjs:84)
Non-blocking
♻️ refactor
- Slug algorithm duplicated in
scripts/js-asset-slug.mjs— make it import frompackages/js-asset-auditor/lib/process.mjs. Silent drift here breaks KV lookups against the Rust proxy. (inline atscripts/js-asset-slug.mjs:78) - Stale help block in
audit.mjs— references--headed(not implemented; actual flag is--headless) and omits--config,--force,--domain. (inline ataudit.mjs:18)
🤔 thinking
readPublisherDomainconflates parse errors with missing file — a malformed but presenttrusted-server.tomlsilently falls through to URL-inferred domain, producing wrong slugs (publisher prefix depends on domain). DistinguishENOENTfrom parse failure. (inline ataudit.mjs:142)--configexistence check usesreadFileSync— usefs.existsSync. (inline ataudit.mjs:238)- No unit tests for
process.mjs/detect.mjs— both have nontrivial logic (wildcard regex, first-party boundary, heuristic filter with two match shapes, integration field extraction). Vitest already runs incrates/js/lib; adding fixtures there (or alongside the plugin) would catch drift, especially for the slug hash that must match the Rust proxy.
⛏ nitpick
- GTM match uses
pathname.includes("/gtm.js")—.endsWith("/gtm.js")removes theoretical ambiguity. (inline atdetect.mjs:37) formatTomlValuedoesn't escape"/\— fine today since inputs are static, butJSON.stringifyon the string branch is free escape handling for future patterns. (inline atdetect.mjs:226)
🌱 seedling
- Cross-language slug fixture test — the spec claims the JS slug must produce identical output to the Rust proxy's KV key derivation, but the Rust proxy lives on a separate branch. Once it lands, a shared-fixture test (same
{domain, url}→ same slug in both JS and Rust) is the only reliable guard against silent drift. Worth a follow-up issue.
CI Status
- cargo fmt: PASS
- cargo clippy: PASS
- rust tests: PASS
- vitest (
crates/js/lib): PASS - format-typescript: PASS
- format-docs: FAIL (see 🔧 above)
prk-Jr
left a comment
There was a problem hiding this comment.
Summary
This plugin is close, but there are three blocking behavior issues in the CLI/config pipeline that can mislead operators or produce unusable output. CI is green, but these cases need to be fixed before the tool is safe to rely on for onboarding and diff workflows.
Blocking
🔧 wrench
- Generated configs enable integrations before required fields are present (packages/js-asset-auditor/lib/detect.mjs:260)
- Diff mode keeps re-appending the same NEW assets on repeated runs (packages/js-asset-auditor/lib/process.mjs:214)
--configconflict only logs an error and still exits successfully (packages/js-asset-auditor/lib/audit.mjs:266)
aram356
left a comment
There was a problem hiding this comment.
Summary
Re-review against main after 66951b9. That commit cleanly addresses every finding from my prior review (format-docs, bidders = "", arg-parser flag-swallowing, slug duplication, stale help block, existsSync, GTM endsWith, TOML escaping, ENOENT-vs-parse distinction), with focused regression tests for each.
However, the 2026-04-22 review from @prk-Jr is not addressed by the latest push — I confirmed all three of their blockers in the worktree — and I found two additional blocking-class issues while re-reading the 18 changed files.
Blocking
🔧 wrench
enabled = trueforpartial/detect_onlyintegrations — generated config boot-fails on missing required fields (server_url,pub_id,app_id).packages/js-asset-auditor/lib/detect.mjs:260(inline).- Diff mode re-appends the same NEW assets on repeated runs — reproduced: same asset appears 6× after 3 runs because
parseExistingTomlignores commented suggestions.packages/js-asset-auditor/lib/process.mjs:214(inline). --configcollision exits 0 with success summary — violates spec, breaks automation.packages/js-asset-auditor/lib/audit.mjs:269(inline).--configdefault path collides with livetrusted-server.toml— with--force, silently overwrites the real publisher config with a TODO skeleton.packages/js-asset-auditor/lib/audit.mjs:136(inline).- Plugin unit tests are not wired into CI —
.github/workflows/test.ymlonly runs vitest incrates/js/lib. The 8 regression tests added in 66951b9 (the sole guard against drift on slug algorithm, integration detection, arg parsing) run locally only, so regressions will land onmainunnoticed. Add a step that runscd packages/js-asset-auditor && npm ci && npm teston PRs touching this directory, or unconditionally.
Non-blocking
🤔 thinking
- No diff-idempotence test —
processAssets(..., { diff: true })run twice against its own output should yieldsummary.new.length === 0. A fixture test inpackages/js-asset-auditor/test/process.test.mjswould have caught the re-append regression and will keep catching it. - SKILL.md still says "headless browser" —
packages/js-asset-auditor/skills/audit-js-assets/SKILL.md:24. The CLI has defaulted to headed since e0c7e0c. Misleading for sandbox users. data:/blob:script URLs produce"null"origin in slugs —new URL("data:…").origin === "null"soapplyWildcardsyieldsnull/<pathname>. Rare for<script src>, but one protocol guard is cheap.
♻️ refactor
readPublisherDomainhand-rolls a TOML scan (lib/audit.mjs:37-61) — rejects single-quoted strings and any non-^domain = "…"shape. A small TOML lib would be more robust. Non-urgent.
⛏ nitpick
- APS pattern uses
pathname.includes("/apstag")(lib/detect.mjs:152) — same class of ambiguity as the GTM case tightened in 66951b9. Consider anchoring.
CI Status
- cargo fmt: PASS
- cargo clippy / Analyze (rust): PASS
- cargo test: PASS
- vitest (
crates/js/lib): PASS - format-typescript: PASS
- format-docs: PASS
- browser integration tests: PASS
- CodeQL: PASS
- plugin tests (
packages/js-asset-auditor/test): NOT WIRED INTO CI — see blocking #5
|
Addressed the remaining review feedback in 00ce787 and replied/resolved the inline threads:
Validation run after the changes:
|
Summary
packages/js-asset-auditor/with a standalone Playwright CLI that sweeps publisher pages for third-party JS assetstrusted-server.tomlconfig with--configflagtrusted-server.tomloptional with--domainflag for portabilityTry it out
1. Check out the branch
2. Install dependencies
3. Run the CLI directly
4. Use as a Claude Code plugin
Then in Claude Code:
CLI flags
--diffjs-assets.toml--settle <ms>--first-party <hosts>--domain <host>trusted-server.tomlor URL)--no-filter--headless--output <path>js-assets.toml)--config [path]trusted-server.tomlwith detected integrations--forceTest plan
js-assets.tomloutput--configand verify detected integrations in generatedtrusted-server.toml--diffagainst an existingjs-assets.tomland verify confirmed/new/missingtrusted-server.toml(e.g., from/tmp) and verify domain inference--configwithout--forceerrors when file already existsCloses #631