Allow private addresses for explicit URLs in fedify lookup#698
Allow private addresses for explicit URLs in fedify lookup#698dahlia merged 4 commits intofedify-dev:2.1-maintenancefrom
Conversation
Since v2.1, `fedify lookup` rejected localhost URLs unless
`-p`/`--allow-private-address` was passed, because the CLI began
forwarding `allowPrivateAddress=false` to the vocab-runtime document
loader, whose `validatePublicUrl` check blocks loopback addresses.
Split the document/auth loaders into two:
- An "initial" loader that always allows private addresses, used for
URLs explicitly provided on the command line (plain lookup, the
first fetch in `--traverse`, and the first fetch in `--recurse`).
- The existing loader, which continues to honor
`--allow-private-address`, used for URLs discovered from remote
responses (traversal pages and recursion targets).
This preserves SSRF protection against `http://localhost/...` URLs
embedded in remote `first`/`next`/`inReplyTo` fields while letting
users look up local servers without extra flags.
Assisted-by: Claude Code:claude-opus-4-7
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe lookup command now distinguishes between user-provided URLs and discovered URLs during traversal. User-provided URLs always bypass private-address restrictions using dedicated initial loaders, while discovered URLs during recursion/traversal respect the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Code Review
This pull request updates the lookup command to allow private IP addresses for URLs explicitly provided by the user, while ensuring that URLs discovered via traversal or recursion still respect the --allow-private-address flag. It introduces dedicated initial loaders for the primary lookup phase. Feedback indicates a potential inconsistency where the recursion logic may still hardcode private address restrictions, which would conflict with the updated documentation for the CLI option.
| description: message`Allow private IP addresses for URLs discovered \ | ||
| via traversal or recursion. URLs explicitly provided \ | ||
| on the command line always allow private addresses.`, |
There was a problem hiding this comment.
The updated description for the --allow-private-address option states that it applies to URLs discovered via both traversal and recursion. However, while the traversal logic (lines 1185-1186) correctly follows this option, the recursion logic (lines 884-919 in the full file) still hardcodes allowPrivateAddress: false for its document and auth loaders. This creates a precedence issue where the hardcoded value overrides the CLI configuration. To ensure consistency with the description and the PR's stated goal, the recurse block should be updated to use loaders that honor the command.allowPrivateAddress option instead of hardcoding false.
References
- When handling CLI options that bind to configuration, ensure the implementation correctly accounts for the precedence between CLI arguments and hardcoded defaults to avoid the CLI flag being ignored.
There was a problem hiding this comment.
Pull request overview
Adjusts the fedify lookup CLI to allow loopback/private addresses for explicit URLs passed on the command line (restoring pre-v2.1 ergonomics) while keeping SSRF protections for URLs discovered during traversal/recursion.
Changes:
- Introduces “initial” document/auth loaders that always set
allowPrivateAddress: truefor user-supplied URLs. - Keeps the existing loaders (honoring
--allow-private-address) for URLs discovered through traversal/recursion. - Updates the
--allow-private-addressflag description to reflect the new behavior.
| via traversal or recursion. URLs explicitly provided \ | ||
| on the command line always allow private addresses.`, |
| // URLs discovered via traversal or recursion follow the option, since they | ||
| // originate from remote responses and must be protected against SSRF. |
| const initialBaseDocumentLoader = await getDocumentLoader({ | ||
| userAgent: command.userAgent, | ||
| allowPrivateAddress: true, | ||
| }); | ||
| const initialDocumentLoader = wrapDocumentLoaderWithTimeout( | ||
| initialBaseDocumentLoader, | ||
| command.timeout, | ||
| ); |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/cli/src/lookup.ts (1)
720-870: 🧹 Nitpick | 🔵 TrivialConsider extracting helpers to avoid duplicating loader construction.
The "initial" loaders now duplicate the full option bags used by the existing loaders, differing only in
allowPrivateAddress.baseAuthLoader(lines 835–848) andinitialBaseAuthLoader(lines 853–866) in particular repeat the entirespecDeterminerobject verbatim, which is easy to get out of sync on a future change (e.g., iffirstKnock/userAgenthandling changes, you'd have to update two places).A small helper keyed on
allowPrivateAddresswould localize the policy choice and make the intent clearer at the call sites:♻️ Proposed refactor
+ const makeDocumentLoader = async (allowPrivateAddress: boolean) => + wrapDocumentLoaderWithTimeout( + await getDocumentLoader({ + userAgent: command.userAgent, + allowPrivateAddress, + }), + command.timeout, + ); + - const initialBaseDocumentLoader = await getDocumentLoader({ - userAgent: command.userAgent, - allowPrivateAddress: true, - }); - const initialDocumentLoader = wrapDocumentLoaderWithTimeout( - initialBaseDocumentLoader, - command.timeout, - ); - const baseDocumentLoader = await getDocumentLoader({ - userAgent: command.userAgent, - allowPrivateAddress: command.allowPrivateAddress, - }); - const documentLoader = wrapDocumentLoaderWithTimeout( - baseDocumentLoader, - command.timeout, - ); + const initialDocumentLoader = await makeDocumentLoader(true); + const documentLoader = await makeDocumentLoader(command.allowPrivateAddress);And similarly for the authenticated loader:
+ const makeAuthLoader = (allowPrivateAddress: boolean) => + wrapDocumentLoaderWithTimeout( + getAuthenticatedDocumentLoader(authIdentity!, { + allowPrivateAddress, + userAgent: command.userAgent, + specDeterminer: { + determineSpec: () => command.firstKnock, + rememberSpec() {}, + }, + }), + command.timeout, + ); + authLoader = makeAuthLoader(command.allowPrivateAddress); + initialAuthLoader = makeAuthLoader(true);The same helper can be reused to build
recursiveAuthLoader(currently at lines 900–919) withfalse.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/cli/src/lookup.ts` around lines 720 - 870, The loader construction is duplicated between "initial" and regular loaders (e.g., initialBaseDocumentLoader/baseDocumentLoader, initialBaseAuthLoader/baseAuthLoader, initialAuthLoader/authLoader) differing only by allowPrivateAddress and causing repeated specDeterminer and option bags; refactor by extracting small factory helpers (e.g., makeDocumentLoader(optionsAllowPrivate: boolean) and makeAuthenticatedDocumentLoader(allowPrivate: boolean)) that call getDocumentLoader/getContextLoader/getAuthenticatedDocumentLoader with the common option fields (userAgent, timeout handling, specDeterminer that returns command.firstKnock, etc.), then replace direct calls to initialBaseDocumentLoader/baseDocumentLoader and baseAuthLoader/initialBaseAuthLoader with calls to these helpers (and wrap with wrapDocumentLoaderWithTimeout where needed) so the only difference at call sites is the boolean allowPrivateAddress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/cli/src/lookup.ts`:
- Around line 86-88: Update the help text for the -p/--allow-private-address
flag (the description message string) to explicitly state that the flag only
affects URLs discovered via --traverse or --recurse and is effectively a no-op
for plain lookup, which uses initialAuthLoader ?? initialDocumentLoader for all
fetches; reference -p/--allow-private-address, the description message template,
and the plain lookup code path using initialAuthLoader/initialDocumentLoader so
users know the flag only applies to traversal/recursion.
---
Outside diff comments:
In `@packages/cli/src/lookup.ts`:
- Around line 720-870: The loader construction is duplicated between "initial"
and regular loaders (e.g., initialBaseDocumentLoader/baseDocumentLoader,
initialBaseAuthLoader/baseAuthLoader, initialAuthLoader/authLoader) differing
only by allowPrivateAddress and causing repeated specDeterminer and option bags;
refactor by extracting small factory helpers (e.g.,
makeDocumentLoader(optionsAllowPrivate: boolean) and
makeAuthenticatedDocumentLoader(allowPrivate: boolean)) that call
getDocumentLoader/getContextLoader/getAuthenticatedDocumentLoader with the
common option fields (userAgent, timeout handling, specDeterminer that returns
command.firstKnock, etc.), then replace direct calls to
initialBaseDocumentLoader/baseDocumentLoader and
baseAuthLoader/initialBaseAuthLoader with calls to these helpers (and wrap with
wrapDocumentLoaderWithTimeout where needed) so the only difference at call sites
is the boolean allowPrivateAddress.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4f9db069-6a31-47de-ba22-0d21beca161f
📒 Files selected for processing (1)
packages/cli/src/lookup.ts
| description: message`Allow private IP addresses for URLs discovered \ | ||
| via traversal or recursion. URLs explicitly provided \ | ||
| on the command line always allow private addresses.`, |
There was a problem hiding this comment.
Help text is accurate, but note -p is effectively a no-op in plain lookup mode.
With the new split, the flag only affects URLs reached via --traverse (next/prev pages) and --recurse (relationship targets). In plain lookup (the non-traverse, non-recurse path at line 1283), every fetch now goes through initialAuthLoader ?? initialDocumentLoader, so -p/--allow-private-address has no observable effect there.
The current wording is technically correct, but a user running fedify lookup <url> -p in plain mode won't get any different behavior than without -p. Consider whether that's acceptable or whether the help text/docs should call it out explicitly (e.g., "only meaningful with --traverse or --recurse").
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/cli/src/lookup.ts` around lines 86 - 88, Update the help text for
the -p/--allow-private-address flag (the description message string) to
explicitly state that the flag only affects URLs discovered via --traverse or
--recurse and is effectively a no-op for plain lookup, which uses
initialAuthLoader ?? initialDocumentLoader for all fetches; reference
-p/--allow-private-address, the description message template, and the plain
lookup code path using initialAuthLoader/initialDocumentLoader so users know the
flag only applies to traversal/recursion.
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
The option's help text now notes that it only has an effect when combined with -t/--traverse or --recurse, since URLs explicitly provided on the command line always allow private addresses. fedify-dev#696 fedify-dev#698 Assisted-by: Claude Code:claude-opus-4-7
Update CHANGES.md with the 2.1.6 entry for the regression fix, and rewrite the -p/--allow-private-address section in docs/cli.md to reflect that URLs given on the command line always allow private addresses while the option gates URLs discovered via --traverse or --recurse. fedify-dev#696 fedify-dev#698 Assisted-by: Claude Code:claude-opus-4-7
The previous help text, inline comment, changelog entry, and CLI docs all stated that -p/--allow-private-address affects URLs discovered via --recurse, but recursive fetches hardcode `allowPrivateAddress: false` and never honor the option. Update the CLI help, the loader-split comment in `runLookup`, the 2.1.6 changelog entry, and the `-p` section of `docs/cli.md` so that they all describe the same actual behavior: the option only gates URLs discovered during --traverse, while --recurse always disallows private addresses. fedify-dev#696 fedify-dev#698 Assisted-by: Claude Code:claude-opus-4-7
Closes #696.
Since v2.1,
fedify lookuprejected localhost URLs unless-p/--allow-private-addresswas passed, because the CLI began forwardingallowPrivateAddress=falseto the vocab-runtime document loader, whosevalidatePublicUrlcheck blocks loopback addresses.Split the document/auth loaders into two:
--traverse, and the first fetch in--recurse).--allow-private-address, used for URLs discovered from remote responses (traversal pages and recursion targets).This preserves SSRF protection against
http://localhost/...URLs embedded in remotefirst/next/inReplyTofields while letting users look up local servers without extra flags.Assisted-by: Claude Code:claude-opus-4-7