fix(lifecycle): prevent hello rate-limit death spiral on secret-rotation failure #465
Open
linboxin wants to merge 1 commit into EvoMap:main from
…ion failure

Fixes the regression described in EvoMap#349 still present in v1.69.x. Three client-side bugs combined to exhaust the 60 hello/hour rate limit:

1. hello() did not check res.ok before processing the response body. A 429 reply fell through and returned { ok: true } with no secret, making reAuthenticate() believe the call succeeded.
2. reAuthenticate() used `continue` when hello returned ok but no node_secret. Because of bug (1) this path was hit on every 429, so all MAX_REAUTH_ATTEMPTS were consumed with no back-off.
3. After exhausting attempts reAuthenticate() returned false with no cooldown, so the next heartbeat tick re-entered it immediately.

Fix:

- hello() now checks !res.ok first; 429 sets _helloRateLimitUntil (honouring Retry-After, defaulting to 3600 s) and returns { ok: false, error: 'hello_rate_limited' }. Subsequent calls within the window are suppressed before any network I/O.
- reAuthenticate() breaks (not continues) when hub returns ok but no secret; retrying won't fix a hub-side rotation failure.
- reAuthenticate() also breaks immediately on hello_rate_limited / hello_rate_limit_active errors.
- On exhausting all attempts, _reauthBackoffUntil is set for 30 min, blocking re-entry from heartbeat ticks or proxy HTTP callers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
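The hello() guard described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in src/proxy/lifecycle/manager.js: the HelloClient class, the parseRetryAfterMs helper, the injected fetchImpl/now/url parameters, and the hello_http_<status> error string are all hypothetical. Returning hello_rate_limit_active for suppressed calls is an assumption based on the error names listed above.

```javascript
// Hypothetical sketch of the hello() rate-limit guard; names other than
// _helloRateLimitUntil and the two rate-limit error strings are illustrative.
const DEFAULT_RATE_LIMIT_SECONDS = 3600;

// Retry-After is either delta-seconds or an HTTP date.
// Missing, unparseable, or past values fall back to the default window
// (the fallback for past dates is a choice made for this sketch).
function parseRetryAfterMs(headerValue, nowMs) {
  const fallback = DEFAULT_RATE_LIMIT_SECONDS * 1000;
  if (headerValue == null || headerValue === '') return fallback;
  const trimmed = String(headerValue).trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (!Number.isNaN(dateMs) && dateMs > nowMs) return dateMs - nowMs;
  return fallback;
}

class HelloClient {
  constructor({ fetchImpl, url, now = Date.now }) {
    this._fetch = fetchImpl; // e.g. the global fetch in Node 18+
    this._url = url;         // hub hello endpoint, supplied by the caller
    this._now = now;         // injectable clock, useful in tests
    this._helloRateLimitUntil = 0;
  }

  async hello() {
    // Suppress the call before any network I/O while the window is open.
    if (this._now() < this._helloRateLimitUntil) {
      return { ok: false, error: 'hello_rate_limit_active' };
    }
    const res = await this._fetch(this._url, { method: 'POST' });
    // Check the status before touching the body, so a 429 can never
    // be reported as { ok: true } with no secret.
    if (!res.ok) {
      if (res.status === 429) {
        const waitMs = parseRetryAfterMs(res.headers.get('retry-after'), this._now());
        this._helloRateLimitUntil = this._now() + waitMs;
        return { ok: false, error: 'hello_rate_limited' };
      }
      return { ok: false, error: `hello_http_${res.status}` };
    }
    const body = await res.json();
    return { ok: true, node_secret: body.node_secret };
  }
}
```

Injecting the clock and the fetch function keeps the guard testable without a live hub or real waiting.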
Summary
Fixes the node_secret_invalid → rate-limit death spiral still present in v1.69.12+. Three client-side bugs in src/proxy/lifecycle/manager.js caused hello() to be called in an unbounded loop until the 60/hour rate limit was exhausted.
What changed
- hello() now checks !res.ok before parsing the response body. A 429 sets _helloRateLimitUntil (honouring Retry-After, defaulting to 3600 s) and returns { ok: false, error: 'hello_rate_limited' }. Subsequent calls within the window are suppressed before any network I/O.
- reAuthenticate() now breaks (not continues) when the hub returns ok but no node_secret; retrying a hub-side rotation failure only burns hello quota.
- reAuthenticate() breaks immediately on rate-limit errors from hello().
- Once all attempts are exhausted, _reauthBackoffUntil is set for 30 minutes, blocking re-entry from heartbeat ticks and proxy HTTP callers.
How to test
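The re-auth failure path can also be exercised without a live hub by stubbing hello(). The following is a minimal sketch under stated assumptions, not the code in manager.js: createReauthenticator, demoRotationFailure, and the value chosen for MAX_REAUTH_ATTEMPTS are hypothetical, and setting the back-off after an early break (not only after the loop runs out) is an assumption consistent with the 1800 s expectation in the checks.

```javascript
// Hypothetical harness for the reAuthenticate() failure path.
const MAX_REAUTH_ATTEMPTS = 3;            // assumed value, for illustration only
const REAUTH_BACKOFF_MS = 30 * 60 * 1000; // 30 minutes = 1800 s

function createReauthenticator({ hello, now = Date.now }) {
  const state = { reauthBackoffUntil: 0, nodeSecret: null };

  async function reAuthenticate() {
    // Block re-entry (heartbeat ticks, proxy callers) during the cooldown.
    if (now() < state.reauthBackoffUntil) return false;

    for (let attempt = 1; attempt <= MAX_REAUTH_ATTEMPTS; attempt++) {
      const res = await hello();
      if (res.ok && res.node_secret) {
        state.nodeSecret = res.node_secret;
        return true;
      }
      // ok but no secret: a hub-side rotation failure, retrying only burns quota.
      if (res.ok) break;
      // Rate-limit errors: further hello() calls cannot succeed in this window.
      if (res.error === 'hello_rate_limited' || res.error === 'hello_rate_limit_active') break;
      // Any other error: fall through and use the next attempt.
    }

    state.reauthBackoffUntil = now() + REAUTH_BACKOFF_MS;
    return false;
  }

  return { reAuthenticate, state };
}

// Example: a stub hub that answers ok without a node_secret.
async function demoRotationFailure() {
  let calls = 0;
  const hello = async () => { calls++; return { ok: true }; };
  const { reAuthenticate } = createReauthenticator({ hello, now: () => 0 });
  const first = await reAuthenticate();  // one hello() call, then cooldown
  const second = await reAuthenticate(); // suppressed by the cooldown
  return { first, second, calls };       // resolves to { first: false, second: false, calls: 1 }
}
```

With the stub above, hello() is called exactly once across both reAuthenticate() calls, which mirrors the "fire once, then back off" check.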
- node index.js: confirm no errors on startup
- hello() should fire once, then back off for 1800 s
- _helloRateLimitUntil should be set, subsequent calls suppressed
Risk
Low — only affects the re-auth failure path; normal heartbeat flow is unchanged.
Related
Closes #464