Best practices for production deployments #689

@dahlia

Fedify's documentation currently concentrates production-related content in a single page, docs/manual/deploy.md, which covers platform-specific setup (Node.js/Bun/Deno, Cloudflare Workers, Deno Deploy). That leaves two gaps:

  1. Platform coverage is thin and partly outdated. Deno Deploy went through a major renewal and the existing section no longer reflects current practice. Cloudflare Workers deserves more depth than the current page provides. There is no runtime/platform selection guide for operators choosing between Node.js, Bun, Deno, and Cloudflare Workers.
  2. No systematic operational guidance. There is no structured coverage of security hardening, reliability, or observability for operators running Fedify-based services in production.

We should restructure deployment docs as a top-level category and fill both gaps.

Proposed restructure

Lift Deployment out of Manual and make it a top-level nav category, alongside Installation, CLI, Tutorials, Manual, and References. The existing docs/manual/deploy.md becomes a redirect stub, mirroring docs/tutorial.md, which only links into docs/tutorial/basics.md.

  • docs/deploy.md: Landing page for the new top-level nav. Hosts the runtime/platform selection guide and links into the subpages below.
  • docs/deploy/traditional.md: Traditional long-running server deployments on Node.js, Bun, and Deno (migrated from docs/manual/deploy.md, lightly expanded).
  • docs/deploy/cloudflare-workers.md: Cloudflare Workers deployments, rewritten with more depth than the current section.
  • docs/deploy/deno-deploy.md: Deno Deploy, rewritten to reflect the recent platform renewal.
  • docs/deploy/security.md: Security hardening best practices.
  • docs/deploy/reliability.md: Reliability and operations best practices.
  • docs/deploy/observability.md: Observability in production.
  • docs/manual/deploy.md: Reduced to a redirect stub pointing at docs/deploy.md.

More topic pages (e.g. performance.md, keys.md, interop.md) can be added later if the initial three topic pages (security.md, reliability.md, observability.md) grow unwieldy.

Scope per page

Runtime and platform selection (docs/deploy.md)

Comparison of Node.js, Bun, Deno, Cloudflare Workers, and Deno Deploy as deployment targets. Covers:

  • Execution model differences (long-running process vs. isolate-per-request)
  • Persistence options available on each platform (KV store, message queue)
  • Operational tradeoffs: cold starts, execution time limits, binding-based configuration
  • Ecosystem considerations: npm compatibility, native module support
  • A decision-oriented summary: when each target is a natural fit

Traditional server deployments (docs/deploy/traditional.md)

  • Node.js with @hono/node-server adapter; process managers (PM2, systemd)
  • Bun and Deno built-in fetch-style servers
  • Key-value store and message queue selection for long-running servers
  • Key pair persistence across restarts

Cloudflare Workers (docs/deploy/cloudflare-workers.md)

  • nodejs_compat flag requirement and why it matters
  • Builder pattern and why Workers cannot use global federation instances
  • WorkersKvStore and WorkersMessageQueue usage
  • Manual queue handler wiring; queue ordering keys with orderingKv
  • wrangler configuration patterns: bindings, environments, secrets
  • Worked example of a functional ActivityPub server on Workers
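
To illustrate why the builder pattern matters on Workers: bindings such as KV namespaces are handed to each handler through its `env` argument, so nothing that depends on them can be constructed at module scope. The sketch below shows the general shape only. It is not Fedify's API; `AppBuilder`, `Env`, and `handleRequest` are hypothetical names, and a `Map` stands in for a real KV binding.

```typescript
// Hypothetical stand-in for platform bindings. On Workers, real bindings
// are supplied by the runtime through the `env` argument of each handler.
interface Env {
  KV: Map<string, string>; // stand-in for a KV namespace binding
}

type Handler = (env: Env, path: string) => string;

// The builder collects registrations at module scope, where no bindings
// exist yet, and defers everything binding-dependent to build().
class AppBuilder {
  private routes = new Map<string, Handler>();

  route(path: string, handler: Handler): this {
    this.routes.set(path, handler);
    return this;
  }

  build(env: Env): (path: string) => string {
    const routes = this.routes;
    return (path) => {
      const handler = routes.get(path);
      return handler ? handler(env, path) : "404";
    };
  }
}

// Module scope: safe, because nothing here touches bindings.
const builder = new AppBuilder().route(
  "/actor",
  (env) => env.KV.get("actor") ?? "unknown",
);

// Per invocation: the runtime passes `env`, and only now is the app built.
function handleRequest(path: string, env: Env): string {
  const app = builder.build(env);
  return app(path);
}
```

The same split applies to a queue handler: it receives `env` as well, so it would build its own instance from the shared builder rather than reuse a global one.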

Deno Deploy (docs/deploy/deno-deploy.md)

Full rewrite reflecting the recent Deno Deploy renewal. Covers platform-specific KV, queue, scheduling, and deployment patterns as they stand today.

Security (docs/deploy/security.md)

  • HTTP Signatures verification guarantees and common pitfalls
  • Rate limiting for inbox and public collections
  • Defending against malicious payloads: oversized objects, JSON-LD expansion bombs, recursive references
  • Restricting outgoing fetches: SSRF hardening, private-network blocking
  • Key pair lifecycle: generation timing, storage, rotation, revocation, recovery
  • TLS termination and Host header trust boundaries
  • Secrets management for database credentials and private keys
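
As one concrete shape for the "restricting outgoing fetches" item, here is a minimal sketch of a URL pre-check that rejects non-HTTPS schemes, `localhost`, and IPv4 literals in private or loopback ranges. It is an illustration under stated assumptions, not what Fedify ships: the function names are hypothetical, IPv6 literals are rejected wholesale for brevity, and the list of blocked ranges is not exhaustive.

```typescript
// Parse a dotted-quad IPv4 string into four octets, or return null.
function parseIpv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const n = Number(part);
    if (n > 255) return null;
    octets.push(n);
  }
  return octets;
}

// True when the host is an IPv4 literal in a range that should not be
// reachable from a federation fetch.
function isBlockedIpv4(host: string): boolean {
  const o = parseIpv4(host);
  if (o === null) return false; // not an IPv4 literal
  const [a, b] = o;
  return (
    a === 0 || // "this network"
    a === 10 || // RFC 1918 private range
    a === 127 || // loopback
    (a === 100 && b >= 64 && b <= 127) || // RFC 6598 shared address space
    (a === 169 && b === 254) || // link-local
    (a === 172 && b >= 16 && b <= 31) || // RFC 1918 private range
    (a === 192 && b === 168) // RFC 1918 private range
  );
}

// Pre-flight check for an outgoing fetch target.
function isFetchAllowed(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable input is never fetched
  }
  if (url.protocol !== "https:") return false;
  const host = url.hostname;
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host.startsWith("[")) return false; // IPv6 literal: rejected in this sketch
  return !isBlockedIpv4(host);
}
```

A literal-only check like this does not stop a public hostname from resolving to a private address, so the security page would also need to cover validating the resolved address at connection time.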

Reliability (docs/deploy/reliability.md)

  • Separating web traffic from background message processing
  • Inbox idempotency and duplicate delivery handling
  • Retry, backoff, and dead-lettering strategies for outbound deliveries
  • Graceful shutdown and in-flight task draining
  • Zero-downtime deployment and migration strategies; @context or actor-key changes are especially sensitive
  • Health checks and readiness probes
  • Queue depth monitoring and back-pressure
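
The retry, backoff, and dead-lettering item can be sketched as a small decision function: capped exponential backoff with full jitter, then a hand-off to a dead-letter path once the attempt budget is spent. The names (`RetryPolicy`, `nextStep`) and the numbers in the usage note are illustrative assumptions, not Fedify defaults.

```typescript
interface RetryPolicy {
  baseMs: number; // backoff ceiling for the first retry
  maxMs: number; // upper bound on any single delay
  maxAttempts: number; // total attempts allowed before dead-lettering
}

type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "dead-letter" };

// Decide what to do after `failedAttempts` consecutive failures (1-based).
// `random` is injectable so the jitter can be made deterministic in tests.
function nextStep(
  failedAttempts: number,
  policy: RetryPolicy,
  random: () => number = Math.random,
): RetryDecision {
  if (failedAttempts >= policy.maxAttempts) {
    return { action: "dead-letter" };
  }
  // The ceiling doubles with each failure, up to the cap.
  const ceiling = Math.min(
    policy.maxMs,
    policy.baseMs * 2 ** (failedAttempts - 1),
  );
  // Full jitter: pick uniformly below the ceiling so that many senders
  // retrying against the same remote inbox do not synchronize.
  return { action: "retry", delayMs: Math.floor(random() * ceiling) };
}
```

With `{ baseMs: 1000, maxMs: 60000, maxAttempts: 12 }`, the ceiling grows from one second to the one-minute cap by the seventh failure, and the twelfth failure is routed to the dead-letter path instead of being retried.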

Observability (docs/deploy/observability.md)

  • Log level and sampling strategy for production volume
  • Structured logging fields worth capturing (activity id, actor id, inbox id)
  • OpenTelemetry span conventions specific to ActivityPub operations
  • Core metrics: inbox throughput, delivery success rate, queue depth, signature verification failures
  • What to alert on, not just what to monitor
  • Tracing remote federation calls end-to-end
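
For the structured logging item, a minimal sketch of one JSON log line per delivery attempt that carries the activity, actor, and inbox identifiers. The field names and the `formatLogLine` helper are assumptions made for illustration; they are not an existing Fedify or LogTape convention.

```typescript
interface DeliveryLogFields {
  activityId: string; // id of the activity being delivered
  actorId: string; // id of the sending actor
  inboxId: string; // URL of the target inbox
  outcome: "delivered" | "failed";
  statusCode?: number; // HTTP status from the remote server, if any
}

// Serialize one delivery event as a single JSON line so that log
// pipelines can filter and aggregate on the identifier fields.
function formatLogLine(
  level: "info" | "warn" | "error",
  message: string,
  fields: DeliveryLogFields,
  now: Date = new Date(),
): string {
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    message,
    ...fields,
  });
}

// Example: a failed delivery that a dashboard could group by inboxId.
const exampleLine = formatLogLine("warn", "delivery failed", {
  activityId: "https://example.com/activities/1",
  actorId: "https://example.com/users/alice",
  inboxId: "https://remote.example/inbox",
  outcome: "failed",
  statusCode: 503,
});
console.log(exampleLine);
```

Keeping the identifiers as top-level fields rather than embedding them in the message makes it practical to derive metrics such as per-inbox failure rates from the logs.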

Deliverables

  1. Add a top-level Deployment nav entry in docs/.vitepress/config.mts. Remove the existing Deployment entry from the Manual nav.
  2. Write docs/deploy.md as the landing page with the runtime/platform selection guide.
  3. Migrate Node.js / Bun / Deno content from docs/manual/deploy.md into docs/deploy/traditional.md.
  4. Rewrite the Cloudflare Workers and Deno Deploy sections as dedicated pages (docs/deploy/cloudflare-workers.md, docs/deploy/deno-deploy.md) with expanded coverage.
  5. Write docs/deploy/security.md, docs/deploy/reliability.md, and docs/deploy/observability.md as guide-style documents, matching the tone of existing manual pages.
  6. Cross-link new pages from related manual pages (e.g. manual/log.md, manual/opentelemetry.md, manual/kv.md, manual/mq.md).
  7. Replace docs/manual/deploy.md with a redirect stub pointing at docs/deploy.md.
  8. Add an entry to CHANGES.md.

Out of scope

  • Support for new deployment platforms beyond those listed. Those belong in separate follow-up issues.
  • Reference-level API changes; this issue is documentation-only.

Labels: type/documentation (Improvements or additions to documentation)