perf(recs): O(N) hotspots, nested schemas, diagnostics → @dojoengine/recs 2.1.0 #1

Open
ponderingdemocritus wants to merge 6 commits into main from ponderingdemocritus/recs-perf-review

Summary

  • Cuts O(N) and O(N²) hotspots across Component, Indexer, Query, and overridable components. Headline: HasValue on Eternum's 216-field Resource is 30× faster (0.13 ms from 3.88 ms); removeOverride is O(K_entity) instead of O(N log N); full fixes list in packages/recs/CHANGELOG.md under the 2.1.0 entry.
  • Ports nested schema support from the previous @dojoengine/recs@2.0.13 fork (Schema = { [k]: Type | Schema }) so Dojo-generated schemas like WorldConfig.season_addresses_config type-check without casts.
  • Fixes latent silent bugs that affect Eternum today: nested-object HasValue now deep-compares (previously always returned false), BigInt-indexed components no longer crash on first write, setComponent(OverridableComponent, …) no longer throws on the Map-proxy, indexer no longer leaks empty buckets for churning unique values.
  • Adds diagnostics + soak test for long-running sessions: getDiagnostics(world) / getIndexerStats(component) for live health metrics, plus a B25 soak benchmark that asserts no linear-growth leaks across 10 churn cycles.
  • Adds a 29-benchmark harness with a committed baseline.json; B19–B24 model real Eternum component shapes (216-field Resource, 48-array ResourceArrival, BigInt-indexed Guild).
  • Adds publish tooling at packages/recs/scripts/publish-dojoengine.mjs that stages the built package as @dojoengine/recs@2.1.0 without touching the in-tree @latticexyz/recs manifest, so MUD monorepo consumers keep working unchanged.

Perf table (avg ms, Phase 0 baseline → final)

| ID | Hot path | Before | After | Δ |
|---|---|---|---|---|
| B21 | runQuery Has+HasValue(entity_id) on 200-entity 216-field Resource | 3.88 | 0.13 | −97% |
| B18 | createLocalCache 200 writes / 1k-entity component | 116.30 | 0.32 | −99.7% |
| B7-5k | removeOverride × 5000 on single entity | 259.91 | 59.47 | −77% |
| B16 | setComponent skipUpdateStream × 100k | 133.93 | 63.26 | −53% |
| B13 | getChildEntities d=4 b=10 (indexed) | 7.86 | 3.67 | −53% |
| B10-100k | runQuery 4-Has on 100k entities | 156.44 | 88.11 | −44% |
| B15 | defineQuery same-component in 2 fragments | 7.31 | 4.45 | −39% |
| B14 | defineQuery proxy, 100 updates on 10k matched | 1416 | 918 | −35% |

B26 (HasValue(WorldConfig, { season_addresses_config: {…} }), the nested-schema case): 0 → ~3 ops/sec at 10k lookups, because the previous behavior silently returned false on every check.

Correctness fixes (shipping regardless of perf)

  • Indexer: JSON.stringify replacer for BigInt → no more TypeError on Guild/GuildMember/Trade indexed components; key-collision fix for {x:"1/2",y:"3"} vs {x:"1",y:"2/3"}; empty-bucket GC stops the unbounded-Set-per-value leak.
  • Overridable: Map.prototype.* bound to target so setComponent(Overridable, …) works; null override honored in per-key proxy.
  • Query: nested-object deep-compare in HasValue/NotValue fixes silent misses on fresh object refs (huge for sync layers that re-emit snapshots).

New APIs

  • getDiagnostics(world) returns { entityCount, componentCount, components: [{ id, entitiesWithValue, indexerBuckets? }] }. Poll every N seconds to watch for linear growth in long-running sessions.
  • getIndexerStats(component) returns { bucketCount } for indexed components. Alert when buckets grow faster than entities.

Test plan

  • pnpm --filter @latticexyz/recs test — 82 passing (was 70)
  • pnpm --filter @latticexyz/recs test:bench — 29 passing, updates packages/recs/benchmarks/baseline.json
  • B25 soak asserts last-cycle time ≤ 3× first cycle, entityCount === 0 after all deletions, every component entitiesWithValue === 0, every indexer bucketCount === 0
  • pnpm --filter @latticexyz/react test:ci — 7 passing
  • packages/recs/scripts/publish-dojoengine.mjs dry-run emits clean @dojoengine/recs@2.1.0 tarball (12 files, 62.6 kB); real publish gated behind --yolo

Publishing

packages/recs/package.json stays @latticexyz/recs so the MUD monorepo is unaffected. scripts/publish-dojoengine.mjs stages a renamed copy and runs npm publish from the staging dir. Supports NPM_TOKEN env var (writes a staging-local .npmrc, never touches ~/.npmrc) and --otp for 2FA codes. Run from packages/recs/:

node scripts/publish-dojoengine.mjs                # dry-run
node scripts/publish-dojoengine.mjs --yolo         # real publish (ambient npm auth)
NPM_TOKEN=npm_xxx node scripts/publish-dojoengine.mjs --yolo

Commits

  • b20d58d0 perf(recs): cut O(N) hotspots in Component, Indexer, Query
  • 850c35ce perf(recs): debounce createLocalCache flushes, document Symbol.for retention
  • 878fe924 perf(recs): partial-key HasValue + deep-array equals for fat components
  • 45b94ddb fix(recs): indexer crash on BigInt fields; add BigInt-indexed benchmark
  • 80f69896 feat(recs): nested schemas, deep-compare valueEquals, diagnostics, soak bench
  • a27b1467 chore(recs): add @dojoengine/recs publish script + 2.1.0 CHANGELOG

Phase 0 lands an 18-benchmark baseline suite; Phases 1-2 fix the issues
it surfaced while preserving every public signature so react,
store-sync, and dev-tools keep working unmodified.

Indexer: schema-ordered, collision-free key (was Object.values().join('/')
which collided across e.g. {x:'1/2',y:'3'} vs {x:'1',y:'2/3'}); GC
empty buckets so unique-value components stop leaking; skip re-index
when the value didn't change; build the result Set in place.
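
The collision and the bucket cleanup can be illustrated with a minimal sketch. This is not the Indexer source: `joinKey` mirrors the old `Object.values().join('/')` key, `jsonKey` is an assumed shape for the schema-ordered JSON key, and `BucketIndex` is a hypothetical stand-in for the value-to-entities index.

```typescript
type IndexedValue = Record<string, string | number>;

// Old key: joins raw values, so a "/" inside a value is ambiguous.
function joinKey(value: IndexedValue): string {
  return Object.values(value).join("/");
}

// Assumed replacement: serialise values in schema order so field boundaries survive.
function jsonKey(schemaKeys: string[], value: IndexedValue): string {
  return JSON.stringify(schemaKeys.map((k) => value[k]));
}

// Hypothetical value -> entities index that deletes a bucket once it is empty.
class BucketIndex {
  private buckets = new Map<string, Set<string>>();

  add(key: string, entity: string): void {
    let bucket = this.buckets.get(key);
    if (!bucket) {
      bucket = new Set<string>();
      this.buckets.set(key, bucket);
    }
    bucket.add(entity);
  }

  remove(key: string, entity: string): void {
    const bucket = this.buckets.get(key);
    if (!bucket) return;
    bucket.delete(entity);
    if (bucket.size === 0) this.buckets.delete(key); // GC the empty bucket
  }

  get bucketCount(): number {
    return this.buckets.size;
  }
}

const a = { x: "1/2", y: "3" };
const b = { x: "1", y: "2/3" };
console.log(joinKey(a) === joinKey(b)); // true: both become "1/2/3"
console.log(jsonKey(["x", "y"], a) === jsonKey(["x", "y"], b)); // false
```

Without the `delete` in `remove`, a component whose indexed values are all unique would keep one empty Set per value it has ever seen.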

Component: cache the primary values-Map per component so hasComponent
and entities() stop allocating Object.values()[0] on every call; skip
the prevValue read in setComponent when skipUpdateStream is set
(bulk hydration); fix the Map-proxy bug that made
setComponent(Overridable, ...) throw; honour null overrides via
'in' instead of '!= null'.

overridableComponent: secondary overridesByEntity index drops
removeOverride from O(N log N) to O(K_entity); merged entities() and
per-key keys() build the merged Set in place and the keys() proxy now
correctly excludes overrides that don't define that schema key.
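
A rough sketch of the secondary-index idea, assuming override ids are unique. `OverrideStore` and its method names are hypothetical and only illustrate why removal no longer has to scan or sort every override.

```typescript
type Override<V> = { id: string; entity: string; value: V };

// Hypothetical store: the primary map is keyed by override id, and a
// secondary index groups override ids by entity.
class OverrideStore<V> {
  private overrides = new Map<string, Override<V>>();
  private overridesByEntity = new Map<string, Set<string>>();

  addOverride(override: Override<V>): void {
    this.overrides.set(override.id, override);
    let ids = this.overridesByEntity.get(override.entity);
    if (!ids) {
      ids = new Set<string>();
      this.overridesByEntity.set(override.entity, ids);
    }
    ids.add(override.id);
  }

  removeOverride(id: string): void {
    const override = this.overrides.get(id);
    if (!override) return;
    this.overrides.delete(id);
    const ids = this.overridesByEntity.get(override.entity);
    if (!ids) return;
    ids.delete(id);
    if (ids.size === 0) this.overridesByEntity.delete(override.entity);
  }

  // Visits only this entity's overrides, not every override in the store.
  latestValue(entity: string): V | undefined {
    const ids = this.overridesByEntity.get(entity);
    if (!ids) return undefined;
    let last: Override<V> | undefined;
    // A Set iterates in insertion order, so the final id is the newest one added.
    for (const id of ids) last = this.overrides.get(id);
    return last?.value;
  }
}

const store = new OverrideStore<number>();
store.addOverride({ id: "o1", entity: "player", value: 1 });
store.addOverride({ id: "o2", entity: "player", value: 2 });
store.removeOverride("o2");
console.log(store.latestValue("player")); // 1
```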

defineQuery: dedupes component subscriptions (a fragment list with
the same component twice no longer doubles event volume) and pre-buckets
fragments by component id. runQuery: skips defensive [...entities]
copies via collected toDelete/toAdd, drops redundant Set-spread, and
memoises getChildEntities per call so shared proxy ancestors aren't
re-walked.

World.dispose: rewrite the cryptic filter as two explicit branches.

Phase 1+2 vs Phase 0 baseline (darwin / node 20.9.0, avg ms):
- removeOverride x 5000:                 260   -> 60     (-77%)
- hasComponent x 100k:                   0.37  -> 0.13   (-64%)
- setComponent skip-stream x 100k:       134   -> 63     (-53%)
- getChildEntities indexed d=4 b=10:     7.9   -> 3.7    (-53%)
- runQuery 4 Has on 100k entities:       156   -> 88     (-44%)
- defineQuery same-component dedupe:     7.3   -> 4.5    (-39%)
- defineQuery proxy 100 updates / 10k:   1416  -> 918    (-35%)

One regression: Indexer add+remove of unique values +36%, the cost of
the empty-bucket GC and JSON-keyed collision fix; trade-off documented
in benchmarks/README.md.

8 new unit tests cover the bug fixes (key collision, bucket cleanup,
removeOverride at scale, Map-proxy fix, null override, entities()
deduplication, per-key keys() filtering).

Phase 3 (localised defineQuery proxy re-evaluation) is deferred:
the naive 'use affected as initialSet' is incorrect under ProxyExpand
ripple semantics and needs a separate design pass.

createLocalCache: wrap the update$ subscription in throttleTime
(leading + trailing). The leading edge writes immediately so callers
that update once and dispose still persist; subsequent updates within
the 250ms window collapse to a single trailing write. Real-world: a
200-event burst now does 2 storage writes instead of 200.
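
The leading-plus-trailing behaviour can be sketched without RxJS. The helper below is a hand-rolled stand-in, not `throttleTime` itself, and `throttleLeadingTrailing` plus the fake timer queue are illustrative names. In real use the `schedule` argument would be something like `(fn, ms) => setTimeout(fn, ms)`.

```typescript
type Schedule = (fn: () => void, ms: number) => void;

// Returns a function that forwards the first value immediately (leading edge)
// and collapses later values inside the window into one trailing write.
function throttleLeadingTrailing<T>(
  write: (value: T) => void,
  windowMs: number,
  schedule: Schedule
): (value: T) => void {
  let windowOpen = false;
  let pending: { value: T } | undefined;

  const onWindowEnd = (): void => {
    if (pending) {
      const { value } = pending;
      pending = undefined;
      write(value); // trailing edge: only the latest value is written
      schedule(onWindowEnd, windowMs); // the trailing write starts a new window
    } else {
      windowOpen = false;
    }
  };

  return (value: T): void => {
    if (windowOpen) {
      pending = { value };
      return;
    }
    windowOpen = true;
    write(value); // leading edge
    schedule(onWindowEnd, windowMs);
  };
}

// Fake timer queue so the example runs deterministically.
const timers: Array<() => void> = [];
const writes: number[] = [];
const push = throttleLeadingTrailing<number>(
  (v) => writes.push(v),
  250,
  (fn) => timers.push(fn)
);

for (let i = 0; i < 200; i++) push(i); // 200-event burst inside one window
while (timers.length > 0) timers.shift()!(); // let the windows elapse
console.log(writes); // [0, 199]
```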

Bench B18 (200 updates / 1k-entity component): 116ms -> 0.32ms (-99.7%
vs Phase 0 baseline). The leading write happens synchronously inside
the bench so the localStorage assertion still passes; the trailing
write fires after the bench window.

Entity: document the Symbol.for memory caveat on getEntitySymbol. The
global symbol registry is never GC'd, so a long-running client that
churns through ephemeral entity ids retains them forever. Recommend
reusing stable ids or periodically restarting.

Optimises the hot path for "god-components" like Eternum's Resource
(216 BigInt fields) and ResourceArrival (48 number arrays) without
changing their schema.

- passesQueryFragment for HasValue/NotValue now uses partialValueEquals,
  which reads only the keys present in fragment.value directly from the
  per-key value Maps. A HasValue(Resource, { entity_id: X }) check used
  to do 216 Map.get calls to materialise the full ComponentValue plus a
  key-by-key compare; it now does 1 Map.get and 1 compare.
- componentValueEquals short-circuits on reference equality and, via a
  new valueEquals helper, deep-compares array values element-wise
  instead of by reference. Fresh array refs with identical contents
  (typical of network sync layers replaying snapshots) no longer read
  as "different", fixing the latent phantom-update bug for
  ResourceArrival-shaped components.
- Five Eternum-shape benchmarks added (B19-B23) modelling the 200-player
  Dojo client workload: Resource hydration, Resource live updates with
  a subscribed HasValue query, runQuery Has+HasValue on 200 Resource
  entities, ResourceArrival phantom updates, and mixed Building tile
  lookups.
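
The partial-key fast path can be pictured roughly as follows. `Columns`, `getFullValue`, and the generated field names are hypothetical, and the comparison uses plain `!==` where the real helper is described as delegating to valueEquals.

```typescript
type Entity = string;
// Hypothetical column store: one Map per schema key.
type Columns = Record<string, Map<Entity, unknown>>;

// Slow path: materialise the whole value (one Map.get per schema key).
function getFullValue(columns: Columns, entity: Entity): Record<string, unknown> {
  const value: Record<string, unknown> = {};
  for (const key of Object.keys(columns)) value[key] = columns[key]?.get(entity);
  return value;
}

// Fast path: only read the keys that the query fragment mentions.
function partialValueEquals(
  columns: Columns,
  entity: Entity,
  partial: Record<string, unknown>
): boolean {
  for (const key of Object.keys(partial)) {
    const column = columns[key];
    if (!column || column.get(entity) !== partial[key]) return false;
  }
  return true;
}

// Build a 216-column component: 215 filler fields plus entity_id.
const columns: Columns = {};
for (let i = 0; i < 215; i++) {
  columns[`field_${i}`] = new Map<Entity, unknown>([["e1", i]]);
}
columns["entity_id"] = new Map<Entity, unknown>([["e1", 42]]);

console.log(partialValueEquals(columns, "e1", { entity_id: 42 })); // true, one Map.get
console.log(partialValueEquals(columns, "e1", { entity_id: 7 })); // false
```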

Phase 5 deltas vs pre-Phase-5 baseline:
  B21 runQuery Has + HasValue(entity_id) on 200 Resource:  3.88ms -> 0.13ms (-97%)
  B23 200 indexed Building, 20 mixed HasValue lookups:     4.17ms -> 2.69ms (-35%)
  B20 50 Resource updates w/ subscribed HasValue query:   14.07ms -> 11.22ms (-20%)
  B22 ResourceArrival 100-entity re-set (fresh array refs): 4.19ms -> 3.56ms (-15%)
  B19 Hydrate 200 Resource with skipUpdateStream:          7.59ms -> 7.39ms  (-3%)

78/78 recs unit tests, 26/26 bench, 7/7 react still pass.

Phase 1 replaced the indexer's `Object.values(v).join('/')` key with
`JSON.stringify(v)` to fix value collisions, but JSON.stringify throws
`TypeError: Do not know how to serialize a BigInt` on BigInt values.
No existing test covered it, so any `defineComponent(..., { indexed: true })`
with a Type.BigInt or Type.BigIntArray field crashed on the first write.
That broke several Dojo/Eternum indexable components (Guild.guild_id,
GuildMember.member, Trade.*_amount, etc).

Fix: pass a replacer to JSON.stringify that converts BigInts to their
decimal string. Collision-safe within a single indexer's schema because
each field has a fixed Type declaration (no BigInt/String ambiguity per
field).
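
A minimal sketch of the replacer approach. `bigintReplacer` and `indexKey` are illustrative names, and `member_count` is a made-up field used only to show a mixed value.

```typescript
// Replacer that turns BigInt values into decimal strings so JSON.stringify does not throw.
function bigintReplacer(_key: string, value: unknown): unknown {
  return typeof value === "bigint" ? value.toString() : value;
}

function indexKey(value: unknown): string {
  return JSON.stringify(value, bigintReplacer);
}

const guild = { guild_id: 12345678901234567890n, member_count: 3 };

let threw = false;
try {
  JSON.stringify(guild); // BigInt is not JSON-serialisable by default
} catch {
  threw = true;
}
console.log(threw); // true
console.log(indexKey(guild)); // {"guild_id":"12345678901234567890","member_count":3}
```

The replacer is also invoked for array elements, so a BigIntArray-style field such as `[1n, 2n]` serialises as `["1","2"]`.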

Tests: two new Indexer.spec.ts cases for the Guild BigInt pattern and
for BigIntArray fields.

Bench: new B24 (500 BigInt-indexed Guild entities x 1k HasValue lookups)
captures a baseline of 436 ms / 2.29 ops/sec for the typical
"find guild by id" pattern.

Schema widening (ported from @dojoengine/recs fork):
  Schema is now { [k]: Type | Schema } so sub-struct fields like Eternum's
  WorldConfig.season_addresses_config type-check without casts.
  ComponentValue and Component.values recursively handle nested schemas.
  getComponentValue narrows on fieldSchema typeof === 'object' so nested
  sub-schema fields are never treated as optional types.
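
A simplified sketch of the widened Schema type. It assumes a string-literal `Type` in place of the real recs Type enum, and the field names inside `season_addresses_config` are hypothetical.

```typescript
// Simplified stand-in for the recs Type enum (the real library has more members).
type Type = "number" | "string" | "bigint";

// A field is either a leaf Type or a nested sub-schema.
type Schema = { [key: string]: Type | Schema };

type LeafValue<T extends Type> = T extends "number"
  ? number
  : T extends "string"
  ? string
  : bigint;

// Recursively maps a schema to the shape of its values.
type ComponentValue<S extends Schema> = {
  [K in keyof S]: S[K] extends Schema
    ? ComponentValue<S[K]>
    : S[K] extends Type
    ? LeafValue<S[K]>
    : never;
};

// Identity helper so literal field types are inferred without casts.
function defineSchema<S extends Schema>(schema: S): S {
  return schema;
}

// Runtime narrowing: a nested sub-schema is an object, a leaf Type is not.
function isNestedSchema(field: Type | Schema): field is Schema {
  return typeof field === "object";
}

const WorldConfig = defineSchema({
  admin_address: "bigint",
  season_addresses_config: { season_pass_address: "bigint", end_at: "number" },
});

// Type-checks without casts because the nested field maps to a nested value type.
const value: ComponentValue<typeof WorldConfig> = {
  admin_address: 1n,
  season_addresses_config: { season_pass_address: 2n, end_at: 1700000000 },
};

console.log(isNestedSchema(WorldConfig.season_addresses_config)); // true
console.log(isNestedSchema(WorldConfig.admin_address)); // false
```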

Deep-compare valueEquals:
  Extended the equality helper used by partialValueEquals to recurse into
  plain objects (length + key-set + per-value recursive compare). Array
  compare now also recurses. Class instances, Map, Set, Date stay on strict
  reference equality. Fixes silent HasValue failures on nested sub-struct
  fields (e.g. HasValue(WorldConfig, { season_addresses_config: {...} })),
  a bug the fork also has today.
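
The rules above can be sketched as a small recursive helper. This is an approximation written from the description, not the shipped valueEquals, and `isPlainObject` is an assumed helper.

```typescript
// Plain objects are those created by literals or Object.create(null).
function isPlainObject(v: unknown): v is Record<string, unknown> {
  if (typeof v !== "object" || v === null) return false;
  const proto = Object.getPrototypeOf(v);
  return proto === Object.prototype || proto === null;
}

function valueEquals(a: unknown, b: unknown): boolean {
  if (a === b) return true; // fast path: same reference or same primitive
  if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) return false;
    return a.every((item, i) => valueEquals(item, b[i]));
  }
  if (isPlainObject(a) && isPlainObject(b)) {
    const aKeys = Object.keys(a);
    if (aKeys.length !== Object.keys(b).length) return false;
    return aKeys.every(
      (k) => Object.prototype.hasOwnProperty.call(b, k) && valueEquals(a[k], b[k])
    );
  }
  return false; // class instances, Map, Set, Date: reference equality only
}

// Fresh object and array refs with identical contents now compare equal.
console.log(valueEquals({ cfg: { id: 1n, list: [1, 2] } }, { cfg: { id: 1n, list: [1, 2] } })); // true
// Date instances are compared by reference, so two distinct Dates differ.
console.log(valueEquals(new Date(0), new Date(0))); // false
```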

Diagnostics API:
  getIndexerStats(component) returns { bucketCount } via a WeakMap
  registered at createIndexer time (no public-type pollution).
  getDiagnostics(world) returns entityCount, componentCount, and per-
  component entitiesWithValue + indexerBuckets. Intended for long-
  running-session health metrics: watch entityCount grow linearly with
  session duration to detect ephemeral-entity leaks.
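
One way to act on those samples is a simple monotonic-growth check. The sketch below only assumes the `entityCount` field from the description above; `looksLikeLeak` and its threshold are illustrative, and in a real client the samples would come from calling getDiagnostics(world) on a timer.

```typescript
// Only the field used here; the full result has more properties.
type DiagnosticsSample = { entityCount: number };

// Flags a possible leak when entityCount rose on every consecutive sample
// and the total rise is at least `minGrowth`.
function looksLikeLeak(samples: DiagnosticsSample[], minGrowth: number): boolean {
  if (samples.length < 3) return false; // too few points to call it a trend
  for (let i = 1; i < samples.length; i++) {
    if (samples[i].entityCount <= samples[i - 1].entityCount) return false;
  }
  return samples[samples.length - 1].entityCount - samples[0].entityCount >= minGrowth;
}

const steady = [{ entityCount: 1000 }, { entityCount: 980 }, { entityCount: 1010 }];
const growing = [{ entityCount: 1000 }, { entityCount: 1400 }, { entityCount: 1800 }];
console.log(looksLikeLeak(steady, 500)); // false
console.log(looksLikeLeak(growing, 500)); // true
```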

Soak benchmark:
  B25: 10 cycles of create/set/query/delete over 1000 entities across
  four components (incl. BigInt-indexed Guild). Asserts last-cycle time
  stays within 3x first cycle, entityCount returns to 0, every component
  entitiesWithValue returns to 0, every indexer bucketCount returns to 0.
  CI regression guard for any future change that introduces a linear-
  growth leak.

Nested HasValue benchmark:
  B26: HasValue(WorldConfig, { season_addresses_config: {...} }) x 10k
  with a fresh object literal every call; measures the deep-compare path
  on a real nested-schema shape.

82/82 recs unit + 29/29 bench + 7/7 react still green.

scripts/publish-dojoengine.mjs stages the built package under the
@dojoengine/recs name without touching the in-tree manifest: runs the
tsup build, copies dist/ + README.md + CHANGELOG.md into a temp dir,
rewrites the manifest (name: @dojoengine/recs, version: 2.1.0,
workspace:* deps resolved to ^2.2.23, repository/homepage/bugs pointing
at github.com/dojoengine/mud), and runs `npm publish --access public`.

Defaults to --dry-run; pass --yolo to actually publish. Supports --otp
for interactive 2FA codes and NPM_TOKEN env var for granular-token auth
(written to a staging-local .npmrc, never touches ~/.npmrc).

CHANGELOG.md prepends a 2.1.0 entry describing the full delta vs
@dojoengine/recs@2.0.13: nested schema support, HasValue partial-key
fast path, deep-compare valueEquals, indexer BigInt fix + empty-bucket
GC, removeOverride O(K_entity), createLocalCache throttle, diagnostics
API, Map-proxy bug fix, and the 29-benchmark harness.