Add Python 3.13+ free-threading support to dict operations #369

Merged
etrepum merged 3 commits into master from claude/cleanup-refactoring-NOK7o
Apr 12, 2026

etrepum commented Apr 11, 2026

Summary

This PR adds cross-version helper functions to safely handle dictionary operations in the JSON encoder and scanner, with special support for Python 3.13+ free-threading mode. The changes ensure atomic dictionary lookups and insertions that avoid borrowed reference races under free threading, while maintaining backward compatibility with older Python versions.

Key Changes

  • New helper functions in _speedups.c:

    • json_PyDict_GetItemRef(): Atomically fetches a strong reference to a dictionary item. On Python 3.13+, uses the new PyDict_GetItemRef() API; on older versions, uses PyDict_GetItemWithError() with explicit Py_INCREF().
    • json_memo_intern_key(): Atomically interns a key into a memo dictionary using PyDict_SetDefaultRef() on Python 3.13+ or PyDict_SetDefault() with explicit reference management on older versions.
  • Updated encoder logic in encoder_listencode_dict():

    • Replaced manual borrowed-ref handling with json_PyDict_GetItemRef() for cache lookups
    • Simplified error handling by consolidating the borrowed-ref + Py_INCREF() pattern into a single atomic call
    • Improved reference counting clarity with explicit Py_CLEAR() calls at appropriate points
  • Simplified scanner logic in _parse_object():

    • Replaced separate PyDict_GetItemWithError() and PyDict_SetItem() calls with the new json_memo_intern_key() helper
    • Reduced code complexity while maintaining the same key interning behavior

Implementation Details

The changes leverage Python 3.13's new atomic dictionary APIs (PyDict_GetItemRef() and PyDict_SetDefaultRef()) which return strong references directly, eliminating the borrowed reference race conditions that can occur under free threading. For older Python versions, the helpers fall back to the traditional borrowed-ref APIs with explicit reference management, ensuring no behavioral changes.

This approach maintains full backward compatibility while providing thread-safe dictionary operations in Python 3.13+.

https://claude.ai/code/session_011EfS4WKeHCX3xPsmHuvCnz

claude added 3 commits April 11, 2026 19:42
…emRef for key_memo

Two targeted cleanups to the scanner/encoder hot paths, motivated by
reviewing the code that landed in #367 and comparing against CPython's
_json.c and PR #344. No behavior change; just fewer dict lookups and
cleaner use of the modern strong-reference dict APIs available on
Python 3.13+.

Scanner (_parse_object memo intern):
  The GetItemWithError -> Py_INCREF / PyDict_SetItem dance did two
  hashtable probes for every fresh key. Collapse it to a single
  PyDict_SetDefault (or PyDict_SetDefaultRef on 3.13+), which atomically
  gets-or-sets in one pass. Factored into json_memo_intern_key so the
  _unicode and _str template instantiations share one implementation,
  and the 3.13+ fast path is isolated in one place. The `memokey`
  temporary is gone, the loop body drops from 13 lines to 6, and
  unique-key JSON decoding touches the memo dict half as often.
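
The get-or-set idea can be illustrated without the CPython API. The sketch below is a toy open-addressing table, not the real dict and not the code in _speedups.c; `DemoMemo`, `demo_hash`, `demo_memo_intern`, and `MEMO_SLOTS` are hypothetical names chosen for illustration. It shows how one probe sequence can serve both the lookup and the insert, which is the property the commit attributes to PyDict_SetDefault.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MEMO_SLOTS 64  /* hypothetical fixed capacity; must be a power of two */

/* Toy memo: each slot holds a pointer to a caller-owned key string. */
typedef struct {
    const char *slots[MEMO_SLOTS];
    size_t used;
} DemoMemo;

static size_t demo_hash(const char *s)
{
    size_t h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Get-or-set in a single probe sequence: return the canonical pointer for
 * key, storing key itself when no equal string is present yet.
 * Returns NULL only if the table is full. */
static const char *demo_memo_intern(DemoMemo *memo, const char *key)
{
    size_t i, idx;

    assert(memo != NULL && key != NULL);
    idx = demo_hash(key) & (MEMO_SLOTS - 1);
    for (i = 0; i < MEMO_SLOTS; i++) {
        const char *slot = memo->slots[idx];
        if (slot == NULL) {
            /* Miss: insert at the slot the lookup already reached. */
            memo->slots[idx] = key;
            memo->used++;
            return key;
        }
        if (strcmp(slot, key) == 0) {
            /* Hit: reuse the key that was stored earlier. */
            return slot;
        }
        idx = (idx + 1) & (MEMO_SLOTS - 1);
    }
    return NULL;
}
```

A second buffer with equal contents comes back as the first buffer's pointer. A lookup-then-insert pair would walk the same probe sequence twice for every fresh key.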

Encoder (key_memo cache lookup in encoder_listencode_dict):
  Replace the GetItemWithError + manual Py_INCREF + PyErr_Occurred check
  with a call to a new json_PyDict_GetItemRef helper. On 3.13+ this
  forwards to PyDict_GetItemRef, which atomically returns a strong
  reference and eliminates the borrowed-reference window that is
  technically racy under free threading even under the coarse self
  critical section. On older Pythons the helper falls back to the
  legacy idiom. The caller becomes a single rc-based branch, and the
  Py_CLEAR(kstr) is no longer duplicated across three arms.

Both changes compile cleanly under -Wall -Wextra -Wshadow
-Wstrict-prototypes -Wdeclaration-after-statement -Werror on CPython
3.11, and under the default CFLAGS on CPython 3.14.0rc2 free-threaded.
Full _cibw_runner suite (354 tests, C + pure-Python paths) passes on
both. 16-thread x 5000-iter stress test on a shared JSONDecoder /
JSONEncoder passes with the GIL disabled.

Explicitly not changed:
- Py_BEGIN_CRITICAL_SECTION(self) in scanner_call and encoder_call.
  The scanner needs it because PyDict_Clear(s->memo) at end-of-call
  would race with concurrent scan_once calls if we switched to a
  per-dict lock; the encoder uses it defensively but c_make_encoder
  is called fresh per JSONEncoder.iterencode() call in the normal
  API flow, so the lock is uncontended in practice. Fine-grained
  container locks (CPython-style, see PR #344 discussion) would only
  help the unusual case of an explicitly shared encoder across
  threads, and the win does not justify the refactor.

https://claude.ai/code/session_011EfS4WKeHCX3xPsmHuvCnz
…ct, SKIP_WHITESPACE, field X-macros, n format

Five mechanical cleanups to the C extension, none of which change
behavior. Together they remove ~190 lines of duplication and close
several classes of recurring bug.

encoder_markers_push / encoder_markers_pop (previously duplicated in 3
places):
  The circular-reference marker pattern — PyLong_FromVoidPtr(obj),
  PyDict_Contains, PyDict_SetItem on push, and PyDict_DelItem +
  Py_DECREF on pop — appeared verbatim in encoder_listencode_obj,
  encoder_listencode_dict, and encoder_listencode_list. Three recent
  bug fixes (#358, #360, aa9182d) patched individual sites; factoring
  into two helpers collapses ~60 lines, and any future fix lives in
  one place. The NULL-sentinel convention on ident lets callers invoke
  markers_pop unconditionally on the happy path.

encoder_listencode_default extraction:
  The inner else { ... } of encoder_listencode_obj (RawJSON + iterable
  fallback + markers-tracked defaultfn recursion) lived inline with
  nested `break` into an outer do { } while(0) and a stray indentation
  level from an unbraced scope. Extract it verbatim into its own
  function that returns 0/-1 directly, so the main dispatch loop is a
  clean chain of else-if arms with no `break` inside the final arm.

SKIP_WHITESPACE() macro in _speedups_scan.h:
  `while (idx <= end_idx && IS_WHITESPACE(JSON_SCAN_READ(idx))) idx++;`
  appeared 8 times across _parse_object and _parse_array. Collapse to
  a macro defined alongside JSON_SCAN_FN / JSON_SCAN_CONCAT, #undef'd
  at the bottom of the template so the multi-include pattern stays
  hygienic.
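
A minimal standalone sketch of the macro pattern follows. The loop body is the one quoted above; the definitions of `IS_WHITESPACE` and `JSON_SCAN_READ`, the `do { } while (0)` wrapper, and the `first_token` function are assumptions made for this example and may differ from _speedups_scan.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the template's per-instantiation macros. */
#define IS_WHITESPACE(c) \
    ((c) == ' ' || (c) == '\t' || (c) == '\n' || (c) == '\r')
#define JSON_SCAN_READ(i) ((unsigned char)buf[(i)])

/* Advance idx past whitespace. Expects buf, idx and end_idx in scope. */
#define SKIP_WHITESPACE() \
    do { \
        while (idx <= end_idx && IS_WHITESPACE(JSON_SCAN_READ(idx))) idx++; \
    } while (0)

/* Return the index of the first non-whitespace character at or after
 * start, or end_idx + 1 if only whitespace remains. */
static ptrdiff_t first_token(const char *buf, ptrdiff_t start, ptrdiff_t end_idx)
{
    ptrdiff_t idx = start;

    assert(buf != NULL);
    SKIP_WHITESPACE();
    return idx;
}

/* Undefine at the bottom so a second include can redefine the macros. */
#undef SKIP_WHITESPACE
#undef JSON_SCAN_READ
#undef IS_WHITESPACE
```

Because the macros are undefined at the end, the same header can be included again with a different `JSON_SCAN_READ` without redefinition warnings.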

PyArg_ParseTuple "n" format code replaces _convertPyInt_AsSsize_t /
_convertPyInt_FromSsize_t:
  The custom O& converter predates broad "n" (Py_ssize_t) support in
  PyArg_ParseTuple and Py_BuildValue. Both have supported "n" since
  Python 2.5 — the simplejson floor — so we can drop the two wrappers
  and use "n" directly in py_scanstring, scanner_call, encoder_call,
  and raise_errmsg. Saves an indirect function call per parse on three
  hot entry points.

JSON_SCANNER_OBJECT_FIELDS / JSON_ENCODER_OBJECT_FIELDS X-macros:
  scanner_traverse + scanner_clear and encoder_traverse +
  encoder_clear all listed the same fields 2x — an easy place to
  forget a field when adding one (exactly the bug fixed in c23e6d9).
  Collapse to an X-macro field list adjacent to each struct
  definition, used with JSON_VISIT_FIELD / JSON_CLEAR_FIELD local
  expansions. Adding a new PyObject* field now needs one line in the
  X-macro, not one line in each of that struct's traverse and clear
  functions.
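
The X-macro technique can be sketched in plain C. Everything below is a hypothetical model: `DemoScanner`, its three fields, and the `DEMO_*` macros stand in for the real structs and for JSON_SCANNER_OBJECT_FIELDS / JSON_VISIT_FIELD / JSON_CLEAR_FIELD, and plain `void *` assignments replace Py_VISIT and Py_CLEAR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct with three owned pointer fields. */
typedef struct {
    void *encoding;
    void *strict_bool;
    void *memo;
} DemoScanner;

/* Single source of truth for the pointer fields of DemoScanner. */
#define DEMO_SCANNER_OBJECT_FIELDS(X) \
    X(encoding) \
    X(strict_bool) \
    X(memo)

/* Call visit on every non-NULL field; stop at the first nonzero result. */
static int demo_traverse(DemoScanner *s, int (*visit)(void *, void *), void *arg)
{
#define DEMO_VISIT_FIELD(f) \
    if (s->f != NULL) { int rc = visit(s->f, arg); if (rc != 0) return rc; }
    DEMO_SCANNER_OBJECT_FIELDS(DEMO_VISIT_FIELD)
#undef DEMO_VISIT_FIELD
    return 0;
}

/* Reset every listed field. */
static void demo_clear(DemoScanner *s)
{
    assert(s != NULL);
#define DEMO_CLEAR_FIELD(f) s->f = NULL;
    DEMO_SCANNER_OBJECT_FIELDS(DEMO_CLEAR_FIELD)
#undef DEMO_CLEAR_FIELD
}

/* Example visitor: counts how many fields were visited. */
static int count_visit(void *obj, void *arg)
{
    (void)obj;
    ++*(int *)arg;
    return 0;
}
```

Adding a fourth field to the list makes both `demo_traverse` and `demo_clear` pick it up, so the two functions cannot drift apart.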

Verification:
- Strict CFLAGS build on CPython 3.11: -Wall -Wextra -Wshadow
  -Wstrict-prototypes -Wdeclaration-after-statement -Werror, clean
- Default CFLAGS build on CPython 3.14.0rc2 free-threaded: clean
- Full _cibw_runner suite on both (354 tests, C + pure-Python paths):
  354/354 pass
- Targeted correctness tests on 3.14t: for_json / _asdict / default /
  iterable_as_array / RawJSON / circular detection on all three
  encoder sites (dict, list, default)
- 16-thread x 5000-iter stress on a shared JSONDecoder and
  JSONEncoder with the GIL disabled: no mismatches, no races

https://claude.ai/code/session_011EfS4WKeHCX3xPsmHuvCnz
…T_OBJECT_EX

Two independent improvements bundled together because they touch
adjacent code.

#5 — encoder_dict_iteritems fast path for all-string keys:

  Sorted dict encoding with sort_keys=True (or a custom item_sort_key)
  used to go through a double-iteration loop: PyDict_Items produced
  a list, then the code walked it with PyIter_Next, type-checked each
  key, and PyList_Append'd a rebuilt list to sort. For the
  overwhelmingly common case of string-keyed JSON objects this was
  all wasted work — every tuple was kept verbatim and the "slow"
  rebuild list was just a duplicate of the items list.

  Add a fast path: if every key in the items list is already a JSON-
  compatible string (PyUnicode on all versions, plus PyString on
  Python 2), sort `items` in place via the shared
  encoder_sort_items_inplace helper and return iter(items). No
  per-item tuple reallocation, no list alloc, no stringify branch
  in the hot loop.

  On any non-string key the pre-scan bails out and falls through to
  the existing stringify-and-rebuild path, so the slow path is
  preserved exactly as before. Factored the list.sort() call into
  encoder_sort_items_inplace and the "is this a JSON string key"
  test into is_json_string_key so the two paths share one source of
  truth.
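
The shape of the fast path (pre-scan, then sort in place, else bail out untouched) can be modelled in plain C. This is an illustrative sketch only: `DemoItem`, `KeyKind`, `compare_by_key`, and `sort_items_fast_path` are hypothetical, `qsort` stands in for list.sort(), and the `is_json_string_key` here is a stand-in that only borrows the name of the real helper.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { KEY_STRING, KEY_INT } KeyKind;

/* Hypothetical (key, value) item; the value is omitted for brevity. */
typedef struct {
    KeyKind kind;
    const char *str;  /* valid when kind == KEY_STRING */
    long num;         /* valid when kind == KEY_INT */
} DemoItem;

static int is_json_string_key(const DemoItem *item)
{
    return item->kind == KEY_STRING;
}

static int compare_by_key(const void *a, const void *b)
{
    return strcmp(((const DemoItem *)a)->str, ((const DemoItem *)b)->str);
}

/* Return 1 if every key is a string and items was sorted in place.
 * Return 0 on the first non-string key, leaving items untouched so the
 * caller can fall through to a slower stringify-and-rebuild path. */
static int sort_items_fast_path(DemoItem *items, size_t n)
{
    size_t i;

    assert(items != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!is_json_string_key(&items[i]))
            return 0;
    }
    if (n > 1)
        qsort(items, n, sizeof(items[0]), compare_by_key);
    return 1;
}
```

The pre-scan costs one extra pass over the keys, but when it succeeds nothing is copied or rebuilt before the sort.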

  Measured on CPython 3.14t free-threaded, 200-entry string-keyed
  dict with 3-element list values: sort_keys=True is now 0.204 ms/op
  vs 0.197 ms/op for the unsorted path — ~4% overhead, essentially
  just the cost of sorting itself. Previously the double-walk and
  list rebuild added substantial constant-factor overhead on top.

#6 — T_OBJECT -> Py_T_OBJECT_EX on all member descriptors:

  T_OBJECT is deprecated in Python 3.12+ in favor of the new public
  spelling Py_T_OBJECT_EX. The semantic difference is that
  T_OBJECT returns Py_None when the underlying slot is NULL, while
  Py_T_OBJECT_EX raises AttributeError. Keep the Python-visible
  behavior unchanged by:

  1. Defining Py_T_OBJECT_EX to T_OBJECT_EX on pre-3.12 (both
     available via <structmember.h>, identical semantics), so the
     modern spelling compiles on the full 2.5+ version range
     simplejson supports.
  2. Switching encoder_new to store Py_None rather than NULL when
     encoding=None on Python 3, so the .encoding attribute still
     returns None (as it did under T_OBJECT) rather than raising
     AttributeError under Py_T_OBJECT_EX.
  3. Updating the two bytes-handling sentinel checks
     (encoder_stringify_key and encoder_listencode_obj) from
     `s->encoding != NULL` to `s->encoding != Py_None` so the
     internal "is encoding configured" test matches the new
     representation.

  All 20 members across scanner_members and encoder_members updated
  in one pass.

Verification:

- Strict CFLAGS on CPython 3.11: -Wall -Wextra -Wshadow
  -Wstrict-prototypes -Wdeclaration-after-statement -Werror, clean
- Default CFLAGS on CPython 3.14.0rc2 free-threaded: clean
- Full _cibw_runner suite (354 tests, C + pure-Python) on both: OK
- Targeted tests for encoder_dict_iteritems paths: regular dict /
  OrderedDict / dict subclass / empty dict / sort_keys=True with all
  string keys (fast path) / mixed string-and-int keys (slow path) /
  int keys / float keys / skipkeys+non-string / custom item_sort_key
  / unicode keys — all pass
- encoding=None on Py3 round-trip + bytes-key rejection with
  encoding=None: behavior preserved
- 16-thread x 5000-iter stress on shared JSONEncoder with
  sort_keys=True under free threading: no mismatches

https://claude.ai/code/session_011EfS4WKeHCX3xPsmHuvCnz
etrepum added this pull request to the merge queue Apr 12, 2026
Merged via the queue into master with commit e2e5f0b Apr 12, 2026
36 checks passed
etrepum deleted the claude/cleanup-refactoring-NOK7o branch April 12, 2026 05:23