Add Python 3.13+ free-threading support to dict operations#369
Merged
…emRef for key_memo

Two targeted cleanups to the scanner/encoder hot paths, motivated by reviewing the code that landed in #367 and comparing against CPython's _json.c and PR #344. No behavior change; just fewer dict lookups and cleaner use of the modern strong-reference dict APIs available on Python 3.13+.

Scanner (_parse_object memo intern):
The GetItemWithError -> Py_INCREF / PyDict_SetItem dance did two hashtable probes for every fresh key. Collapse it to a single PyDict_SetDefault (or PyDict_SetDefaultRef on 3.13+), which atomically gets-or-sets in one pass. Factored into json_memo_intern_key so the _unicode and _str template instantiations share one implementation, and the 3.13+ fast path is isolated in one place. The `memokey` temporary is gone, the loop body drops from 13 lines to 6, and unique-key JSON decoding touches the memo dict half as often.

Encoder (key_memo cache lookup in encoder_listencode_dict):
Replace the GetItemWithError + manual Py_INCREF + PyErr_Occurred check with a call to a new json_PyDict_GetItemRef helper. On 3.13+ this forwards to PyDict_GetItemRef, which atomically returns a strong reference and eliminates the borrowed-reference window that is technically racy under free threading even under the coarse self critical section. On older Pythons the helper falls back to the legacy idiom. The caller becomes a single rc-based branch, and the Py_CLEAR(kstr) is no longer duplicated across three arms.

Both changes compile cleanly under -Wall -Wextra -Wshadow -Wstrict-prototypes -Wdeclaration-after-statement -Werror on CPython 3.11, and under the default CFLAGS on CPython 3.14.0rc2 free-threaded. The full _cibw_runner suite (354 tests, C + pure-Python paths) passes on both. A 16-thread x 5000-iter stress test on a shared JSONDecoder / JSONEncoder passes with the GIL disabled.

Explicitly not changed:
- Py_BEGIN_CRITICAL_SECTION(self) in scanner_call and encoder_call. The scanner needs it because PyDict_Clear(s->memo) at end-of-call would race with concurrent scan_once calls if we switched to a per-dict lock; the encoder uses it defensively, but c_make_encoder is called fresh per JSONEncoder.iterencode() call in the normal API flow, so the lock is uncontended in practice. Fine-grained container locks (CPython-style, see PR #344 discussion) would only help the unusual case of an explicitly shared encoder across threads, and the win does not justify the refactor.

https://claude.ai/code/session_011EfS4WKeHCX3xPsmHuvCnz
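The "two probes versus one" point in the scanner change can be illustrated without the CPython headers. The sketch below is a toy model only: `toy_memo`, `toy_probe`, `intern_two_step`, and `intern_setdefault` are hypothetical names, the table is a fixed-size open-addressing array rather than CPython's dict, and it assumes the table never fills. It only counts how many probe sequences each interning style performs for a fresh key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the scanner's memo dict (hypothetical, NOT CPython's dict).
 * Assumes fewer than MEMO_SLOTS distinct keys, so a probe always terminates. */
#define MEMO_SLOTS 64

typedef struct {
    const char *slots[MEMO_SLOTS]; /* NULL means empty */
    size_t probes;                 /* probe sequences performed so far */
} toy_memo;

static size_t toy_hash(const char *s)
{
    size_t h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* One probe sequence: index of the slot holding `key`, or of the empty
 * slot where `key` would be stored. */
static size_t toy_probe(toy_memo *m, const char *key)
{
    size_t i = toy_hash(key) % MEMO_SLOTS;
    m->probes++;
    while (m->slots[i] != NULL && strcmp(m->slots[i], key) != 0)
        i = (i + 1) % MEMO_SLOTS;
    return i;
}

/* Old shape: a lookup, then a separate insert that re-finds the slot. */
static const char *intern_two_step(toy_memo *m, const char *key)
{
    size_t i = toy_probe(m, key);      /* probe 1: lookup */
    if (m->slots[i] != NULL)
        return m->slots[i];
    i = toy_probe(m, key);             /* probe 2: insert */
    m->slots[i] = key;
    return key;
}

/* New shape: get-or-set in a single pass, like a setdefault call. */
static const char *intern_setdefault(toy_memo *m, const char *key)
{
    size_t i = toy_probe(m, key);      /* the only probe */
    if (m->slots[i] == NULL)
        m->slots[i] = key;
    return m->slots[i];
}
```

Interning three distinct keys costs six probe sequences with `intern_two_step` and three with `intern_setdefault`, which matches the "half as often" claim for unique-key input. A repeated key costs one probe sequence in both versions.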
…ct, SKIP_WHITESPACE, field X-macros, n format

Five mechanical cleanups to the C extension, none of which change behavior. Together they remove ~190 lines of duplication and close several classes of recurring bug.

encoder_markers_push / encoder_markers_pop (previously duplicated in 3 places):
The circular-reference marker pattern (PyLong_FromVoidPtr(obj), PyDict_Contains, PyDict_SetItem on push, and PyDict_DelItem + Py_DECREF on pop) appeared verbatim in encoder_listencode_obj, encoder_listencode_dict, and encoder_listencode_list. Three recent bug fixes (#358, #360, aa9182d) patched individual sites; factoring into two helpers collapses ~60 lines, and any future fix lives in one place. The NULL-sentinel convention on ident lets callers invoke markers_pop unconditionally on the happy path.

encoder_listencode_default extraction:
The inner else { ... } of encoder_listencode_obj (RawJSON + iterable fallback + markers-tracked defaultfn recursion) lived inline with nested `break` into an outer do { } while(0) and a stray indentation level from an unbraced scope. Extract it verbatim into its own function that returns 0/-1 directly, so the main dispatch loop is a clean chain of else-if arms with no `break` inside the final arm.

SKIP_WHITESPACE() macro in _speedups_scan.h:
`while (idx <= end_idx && IS_WHITESPACE(JSON_SCAN_READ(idx))) idx++;` appeared 8 times across _parse_object and _parse_array. Collapse to a macro defined alongside JSON_SCAN_FN / JSON_SCAN_CONCAT, #undef'd at the bottom of the template so the multi-include pattern stays hygienic.

PyArg_ParseTuple "n" format code replaces _convertPyInt_AsSsize_t / _convertPyInt_FromSsize_t:
The custom O& converter predates broad "n" (Py_ssize_t) support in PyArg_ParseTuple and Py_BuildValue. Both have supported "n" since Python 2.5 (the simplejson floor), so we can drop the two wrappers and use "n" directly in py_scanstring, scanner_call, encoder_call, and raise_errmsg. Saves an indirect function call per parse on three hot entry points.

JSON_SCANNER_OBJECT_FIELDS / JSON_ENCODER_OBJECT_FIELDS X-macros:
scanner_traverse + scanner_clear and encoder_traverse + encoder_clear all listed the same fields 2x, an easy place to forget a field when adding one (exactly the bug fixed in c23e6d9). Collapse to an X-macro field list adjacent to each struct definition, used with JSON_VISIT_FIELD / JSON_CLEAR_FIELD local expansions. Adding a new PyObject* field now needs one line in the X-macro, not two in each of four different functions.

Verification:
- Strict CFLAGS build on CPython 3.11: -Wall -Wextra -Wshadow -Wstrict-prototypes -Wdeclaration-after-statement -Werror, clean
- Default CFLAGS build on CPython 3.14.0rc2 free-threaded: clean
- Full _cibw_runner suite on both (354 tests, C + pure-Python paths): 354/354 pass
- Targeted correctness tests on 3.14t: for_json / _asdict / default / iterable_as_array / RawJSON / circular detection on all three encoder sites (dict, list, default)
- 16-thread x 5000-iter stress on a shared JSONDecoder and JSONEncoder with the GIL disabled: no mismatches, no races

https://claude.ai/code/session_011EfS4WKeHCX3xPsmHuvCnz
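For readers unfamiliar with the X-macro pattern named above, here is a minimal plain-C sketch of the idea. Everything in it is a hypothetical stand-in: `toy_obj` replaces PyObject with a bare reference count, the field names are illustrative, and `TOY_VISIT_FIELD` / `TOY_CLEAR_FIELD` only approximate what local expansions such as JSON_VISIT_FIELD / JSON_CLEAR_FIELD might look like. It is not the code from this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyObject: just a reference count. */
typedef struct { int refcnt; } toy_obj;

typedef struct {
    toy_obj *encoding;
    toy_obj *object_hook;
    toy_obj *memo;
} toy_scanner;

/* Single source of truth for the object fields, kept next to the struct.
 * A new field is added here once; traverse and clear pick it up. */
#define TOY_SCANNER_OBJECT_FIELDS(X) \
    X(encoding)                      \
    X(object_hook)                   \
    X(memo)

typedef int (*toy_visitproc)(toy_obj *obj, void *arg);

/* Visit every non-NULL field; stop at the first non-zero visitor result. */
static int toy_scanner_traverse(toy_scanner *s, toy_visitproc visit, void *arg)
{
#define TOY_VISIT_FIELD(name)             \
    if (s->name != NULL) {                \
        int rc = visit(s->name, arg);     \
        if (rc != 0)                      \
            return rc;                    \
    }
    TOY_SCANNER_OBJECT_FIELDS(TOY_VISIT_FIELD)
#undef TOY_VISIT_FIELD
    return 0;
}

/* Drop every field: NULL the slot first, then release the reference. */
static void toy_scanner_clear(toy_scanner *s)
{
#define TOY_CLEAR_FIELD(name)             \
    if (s->name != NULL) {                \
        toy_obj *tmp = s->name;           \
        s->name = NULL;                   \
        tmp->refcnt--;                    \
    }
    TOY_SCANNER_OBJECT_FIELDS(TOY_CLEAR_FIELD)
#undef TOY_CLEAR_FIELD
}

/* Example visitor: counts how many fields were visited. */
static int toy_count_visit(toy_obj *obj, void *arg)
{
    (void)obj;
    ++*(int *)arg;
    return 0;
}
```

Each expansion macro is defined and #undef'd inside the function that uses it, so the list macro is the only name that outlives the function body.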
…T_OBJECT_EX

Two independent improvements bundled together because they touch adjacent code.

#5 - encoder_dict_iteritems fast path for all-string keys:
Sorted dict encoding with sort_keys=True (or a custom item_sort_key) used to go through a double-iteration loop: PyDict_Items produced a list, then the code walked it with PyIter_Next, type-checked each key, and PyList_Append'd a rebuilt list to sort. For the overwhelmingly common case of string-keyed JSON objects this was all wasted work: every tuple was kept verbatim and the "slow" rebuild list was just a duplicate of the items list.

Add a fast path: if every key in the items list is already a JSON-compatible string (PyUnicode on all versions, plus PyString on Python 2), sort `items` in place via the shared encoder_sort_items_inplace helper and return iter(items). No per-item tuple reallocation, no list alloc, no stringify branch in the hot loop. On any non-string key the pre-scan bails out and falls through to the existing stringify-and-rebuild path, so the slow path is preserved exactly as before.

Factored the list.sort() call into encoder_sort_items_inplace and the "is this a JSON string key" test into is_json_string_key so the two paths share one source of truth.

Measured on CPython 3.14t free-threaded, 200-entry string-keyed dict with 3-element list values: sort_keys=True is now 0.204 ms/op vs 0.197 ms/op for the unsorted path, about 4% overhead, essentially just the cost of sorting itself. Previously the double-walk and list rebuild added substantial constant-factor overhead on top.

#6 - T_OBJECT -> Py_T_OBJECT_EX on all member descriptors:
T_OBJECT is deprecated in Python 3.12+ in favor of the new public spelling Py_T_OBJECT_EX. The semantic difference is that T_OBJECT returns Py_None when the underlying slot is NULL, while Py_T_OBJECT_EX raises AttributeError. Keep the Python-visible behavior unchanged by:
1. Defining Py_T_OBJECT_EX to T_OBJECT_EX on pre-3.12 (both available via <structmember.h>, identical semantics), so the modern spelling compiles on the full 2.5+ version range simplejson supports.
2. Switching encoder_new to store Py_None rather than NULL when encoding=None on Python 3, so the .encoding attribute still returns None (as it did under T_OBJECT) rather than raising AttributeError under Py_T_OBJECT_EX.
3. Updating the two bytes-handling sentinel checks (encoder_stringify_key and encoder_listencode_obj) from `s->encoding != NULL` to `s->encoding != Py_None` so the internal "is encoding configured" test matches the new representation.

All 20 members across scanner_members and encoder_members updated in one pass.

Verification:
- Strict CFLAGS on CPython 3.11: -Wall -Wextra -Wshadow -Wstrict-prototypes -Wdeclaration-after-statement -Werror, clean
- Default CFLAGS on CPython 3.14.0rc2 free-threaded: clean
- Full _cibw_runner suite (354 tests, C + pure-Python) on both: OK
- Targeted tests for encoder_dict_iteritems paths: regular dict / OrderedDict / dict subclass / empty dict / sort_keys=True with all string keys (fast path) / mixed string-and-int keys (slow path) / int keys / float keys / skipkeys+non-string / custom item_sort_key / unicode keys, all pass
- encoding=None on Py3 round-trip + bytes-key rejection with encoding=None: behavior preserved
- 16-thread x 5000-iter stress on shared JSONEncoder with sort_keys=True under free threading: no mismatches

https://claude.ai/code/session_011EfS4WKeHCX3xPsmHuvCnz
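The control flow of the #5 fast path (pre-scan the keys, sort in place if all are strings, otherwise fall back to stringify-and-rebuild) can be sketched in plain C. This is a toy model under stated assumptions, not the simplejson implementation: `toy_item`, `toy_is_string_key`, `toy_sort_items_inplace`, and `toy_sorted_items` are hypothetical names, keys are either short strings or integers, and sorting is plain `strcmp` order via `qsort`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { KEY_STR, KEY_INT } toy_key_kind;

/* Hypothetical (key, value) pair; the key is a string or an integer. */
typedef struct {
    toy_key_kind kind;
    char str[24];   /* key text when kind == KEY_STR */
    long num;       /* key value when kind == KEY_INT */
    int value;
} toy_item;

/* Shared "is this already a string key" test used by both paths. */
static int toy_is_string_key(const toy_item *it)
{
    return it->kind == KEY_STR;
}

static int toy_cmp_items(const void *a, const void *b)
{
    return strcmp(((const toy_item *)a)->str, ((const toy_item *)b)->str);
}

/* Shared sort helper used by both paths. */
static void toy_sort_items_inplace(toy_item *items, size_t n)
{
    qsort(items, n, sizeof *items, toy_cmp_items);
}

/* Returns the array to iterate, sorted by key.
 * Fast path: returns `items` itself, sorted in place, no allocation.
 * Slow path: returns a malloc'd copy with non-string keys stringified;
 * the caller frees it when the result differs from `items`.
 * Returns NULL if the slow-path allocation fails. */
static toy_item *toy_sorted_items(toy_item *items, size_t n)
{
    size_t i;
    toy_item *rebuilt;

    /* Pre-scan: bail out at the first non-string key. */
    for (i = 0; i < n; i++)
        if (!toy_is_string_key(&items[i]))
            break;
    if (i == n) {
        toy_sort_items_inplace(items, n);
        return items;
    }

    rebuilt = malloc(n * sizeof *rebuilt);
    if (rebuilt == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        rebuilt[i] = items[i];
        if (!toy_is_string_key(&items[i])) {
            rebuilt[i].kind = KEY_STR;
            snprintf(rebuilt[i].str, sizeof rebuilt[i].str, "%ld", items[i].num);
        }
    }
    toy_sort_items_inplace(rebuilt, n);
    return rebuilt;
}
```

In this model the all-string case never allocates and never touches the stringify branch, while a single integer key sends the whole list down the rebuild path and leaves the input array untouched.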
Summary
This PR adds cross-version helper functions to safely handle dictionary operations in the JSON encoder and scanner, with special support for Python 3.13+ free-threading mode. The changes ensure atomic dictionary lookups and insertions that avoid borrowed reference races under free threading, while maintaining backward compatibility with older Python versions.
Key Changes
New helper functions in _speedups.c:
- json_PyDict_GetItemRef(): Atomically fetches a strong reference to a dictionary item. On Python 3.13+, uses the new PyDict_GetItemRef() API; on older versions, uses PyDict_GetItemWithError() with explicit Py_INCREF().
- json_memo_intern_key(): Atomically interns a key into a memo dictionary using PyDict_SetDefaultRef() on Python 3.13+ or PyDict_SetDefault() with explicit reference management on older versions.

Updated encoder logic in encoder_listencode_dict():
- Uses json_PyDict_GetItemRef() for cache lookups
- Folds the manual lookup-plus-Py_INCREF() pattern into a single atomic call
- Consolidates the Py_CLEAR() calls at appropriate points

Simplified scanner logic in _parse_object():
- Replaces the separate PyDict_GetItemWithError() and PyDict_SetItem() calls with the new json_memo_intern_key() helper

Implementation Details
The changes leverage Python 3.13's new atomic dictionary APIs (PyDict_GetItemRef() and PyDict_SetDefaultRef()), which return strong references directly, eliminating the borrowed reference race conditions that can occur under free threading. For older Python versions, the helpers fall back to the traditional borrowed-ref APIs with explicit reference management, ensuring no behavioral changes.

This approach maintains full backward compatibility while providing thread-safe dictionary operations in Python 3.13+.
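To make the borrowed-versus-strong distinction concrete, the following plain-C sketch models a lookup that hands back a strong reference and reports its outcome as a return code (1 found, 0 missing, -1 error), the convention PyDict_GetItemRef() uses. It is a single-threaded toy under stated assumptions: `toy_obj`, `toy_dict`, `toy_get_borrowed`, `toy_get_ref`, and `toy_del` are hypothetical names, and it models only reference ownership, not the atomicity that the real API provides under free threading.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical refcounted object; `alive` flips to 0 when the count hits 0. */
typedef struct { int refcnt; int alive; } toy_obj;

static void toy_incref(toy_obj *o) { o->refcnt++; }
static void toy_decref(toy_obj *o) { if (--o->refcnt == 0) o->alive = 0; }

typedef struct { const char *key; toy_obj *value; } toy_entry;
typedef struct { toy_entry entries[8]; size_t len; } toy_dict;

/* Legacy shape: the returned pointer is borrowed from the dict and is only
 * valid while the dict still holds the entry. */
static toy_obj *toy_get_borrowed(toy_dict *d, const char *key)
{
    size_t i;
    for (i = 0; i < d->len; i++)
        if (strcmp(d->entries[i].key, key) == 0)
            return d->entries[i].value;
    return NULL;
}

/* Strong-reference shape: 1 = found (*out owns a reference),
 * 0 = missing (*out is NULL), -1 = error (*out is NULL). */
static int toy_get_ref(toy_dict *d, const char *key, toy_obj **out)
{
    toy_obj *v;
    *out = NULL;
    if (d == NULL || key == NULL)
        return -1;
    v = toy_get_borrowed(d, key);
    if (v == NULL)
        return 0;
    toy_incref(v);   /* the caller now owns this reference */
    *out = v;
    return 1;
}

/* Remove an entry, dropping the dict's own reference to the value. */
static void toy_del(toy_dict *d, const char *key)
{
    size_t i;
    for (i = 0; i < d->len; i++) {
        if (strcmp(d->entries[i].key, key) == 0) {
            toy_decref(d->entries[i].value);
            d->entries[i] = d->entries[--d->len];
            return;
        }
    }
}
```

With `toy_get_ref`, the value stays alive after the entry is deleted until the caller releases its own reference, and the caller can branch once on the return code instead of separately checking for a NULL result and a pending error.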
https://claude.ai/code/session_011EfS4WKeHCX3xPsmHuvCnz