memoryview.cast("?") has undocumented undefined behavior #148390

@picnixz

Bug report

Bug description:

I am opening this issue as a follow-up to #83870 and https://discuss.python.org/t/behavior-of-struct-format-native-bool/3774, which I came across while investigating #148286. Here is the problem:

>>> import struct
>>> struct.pack("????", 0, 1, 2, 3)
b'\x00\x01\x01\x01'
>>> struct.unpack("????", b"\0\1\2\3")
(False, True, True, True)

For struct.pack, each argument is converted through bool(object) to \x00 or \x01. For struct.unpack, the byte \x00 becomes False and any other one-byte value becomes True. The docs say:

For the '?' format character, the return value is either True or False. When packing, the truth value of the argument object is used. Either 0 or 1 in the native or standard bool representation will be packed, and any non-zero value will be True when unpacking.

but, as the DPO thread points out, this wording may be confusing. I think it makes more sense to say that packing with ? is more or less equivalent to converting the object to bool and then to the C _Bool (in Python, there is a well-defined conversion from any object to our own bool type).
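The proposed wording can be checked with a small sketch. The helper name pack_bool_like is hypothetical and exists only for illustration, and the sketch assumes a platform where the native _Bool is one byte wide:

```python
import struct

def pack_bool_like(obj):
    # Hypothetical helper (not a real API): model packing with "?" as
    # bool(obj) followed by a conversion to a single 0/1 byte.
    # Assumes sizeof(_Bool) == 1 on this platform.
    return bytes([bool(obj)])

# Packing: only the truth value of the argument matters.
for value in (0, 1, 2, 3, "", "x", [], [0], None):
    assert struct.pack("?", value) == pack_bool_like(value)

# Unpacking: \x00 maps to False, every other one-byte value maps to True.
for byte in range(256):
    (result,) = struct.unpack("?", bytes([byte]))
    assert result is (byte != 0)
```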

However, for memoryview, we have the following:

>>> memoryview(b'\0\1\2\3').cast("?").tolist()
[False, True, False, True]

In particular, memoryview items appear to be reduced modulo 2 instead. However, this result relies on undefined behavior, because tolist() does the following for each memoryview item:

_Bool c;
memcpy((char *)&c, ptr, sizeof(_Bool));

The above is undefined behavior whenever the copied byte is neither 0 nor 1, because c then does not hold a valid _Bool value. I think memoryview.cast(fmt).tolist() should behave as if we were doing struct.unpack(fmt * len(mv), mv.obj) instead. If this is not desired, we should at least document the discrepancy. Unfortunately, PEP 3118 does not address this (AFAIK).
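As a rough sketch of the suggested behavior, the helper below decodes a memoryview through struct.unpack rather than through cast("?"). The name tolist_like_struct is hypothetical, and the sketch assumes a one-dimensional view and a one-byte native _Bool:

```python
import struct

def tolist_like_struct(mv):
    # Hypothetical helper (not an existing API): decode the bytes of a
    # memoryview as bools using struct.unpack semantics, so that any
    # non-zero byte becomes True.
    raw = mv.tobytes()
    return list(struct.unpack(f"{len(raw)}?", raw))

print(tolist_like_struct(memoryview(b"\0\1\2\3")))
# [False, True, True, True]
```

For buffers that contain only \x00 and \x01, this should agree with memoryview.cast("?").tolist(); the two differ only for the byte values where the current behavior is undefined.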

cc @encukou @vstinner @serhiy-storchaka @ngoldbaum

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Labels

interpreter-core (Objects, Python, Grammar, and Parser dirs), topic-C-API, type-bug (An unexpected behavior, bug, or error)
