Bug report
Bug description:
I am opening this issue as a follow-up to #83870 and https://discuss.python.org/t/behavior-of-struct-format-native-bool/3774, which came up while I was investigating #148286. Here is the problem:
>>> import struct
>>> struct.pack("????", 0, 1, 2, 3)
b'\x00\x01\x01\x01'
>>> struct.unpack("????", b"\0\1\2\3")
(False, True, True, True)
For struct.[un]pack, packing converts bool(object) to \x00 or \x01, and unpacking converts \x00 to False and any other 1-byte value to True. The docs say:
For the '?' format character, the return value is either True or False. When packing, the truth value of the argument object is used. Either 0 or 1 in the native or standard bool representation will be packed, and any non-zero value will be True when unpacking.
but, as the DPO thread says, this may be confusing. I think it makes more sense to say that packing to ? (for Python) is more or less equivalent to casting the object to bool and then to the C _Bool (in Python, there is a well-defined conversion from any object to our own bool type).
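To make the proposed wording concrete, here is a minimal sketch of that equivalence. The helper name `pack_bool` is hypothetical and only used for illustration; it assumes a platform where the native `_Bool` occupies one byte, which is the common case.

```python
import struct


def pack_bool(obj):
    # Hypothetical helper: convert the object to Python's bool first,
    # then emit a single byte holding 0 or 1, as a C _Bool would.
    return bytes([int(bool(obj))])


# Objects with various truth values, not only 0 and 1.
samples = [0, 1, 2, 3, -1, 0.0, "", "x", [], [0], None]

for obj in samples:
    # struct.pack("?", obj) uses the truth value of obj, so both agree.
    assert struct.pack("?", obj) == pack_bool(obj)
```

Under that reading, the value 2 packs to b'\x01' simply because bool(2) is True; no byte-level truncation of the integer is involved.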
However, for memoryview, we have the following:
>>> memoryview(b'\0\1\2\3').cast("?").tolist()
[False, True, False, True]
In particular, memoryview items are reduced modulo $2$ instead. However, this is undefined behavior, because tolist() performs the following on each memoryview item:
_Bool c;
memcpy((char *)&c, ptr, sizeof(_Bool));
The above is undefined behavior. I think memoryview.cast(fmt).tolist() should behave as if we were doing struct.unpack(fmt * len(mv), mv.obj) instead. If this is not desired, we should at least document the discrepancy. Unfortunately, PEP 3118 does not address this (AFAIK).
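As a rough illustration of the proposed semantics, the sketch below computes what tolist() would return if it followed struct.unpack. The function name `unpack_bools` is hypothetical, and the sketch assumes a one-byte native `_Bool` so that the item count equals the byte length.

```python
import struct


def unpack_bools(buf):
    # Hypothetical reference behavior: treat every byte as one '?' item
    # and let struct decide, so any non-zero byte becomes True.
    data = bytes(buf)
    return list(struct.unpack(f"{len(data)}?", data))


# Same input bytes as in the memoryview example above.
print(unpack_bools(memoryview(b"\0\1\2\3")))  # [False, True, True, True]
```

This only shows the target result; it does not call memoryview.cast("?").tolist() itself, since the outcome of that call is exactly what is in question here.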
cc @encukou @vstinner @serhiy-storchaka @ngoldbaum
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response