[Python] support for complex64 and complex128 as primitive types for zero-copy interop with numpy #39753

@maupardh1

Description

Describe the enhancement requested

Hello Arrow team,

First, I love Arrow - thank you so much for making this great project.

I am manipulating multi-dimensional, time-series-like array data produced by sensors as numpy arrays of type complex64. I would like to manipulate it in Arrow for recording (Feather and/or Parquet formats) and, in the future, for distributed computing (cuDF, Dask, Spark, which are most likely frameworks on top of Arrow, but also CuPy/SciPy). This would also let me attach column names and other schema metadata. I think it could be superior to (and faster than) manipulating raw numpy arrays.

It would be great to just call pa.array(np.array([1 + 2 * 1j, 3 + 4 * 1j], dtype=np.complex64), type=pa.complex64()), but that type doesn't exist in Arrow. I haven't found a way to move a complex64 numpy array into a pyarrow array without copying: my understanding is that only primitive types support zero-copy between Arrow and numpy, and my attempts with pa.binary(8) or a struct type on top of numpy views have so far resulted in copies. I would also need to read the data back from Feather/Parquet and, when needed, convert it to a numpy array, ending up back at np.complex64.

I think this has come up once or twice already: I found this thread: https://www.mail-archive.com/dev@arrow.apache.org/msg23352.html and this PR: #10452, and thought I would also +1 the request, just in case.

If first-class support isn't planned, do you see an alternative way to get zero-copy behavior?

Thanks!

Component(s)

Python
