Support pandas 3.0 #68

Open
@dxdc

Description

Calling NumPy's array_split on a pandas DataFrame or Series now triggers deprecation warnings.

dfs = array_split(df_or_series, n_chunks, axis=opposite_axis)

With recent versions of NumPy and pandas, this call emits warnings such as:

'Series.swapaxes' is deprecated and will be removed in a future version:FutureWarning
'DataFrame.swapaxes' is deprecated and will be removed in a future version:FutureWarning

Some more details here: numpy/numpy#23217 and numpy/numpy#24889, in particular this comment: numpy/numpy#24889 (comment) for a proposed resolution.

The explanation here is that np.array_split somewhat magically works on a pandas DataFrame because its implementation only uses features that happen to work the same on an array or a DataFrame. One of those is the np.swapaxes function, which, when called on a DataFrame, calls that DataFrame's swapaxes method. That method is deprecated (for a DataFrame, which is 2D, it does nothing different from transpose), so once it is removed from pandas (probably in pandas 3.0), calling np.array_split on a DataFrame will also stop working.

It's unclear whether pandas (or NumPy) will address this at some point.
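
In the meantime, one possible way to avoid the swapaxes code path is to compute the chunk boundaries directly and slice with iloc. The sketch below is only an illustration, not necessarily the resolution proposed in the linked comment; the helper name split_frame is hypothetical, and it assumes chunk sizes should match np.array_split (the first `length % n_chunks` chunks get one extra element).

```python
import numpy as np
import pandas as pd


def split_frame(obj, n_chunks, axis=0):
    """Split a DataFrame or Series into n_chunks pieces along `axis`.

    Hypothetical helper: slices with .iloc, so it never calls swapaxes.
    For a Series, only axis=0 is meaningful.
    """
    length = obj.shape[axis]
    # Mirror np.array_split sizing: the first `extra` chunks are one longer.
    base, extra = divmod(length, n_chunks)
    sizes = [base + 1] * extra + [base] * (n_chunks - extra)
    bounds = np.cumsum([0] + sizes)

    chunks = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        if axis == 0:
            chunks.append(obj.iloc[start:stop])      # split rows
        else:
            chunks.append(obj.iloc[:, start:stop])   # split columns
    return chunks


df = pd.DataFrame({"a": range(10), "b": range(10)})
row_chunks = split_frame(df, 3)          # chunks of 4, 3 and 3 rows
col_chunks = split_frame(df, 2, axis=1)  # one column per chunk
```

With this approach, the call in question would become something like `dfs = split_frame(df_or_series, n_chunks, axis=opposite_axis)`, assuming opposite_axis is 0 or 1.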

Labels: bug, help wanted