NumPy 2.xx.x Release Notes#

Highlights#

We’ll choose highlights for this release near the end of the release cycle.

Deprecations#

Setting the `strides` attribute is deprecated#

Setting the strides attribute is now deprecated, since mutating an array in place is unsafe when the array is shared, especially across multiple threads. As an alternative, you can create a new view (no copy) via:

- np.lib.stride_tricks.sliding_window_view, if applicable,
- np.lib.stride_tricks.as_strided, for the general case,
- or the np.ndarray constructor (with the original array as buffer), for a light-weight version.
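
A minimal sketch of the three alternatives, assuming a small one-dimensional int64 array; the variable names are illustrative only, and with as_strided or the ndarray constructor the caller remains responsible for keeping shape and strides within the bounds of the original buffer:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

a = np.arange(6, dtype=np.int64)
step = a.itemsize  # bytes between consecutive elements

# Instead of assigning to `a.strides`, build a new view over the same memory.
# Overlapping windows of length 3 (shape (4, 3), no copy).
windows = sliding_window_view(a, window_shape=3)

# General case: the same windows, with shape and strides spelled out by hand.
windows_general = as_strided(a, shape=(4, 3), strides=(step, step), writeable=False)

# Light-weight variant: the ndarray constructor with `a` as the buffer,
# here selecting every other element.
every_other = np.ndarray(shape=(3,), dtype=a.dtype, buffer=a, strides=(2 * step,))
```

All three results are views, so writes to a are visible through them.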

(gh-28925)

`align=` must be passed as boolean to `np.dtype()`#

When creating a new dtype, a VisibleDeprecationWarning is given if align= is not a boolean. This is mainly to prevent accidentally passing a subarray shape as the align flag, where it has no effect, such as np.dtype("f8", 3) instead of np.dtype(("f8", 3)). We strongly suggest always passing align= as a keyword argument.
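
As an illustration of the recommended spelling, the sketch below passes align= by keyword and writes a subarray dtype as a single tuple argument. The field names are made up for the example, and the aligned item size is platform dependent:

```python
import numpy as np

fields = [("flag", "u1"), ("value", "f8")]

packed = np.dtype(fields)               # default: align=False, no padding
aligned = np.dtype(fields, align=True)  # boolean, passed by keyword

# A subarray dtype is one tuple argument, not a second positional argument.
subarray = np.dtype(("f8", 3))
```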

(gh-29301)

Compatibility notes#

NumPy’s C extension modules have begun to use multi-phase initialisation, as defined by PEP 489. As part of this, a new explicit check has been added that each such module is only imported once per Python process. As a side effect, deleting numpy from sys.modules and re-importing it will now fail with an ImportError. Doing so has always been unsafe and could have unexpected side effects, but it did not previously raise an error.

(gh-29030)

The Macro NPY_ALIGNMENT_REQUIRED has been removed#

The macro was defined in the npy_cpu.h file, so might be regarded as semipublic. As it turns out, with modern compilers and hardware it is almost always the case that alignment is required, so numpy no longer uses the macro. It is unlikely anyone uses it, but you might want to compile with the -Wundef flag or equivalent to be sure.

(gh-29094)

New Features#

np.size now accepts multiple axes.
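
The note does not show the new call. Assuming the multi-axis result is the product of the array's lengths along the given axes, the expected count can be sketched from the shape alone, which also runs on versions without this feature. The helper name size_over_axes is hypothetical and not part of NumPy:

```python
import math
import numpy as np


def size_over_axes(a, axes):
    """Hypothetical helper: number of elements of `a` along the given axes."""
    shape = np.shape(a)
    return math.prod(shape[ax] for ax in axes)


a = np.zeros((2, 3, 4))
count = size_over_axes(a, (0, 2))  # product of the lengths of axes 0 and 2
```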

(gh-29240)

Improvements#

Fix `flatiter` indexing edge cases#

The flatiter object now shares the same index preparation logic as ndarray, ensuring consistent behavior and fixing several issues where invalid indices were previously accepted or misinterpreted.

Key fixes and improvements:

Stricter index validation:

- Boolean non-array indices like arr.flat[[True, True]] were incorrectly treated as arr.flat[np.array([1, 1], dtype=int)]. They now raise an index error. Note that indices that match the iterator’s shape are expected to not raise in the future and be handled as regular boolean indices. Use np.asarray(<index>) if you want to match that behavior.
- Float non-array indices were also cast to integer and incorrectly treated as arr.flat[np.array([1.0, 1.0], dtype=int)]. This is now deprecated and will be removed in a future version.
- 0-dimensional boolean indices like arr.flat[True] are also deprecated and will be removed in a future version.

Consistent error types:

Certain invalid flatiter indices that previously raised ValueError now correctly raise IndexError, aligning with ndarray behavior.

Improved error messages:

The error message for unsupported index operations now provides more specific details, including explicitly listing the valid index types, instead of the generic IndexError: unsupported index operation.
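
A short sketch of index forms that avoid the changed cases, assuming a small 2-by-3 array: integer positions are passed as an integer array, and a boolean list is converted with np.asarray to a mask whose length matches the number of elements.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

# Integer positions: pass integers (or an integer array), not floats.
picked = arr.flat[np.array([1, 4])]

# Boolean selection: convert the list to an array with one entry per element.
mask = np.asarray([True, False, True, False, True, False])
masked = arr.flat[mask]
```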

(gh-28590)

Improved error message for assert_array_compare#

The error message generated by assert_array_compare, which is used by functions like assert_allclose and assert_array_less, now also includes information about the indices at which the assertion fails.
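
To inspect the message on a given installation, a failing comparison can be caught and its text examined. This is only a sketch with made-up values; the exact wording of the message depends on the NumPy version:

```python
import numpy as np
from numpy.testing import assert_allclose

actual = np.array([1.0, 2.0, 3.5])
desired = np.array([1.0, 2.0, 3.0])

message = None
try:
    assert_allclose(actual, desired, rtol=1e-7)
except AssertionError as exc:
    # On versions with this change, the text also reports the mismatching indices.
    message = str(exc)
```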

(gh-29112)

Show unit information in `repr` for `datetime64("NaT")`#

When a datetime64 object is “Not a Time” (NaT), its __repr__ method now includes the time unit of the datetime64 type. This makes it consistent with the behavior of a timedelta64 object.
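
The two representations can be compared directly. The sketch below assumes nanosecond units and leaves the exact output unstated, since it differs between NumPy versions:

```python
import numpy as np

nat_dt = np.datetime64("NaT", "ns")
nat_td = np.timedelta64("NaT", "ns")

# With this change, both representations are expected to mention the unit.
text_dt = repr(nat_dt)
text_td = repr(nat_td)
```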

(gh-29396)

Performance improvements and changes#

Performance improvements to `np.unique` for string dtypes#

The hash-based algorithm for unique extraction provides an order-of-magnitude speedup on large string arrays. In an internal benchmark with about 1 billion string elements, the hash-based np.unique completed in roughly 33.5 seconds, compared to 498 seconds with the sort-based method – about 15× faster for unsorted unique operations on strings. This improvement greatly reduces the time to find unique values in very large string datasets.

(gh-28767)

Changes#

Multiplication between a string and integer now raises OverflowError instead of MemoryError if the result of the multiplication would create a string that is too large to be represented. This follows Python’s behavior.

(gh-29060)

`unique_values` for string dtypes may return unsorted data#

np.unique now supports hash-based duplicate removal for string dtypes. This enhancement extends the hash-table algorithm to byte strings (‘S’), Unicode strings (‘U’), and the experimental string dtype (‘T’, StringDType). As a result, calling np.unique() on an array of strings will use the faster hash-based method to obtain unique values. Note that this hash-based method does not guarantee that the returned unique values will be sorted. This also works for StringDType arrays containing None (missing values) when using equal_nan=True (treating missing values as equal).
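
Code that depends on the unique values being ordered can sort the result explicitly. A minimal sketch, assuming a small Unicode string array; the extra sort is harmless when the result is already sorted:

```python
import numpy as np

words = np.array(["pear", "apple", "pear", "fig", "apple"])

# Sort explicitly rather than relying on the order of the unique values.
unique_sorted = np.sort(np.unique(words))
```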

(gh-28767)

Fix bug in `matmul` for non-contiguous out kwarg parameter#

In some cases, if out was non-contiguous, np.matmul would cause memory corruption or a C-level assertion failure. This bug was introduced in v2.3.0 and fixed in v2.3.1.

(gh-29179)

`__array_interface__` with NULL pointer changed#

The array interface now accepts NULL pointers (NumPy will do its own dummy allocation, though). Previously, these incorrectly triggered an undocumented scalar path. In the unlikely event that the scalar path was actually desired, you can (for now) achieve the previous behavior via the correct scalar path by not providing a data field at all.

(gh-29338)
