Introduction
In NumPy, manipulating the dimensionality of arrays unlocks powerful ways to reshape and adapt data for machine learning models, matrix operations, and broadcasting. Whether you're expanding a 1D array into a column or flattening an array for simplified computations, knowing how to add or remove dimensions is essential.
Why Adjust Array Dimensions?
Sometimes, your data comes in a shape that's not compatible with operations like matrix multiplication, broadcasting, or stacking. That’s when we need to intentionally add or remove dimensions to make our data shape-ready.
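As a small illustration of such a mismatch, the sketch below scales each row of a 2D array by a per-row factor. The array names and sizes are made up for the example; the point is only that a shape of (4,) does not broadcast against (4, 3) until a trailing axis is added.

```python
import numpy as np

features = np.ones((4, 3))                          # hypothetical data: 4 samples, 3 features
per_sample_scale = np.array([1.0, 2.0, 3.0, 4.0])   # shape (4,)

try:
    features * per_sample_scale                     # (4, 3) vs (4,): trailing sizes 3 and 4 differ
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# A trailing axis gives shape (4, 1), which broadcasts across the 3 columns
scaled = features * per_sample_scale[:, np.newaxis]

print(broadcast_failed)  # True
print(scaled.shape)      # (4, 3)
```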
Adding Dimensions: The Right Way
There are two common ways to add a dimension to an array:
1. Using np.newaxis
np.newaxis is an alias for None that, when used inside an index expression, inserts a new axis (i.e., dimension) of size 1 at that position.
import numpy as np
arr = np.array([10, 20, 30])
print(arr.shape) # (3,)
# Add a new axis to make it a 2D column vector
col_vector = arr[:, np.newaxis]
print(col_vector)
print(col_vector.shape) # (3, 1)
Explanation:
The original array is 1D with shape (3,). By using np.newaxis, we inserted a new axis at position 1 (after the existing axis), turning it into a column vector with shape (3, 1).
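The position of np.newaxis in the index decides where the new axis lands. The following sketch reuses the same three-element array to build both a row and a column, then combines them through broadcasting; the variable names are chosen only for illustration.

```python
import numpy as np

arr = np.array([10, 20, 30])

row = arr[np.newaxis, :]   # new axis first -> shape (1, 3)
col = arr[:, np.newaxis]   # new axis last  -> shape (3, 1)

# np.newaxis is None, so indexing with None is equivalent
same_col = arr[:, None]

# Broadcasting (3, 1) against (1, 3) yields a (3, 3) table of pairwise sums
pairwise = col + row

print(row.shape, col.shape)  # (1, 3) (3, 1)
print(pairwise.shape)        # (3, 3)
```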
2. Using np.expand_dims()
This function states the intent explicitly, which can be easier to read in larger codebases, and it produces the same result as indexing with np.newaxis.
row_vector = np.expand_dims(arr, axis=0)
print(row_vector)
print(row_vector.shape) # (1, 3)
Explanation:
np.expand_dims adds a new dimension at the specified axis. Here, axis=0 inserts the new axis at the front, converting the 1D array into a 2D row vector with shape (1, 3).
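To make the relationship with np.newaxis concrete, here is a short sketch comparing the two and trying a few other axis values. Passing a tuple of axes assumes NumPy 1.18 or newer; the variable names are illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30])

col_a = np.expand_dims(arr, axis=1)   # shape (3, 1)
col_b = arr[:, np.newaxis]            # shape (3, 1), built by indexing instead
print(np.array_equal(col_a, col_b))   # True

# A negative axis counts from the end of the result
last = np.expand_dims(arr, axis=-1)
print(last.shape)                     # (3, 1)

# A tuple inserts several axes at once (assumes NumPy >= 1.18)
both = np.expand_dims(arr, axis=(0, 2))
print(both.shape)                     # (1, 3, 1)
```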
Removing Dimensions: Flatten It Out
Sometimes, arrays have extra dimensions that we don’t need, such as when you load grayscale image data with shape (28, 28, 1). To remove such dimensions, use:
1. Using np.squeeze()
arr_3d = np.array([[[5], [10], [15]]]) # Shape: (1, 3, 1)
print(arr_3d.shape)
squeezed = np.squeeze(arr_3d)
print(squeezed)
print(squeezed.shape) # (3,)
Explanation:
np.squeeze removes all axes with size 1. It’s perfect when you want to collapse unnecessary singleton dimensions in an array.
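Applied to the grayscale image shape mentioned earlier, a minimal sketch looks like this. The zero-filled array stands in for real image data. It also shows that the result of np.squeeze shares memory with its input, so writing to one is visible in the other.

```python
import numpy as np

image = np.zeros((28, 28, 1))     # placeholder for a loaded grayscale image
flat_image = np.squeeze(image)    # trailing size-1 channel axis removed
print(flat_image.shape)           # (28, 28)

# With no size-1 axes left, squeezing again changes nothing
print(np.squeeze(flat_image).shape)  # (28, 28)

# The squeezed array is a view: a write through it shows up in the original
flat_image[0, 0] = 1.0
print(image[0, 0, 0])             # 1.0
```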
Be Careful With Squeeze
If you accidentally squeeze an axis you need later, it might break broadcasting. If you want to remove a specific axis, use:
np.squeeze(arr_3d, axis=0) # Removes only axis 0; raises ValueError if its size is not 1
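The sketch below illustrates both behaviours on a hypothetical batch holding a single image. The (batch, height, width, channels) layout is an assumption made for the example.

```python
import numpy as np

batch = np.zeros((1, 28, 28, 1))   # hypothetical batch of one grayscale image

# With no axis argument, the batch axis disappears along with the channel axis
print(np.squeeze(batch).shape)            # (28, 28)

# Naming the axis keeps the batch dimension intact
channel_removed = np.squeeze(batch, axis=-1)
print(channel_removed.shape)              # (1, 28, 28)

# Asking to squeeze an axis whose size is not 1 raises ValueError
try:
    np.squeeze(batch, axis=1)
    squeeze_error = False
except ValueError:
    squeeze_error = True
print(squeeze_error)                      # True
```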
Real-Life Use Case
When preparing data for machine learning, especially in libraries like TensorFlow or PyTorch, it’s common to expand dimensions to fit batch processing or flatten arrays before feeding them into dense layers.
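A NumPy-only sketch of both steps might look like the following. The (batch, height, width, channels) layout is an assumption; frameworks differ on axis order, so check what your model expects.

```python
import numpy as np

image = np.zeros((28, 28))   # placeholder for one grayscale image

# Add a leading batch axis and a trailing channel axis in one indexing step
batched = image[np.newaxis, ..., np.newaxis]
print(batched.shape)         # (1, 28, 28, 1)

# Flatten everything except the batch axis, as a dense layer would expect
flat = batched.reshape(batched.shape[0], -1)
print(flat.shape)            # (1, 784)
```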
Checklist for Dimensional Manipulation
- Use np.newaxis or expand_dims to reshape 1D to 2D.
- Use squeeze only when you're sure the axis size is 1.
- Always verify shapes using array.shape after modification.
- Avoid reshaping blindly; check that the result is consistent with your downstream operations.
Verification Tips
After any dimensional operation, always check:
print(arr.shape)
print(arr.ndim)
This helps confirm the array’s dimensional integrity before applying any further computations.
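In longer pipelines, printing can be replaced by an explicit check that fails early. The helper below, expect_shape, is a hypothetical function written for this example and is not part of NumPy.

```python
import numpy as np

def expect_shape(array, expected):
    """Hypothetical helper: return the array unchanged, or raise if its shape differs."""
    if array.shape != tuple(expected):
        raise ValueError(f"expected shape {tuple(expected)}, got {array.shape}")
    return array

arr = np.array([10, 20, 30])
col = expect_shape(arr[:, np.newaxis], (3, 1))   # passes
print(col.shape, col.ndim)                       # (3, 1) 2

try:
    expect_shape(np.squeeze(col), (3, 1))        # squeeze gives (3,), so this raises
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
print(mismatch_caught)                           # True
```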
Summary
Whether you're building machine learning pipelines, processing image data, or preparing matrices, mastering how to add and remove dimensions will make your NumPy workflow more flexible and powerful.
These tools aren't just conveniences—they’re fundamentals. With just a few lines, you unlock shapes that seamlessly fit into the broader universe of numerical computing.