Introduction to NumPy Random Module
When working with data, randomness isn’t chaos; it's opportunity. The numpy.random module helps us generate random numbers, simulate events, create reproducible experiments, and more. Whether you're training a machine learning model or simulating dice rolls, understanding this module is essential.
Importing the Random Module
import numpy as np
You don't import the random module separately; it's part of NumPy. Once NumPy is imported, np.random becomes accessible.
Generating Random Numbers
1. np.random.rand()
Generates random floats in the half-open interval [0.0, 1.0). Pass one or more dimension sizes as separate arguments to control the shape of the result.
np.random.rand(3)
Output: An array of 3 random float numbers between 0 and 1.
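To make the "one or more dimensions" point concrete, here is a small sketch. The variable names are just for illustration; the key detail is that rand() takes each dimension as its own argument rather than as a tuple.

```python
import numpy as np

# Dimensions are passed as separate positional arguments, not as a tuple.
vector = np.random.rand(3)      # one dimension: shape (3,)
matrix = np.random.rand(2, 4)   # two dimensions: shape (2, 4)

print(vector.shape, matrix.shape)

# Every value lies in the half-open interval [0.0, 1.0).
in_range = bool(((matrix >= 0.0) & (matrix < 1.0)).all())
print(in_range)
```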
2. np.random.randint()
Generates random integers from a low (inclusive) to a high (exclusive) range.
np.random.randint(1, 10, size=5)
Output: An array of 5 random integers from 1 to 9 (10 is never returned).
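The exclusive upper bound is easy to forget, so a quick check helps. The sketch below simulates a six-sided die (an illustrative choice, not something randint() requires) and confirms that the high value itself never shows up.

```python
import numpy as np

# To roll a six-sided die, high must be 7 because the upper bound is excluded.
rolls = np.random.randint(1, 7, size=1000)
print(rolls.min(), rolls.max())

# With a single bound, the range starts at 0: values here are 0..4.
small = np.random.randint(5, size=10)
print(small)
```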
3. np.random.randn()
Generates samples from the "standard normal" distribution (mean = 0, std = 1).
np.random.randn(4)
Output: 4 numbers centered around 0 (some may be negative or above 1).
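If you need a normal distribution with a different mean or spread, one common approach is to scale and shift the output of randn(). The values of mu and sigma below are hypothetical, and the seed is fixed only so the sketch is repeatable.

```python
import numpy as np

np.random.seed(0)  # fixed seed, only to make this sketch repeatable

mu, sigma = 170.0, 10.0  # hypothetical target mean and standard deviation

# randn() gives mean 0 and std 1; multiplying and adding rescales the samples.
samples = mu + sigma * np.random.randn(10_000)

# The sample mean and std should land close to mu and sigma.
print(samples.mean(), samples.std())
```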
Random Choice: Simulating Real-World Events
Want to simulate a dice roll? A lottery pick? Use np.random.choice().
np.random.choice([1, 2, 3, 4, 5, 6], size=3)
Output: Randomly selects 3 values from the list. Sampling is with replacement by default, so the same value can appear more than once.
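Dice rolls can repeat, but a lottery pick usually should not. Because choice() samples with replacement unless told otherwise, a lottery draw can be sketched with replace=False. The 1 to 49 pool and the draw of 6 are assumptions made for this example.

```python
import numpy as np

# Hypothetical lottery pool: the numbers 1 through 49.
pool = np.arange(1, 50)

# replace=False guarantees that no number is drawn twice.
draw = np.random.choice(pool, size=6, replace=False)
print(draw)
```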
With Probability
np.random.choice(['red', 'blue'], p=[0.8, 0.2], size=10)
Output: An array where 'red' appears roughly 80% of the time.
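With only 10 draws the split can stray quite far from 80/20. A rough way to see the probabilities take effect is to draw many more samples and count them, as in the sketch below. The sample size and the seed are arbitrary choices.

```python
import numpy as np

np.random.seed(1)  # fixed seed, only to make this sketch repeatable

colors = np.random.choice(['red', 'blue'], p=[0.8, 0.2], size=10_000)

# Count how often each label was drawn and convert to fractions.
values, counts = np.unique(colors, return_counts=True)
freq = dict(zip(values.tolist(), (counts / counts.sum()).tolist()))
print(freq)
```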
Seeding: Making Random Predictable
To ensure your code gives the same random result every time (useful in debugging), use np.random.seed().
np.random.seed(42)
print(np.random.rand(3))
Output: Always gives the same three numbers after setting the seed.
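It can help to think of the seed as the starting point of a sequence. The sketch below illustrates that calls after seeding keep moving along the sequence, while re-seeding with the same value starts it over.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(3)
second = np.random.rand(3)   # continues the sequence, so it differs from first

np.random.seed(42)           # same seed again: the sequence restarts
replay = np.random.rand(3)

same_after_reseed = np.array_equal(first, replay)
same_without_reseed = np.array_equal(first, second)
print(same_after_reseed, same_without_reseed)
```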
Shuffling Data
Randomly shuffle the contents of an array using np.random.shuffle().
arr = np.array([1, 2, 3, 4])
np.random.shuffle(arr)
print(arr)
Output: Elements of arr rearranged in a random order. Note: shuffle() modifies the array in-place.
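Because shuffle() works in-place, it returns None, and assigning its result back to a variable is a common slip. If you want to keep the original order as well, np.random.permutation() is one alternative that returns a shuffled copy. The sketch below contrasts the two.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
result = np.random.shuffle(arr)   # arr itself is reordered
print(result)                     # None: shuffle() returns nothing
print(arr)

original = np.array([1, 2, 3, 4])
shuffled_copy = np.random.permutation(original)  # new array; original is untouched
print(original, shuffled_copy)
```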
Common Pitfalls & Checks
- Seed properly: If you need reproducible results, set the seed once at the start of your script. Re-seeding before every call restarts the sequence, so repeated calls return the same values.
- Correct range: Remember that randint() is exclusive on the upper bound.
- Probability sums: When using p= in choice(), ensure the sum of all probabilities equals 1.
- Data type mismatch: randint() returns integers; rand() returns floats.
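Two of these checks are easy to try for yourself. The sketch below passes deliberately wrong probabilities to choice() to show that NumPy rejects them, then inspects the data types returned by randint() and rand(). The variable names are illustrative.

```python
import numpy as np

# Probabilities that do not sum to 1 are rejected with a ValueError.
try:
    np.random.choice(['red', 'blue'], p=[0.8, 0.3])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# randint() yields an integer dtype, rand() a floating-point dtype.
ints = np.random.randint(1, 10, size=5)
floats = np.random.rand(5)
print(ints.dtype, floats.dtype)
```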
Verification Tip
To see the effect of seeding, run the same function multiple times with and without setting a seed. You’ll see the difference in reproducibility.
np.random.seed(100)
print(np.random.randint(1, 5, size=4))
# Run again with same seed to verify
np.random.seed(100)
print(np.random.randint(1, 5, size=4))
Output: Both arrays will match exactly. If you omit the seed, the arrays will usually differ from one run to the next.
Conclusion
The numpy.random module is more than just a tool for chaos. It's your gateway to controlled randomness in simulations, modeling, and algorithm development. With power comes responsibility: always be mindful of seeding and reproducibility.
Next up, we'll look at sorting, searching, and data exploration using NumPy utilities.