Boolean Indexing in NumPy
In real data analysis, you often need to filter data based on conditions. For example, selecting values greater than a threshold, removing invalid values, or extracting rows that meet specific criteria.
NumPy provides Boolean Indexing to perform such filtering efficiently.
What Is Boolean Indexing?
Boolean indexing selects elements from an array using an array of boolean values (True or False) with the same shape.
The condition is evaluated element-wise, and only the elements at positions where it is True are returned.
Basic Boolean Indexing Example
Let’s start with a simple numeric array.
import numpy as np
arr = np.array([5, 12, 18, 25, 30])
print(arr[arr > 15])
Output:
[18 25 30]
Here, the condition arr > 15 keeps only the values greater than 15.
Understanding the Boolean Mask
The expression arr > 15 itself evaluates to a boolean array, called a mask, with one True or False entry per element.
mask = arr > 15
print(mask)
Output:
[False False  True  True  True]
NumPy then uses this mask to extract only the values where the condition is True.
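As a small sketch of what the mask is and what can be done with it, the snippet below inspects its dtype and shape and counts the True entries. The variable names selected and count are illustrative only.

```python
import numpy as np

arr = np.array([5, 12, 18, 25, 30])
mask = arr > 15                      # element-wise comparison yields a boolean array

print(mask.dtype)                    # bool
print(mask.shape == arr.shape)       # True: one flag per element

selected = arr[mask]                 # same result as arr[arr > 15]
count = int(np.count_nonzero(mask))  # number of True entries in the mask

print(selected)                      # [18 25 30]
print(count)                         # 3
```

Storing the mask in a variable is handy when the same condition is reused for both selecting and counting.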
Using Multiple Conditions
You can combine multiple conditions using element-wise logical operators:
- & (AND)
- | (OR)
- ~ (NOT)
print(arr[(arr > 10) & (arr < 30)])
Output:
[12 18 25]
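Parentheses around each condition are mandatory, because & and | bind more tightly than comparison operators such as > and <. Python's and, or, and not keywords cannot be used here, since they do not operate element-wise on arrays.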
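The OR and NOT operators follow the same pattern as AND. The sketch below shows both, and also what happens when the parentheses are left out; the variable names are illustrative.

```python
import numpy as np

arr = np.array([5, 12, 18, 25, 30])

# OR: keep values at either end of the range
outer = arr[(arr < 10) | (arr > 25)]
print(outer)        # [ 5 30]

# NOT: invert a condition
not_large = arr[~(arr > 15)]
print(not_large)    # [ 5 12]

# Without parentheses, Python reads this as arr > (10 & arr) < 30,
# a chained comparison that needs a single True/False value and fails.
try:
    arr[arr > 10 & arr < 30]
    raised = False
except ValueError:
    raised = True
print(raised)       # True
```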
Boolean Indexing with Two-Dimensional Arrays
Boolean indexing works the same way with 2D arrays.
matrix = np.array([[10, 20, 30],
                   [5, 15, 25],
                   [0, 8, 40]])
print(matrix[matrix >= 20])
Output:
[20 30 25 40]
The condition is applied to every element across all rows and columns, and the matching values are returned as a flat one-dimensional array in row-by-row order.
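To keep whole rows instead of individual values, a one-dimensional mask built from a single column can be used as the row index. This is a minimal sketch; the choice of the first column and the threshold 5 are assumptions made for illustration.

```python
import numpy as np

matrix = np.array([[10, 20, 30],
                   [5, 15, 25],
                   [0, 8, 40]])

# A mask with the same shape as the matrix returns a flat 1D result
flat = matrix[matrix >= 20]
print(flat.ndim)    # 1

# A 1D mask with one entry per row selects entire rows
row_mask = matrix[:, 0] >= 5    # test only the first column
rows = matrix[row_mask]
print(rows)
# [[10 20 30]
#  [ 5 15 25]]
```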
Replacing Values Using Boolean Indexing
Boolean indexing can also appear on the left-hand side of an assignment, which modifies the matching elements in place.
arr[arr < 15] = 0
print(arr)
Output:
[ 0  0 18 25 30]
This technique is commonly used in data cleaning and preprocessing.
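When the original values should be preserved, one option is np.where, which builds a new array instead of overwriting the existing one. The sketch below assumes the same starting array as the first example; the name cleaned is illustrative.

```python
import numpy as np

arr = np.array([5, 12, 18, 25, 30])

# np.where(condition, value_if_true, value_if_false) returns a new array
cleaned = np.where(arr < 15, 0, arr)

print(cleaned)    # [ 0  0 18 25 30]
print(arr)        # [ 5 12 18 25 30]  (original is unchanged)
```

Working on arr.copy() and assigning through a mask would achieve the same result.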
Real-World Use Case Example
Imagine a dataset of exam scores in which any score below 40 is a fail.
scores = np.array([78, 45, 32, 90, 55, 28])
passed = scores[scores >= 40]
failed = scores[scores < 40]
print(passed)
print(failed)
Output:
[78 45 90 55]
[32 28]
This shows how Boolean indexing simplifies conditional data selection.
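The same mask can also answer summary questions, because True counts as 1 and False as 0 in arithmetic. The following sketch counts the passes, computes the pass rate, and finds the positions of the failed scores; the variable names are illustrative.

```python
import numpy as np

scores = np.array([78, 45, 32, 90, 55, 28])
passed_mask = scores >= 40

num_passed = int(passed_mask.sum())    # True counts as 1
pass_rate = passed_mask.mean()         # fraction of True entries
failed_positions = np.flatnonzero(~passed_mask)  # indices of failed scores

print(num_passed)          # 4
print(failed_positions)    # [2 5]
```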
Practice Exercise
Exercise
Create a NumPy array of 10 random integers between 1 and 100 and:
- Select values greater than 50
- Select values between 20 and 70
- Replace values below 30 with 0
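As a possible starting point, the array can be generated with NumPy's random Generator. The seed value 42 is an arbitrary assumption used only to make the result repeatable, and the name data is illustrative.

```python
import numpy as np

# Seeded generator so the same numbers appear on every run
rng = np.random.default_rng(seed=42)

# The upper bound is exclusive, so 101 allows values up to 100
data = rng.integers(1, 101, size=10)
print(data)
```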
Expected Outcome
You should be able to filter and modify arrays using conditions confidently.
What’s Next?
In the next lesson, you will learn about Basic Operations, including arithmetic operations and comparisons in NumPy.