    ProgramGuru

    Detecting NaNs in NumPy


    Introduction: What Are NaNs in NumPy?

    In the world of numerical computing, you’ll occasionally stumble upon a strange beast: NaN, short for Not a Number. It is a special floating-point value that appears when an operation has no defined result or when data is missing, and ignoring it can lead to unexpected results.

    This tutorial will show you how to detect NaN values in NumPy arrays, step by step, with practical examples and checks to ensure your data stays clean and trustworthy.

    Step 1: Why NaNs Matter

    NaNs represent undefined or missing values. They might sneak into your dataset from:

    • Corrupted CSV files
    • Invalid mathematical operations (e.g., 0/0)
    • Manual data entry errors
    • APIs that return incomplete data
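
    Two of these sources are easy to reproduce. The following is a minimal sketch, assuming a made-up two-line CSV snippet held in memory; the variable names are illustrative only:

```python
import io
import numpy as np

# Invalid floating-point operation: 0.0 / 0.0 yields NaN.
# np.errstate silences the RuntimeWarning NumPy would otherwise emit.
with np.errstate(invalid="ignore"):
    zero_over_zero = np.array([0.0]) / np.array([0.0])

# Hypothetical CSV text whose second row is missing its first field.
csv_text = io.StringIO("1.0,2.0\n,4.0")

# np.genfromtxt fills missing float fields with NaN by default.
loaded = np.genfromtxt(csv_text, delimiter=",")

print(zero_over_zero)
print(loaded)
```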

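    If left unchecked, NaNs can poison your calculations. NaN propagates through arithmetic, so np.sum() or np.mean() over an array containing even one NaN returns NaN, and any comparison with <, >, or == against NaN evaluates to False, which can silently break downstream logic.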
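
    A short sketch of that behaviour, using an illustrative array that is not part of the later steps:

```python
import numpy as np

arr = np.array([1.5, 2.0, np.nan, 4.5])

total = np.sum(arr)      # NaN: a single NaN turns the whole sum into NaN
average = np.mean(arr)   # NaN, for the same reason
greater = arr > 2.0      # the comparison is False at the NaN position

print(total, average)
print(greater)
```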

    Step 2: Detecting NaNs with np.isnan()

    NumPy provides a direct way to find NaNs using np.isnan(). It returns a boolean array of the same shape, marking True wherever a NaN is present.

    
    import numpy as np
    
    arr = np.array([1.5, 2.0, np.nan, 4.5, np.nan])
    nan_mask = np.isnan(arr)
    print(nan_mask)

    Output:

    [False False  True False  True]
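
    Once you have the mask, two common follow-ups are a yes/no check and selecting the non-NaN values. This is a sketch under the same example data; has_nan and valid_values are illustrative names:

```python
import numpy as np

arr = np.array([1.5, 2.0, np.nan, 4.5, np.nan])
nan_mask = np.isnan(arr)

# True if at least one element is NaN.
has_nan = bool(nan_mask.any())

# Invert the mask with ~ to select only the values that are not NaN.
valid_values = arr[~nan_mask]

print(has_nan)
print(valid_values)
```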

    Step 3: Verifying with Conditional Count

    You can count the NaNs by passing the mask to np.sum(), because each True counts as 1 and each False as 0:

    
    nan_count = np.sum(nan_mask)
    print(f"Total NaNs: {nan_count}")

    Output:

    Total NaNs: 2

    This simple pattern (create a mask, then sum the True values) works for arrays of any shape and size, because both steps are vectorized.
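
    The same mask supports a couple of related counts. The sketch below assumes the example array from Step 2 and shows np.count_nonzero() as an equivalent way to count, plus the mean of the mask as the fraction of NaNs:

```python
import numpy as np

arr = np.array([1.5, 2.0, np.nan, 4.5, np.nan])
nan_mask = np.isnan(arr)

# Two equivalent ways to count the True entries in the mask.
count_via_sum = int(np.sum(nan_mask))
count_via_nonzero = int(np.count_nonzero(nan_mask))

# The mean of a boolean mask is the fraction of True entries.
nan_fraction = float(np.mean(nan_mask))

print(count_via_sum, count_via_nonzero, nan_fraction)
```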

    Step 4: Finding Indices of NaNs

    To find where exactly the NaNs reside, use np.where():

    
    nan_indices = np.where(nan_mask)
    print("Indices with NaN:", nan_indices)

    Output:

    Indices with NaN: (array([2, 4]),)
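
    The result is a tuple because np.where() returns one index array per dimension. For a 1D array you may prefer a plain index array; a minimal sketch, again assuming the Step 2 data:

```python
import numpy as np

arr = np.array([1.5, 2.0, np.nan, 4.5, np.nan])
nan_mask = np.isnan(arr)

# Take the first (and only) element of the tuple for a 1D input.
indices_from_where = np.where(nan_mask)[0]

# np.flatnonzero returns the indices of True entries of the flattened mask.
indices_flat = np.flatnonzero(nan_mask)

print(indices_from_where)
print(indices_flat)
```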

    Step 5: Detecting NaNs in Multidimensional Arrays

    Detection works just as well with 2D or 3D arrays. Here's an example with a 2D matrix:

    
    matrix = np.array([[1.0, 2.0], [np.nan, 3.5]])
    print(np.isnan(matrix))

    Output:

    [[False False]
     [ True False]]
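
    With more than one dimension it is often useful to count NaNs per row or column and to list their coordinates. The following sketch reuses the 2D matrix above; the variable names are illustrative:

```python
import numpy as np

matrix = np.array([[1.0, 2.0], [np.nan, 3.5]])
mask = np.isnan(matrix)

# Summing along axis 0 collapses the rows, giving one count per column.
nans_per_column = mask.sum(axis=0)

# Summing along axis 1 collapses the columns, giving one count per row.
nans_per_row = mask.sum(axis=1)

# np.argwhere returns one (row, column) pair per True entry.
positions = np.argwhere(mask)

print(nans_per_column)
print(nans_per_row)
print(positions)
```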
    

    Step 6: Common Mistakes to Avoid

    • Never use == np.nan to check for NaNs. The comparison is False for every element, because floating-point rules define NaN as unequal to every value, including itself.
    • Always prefer np.isnan() over manual loops: it is vectorized, so it is faster and easier to read.
    • If you're chaining operations, apply NaN checks early to avoid polluted results down the pipeline.
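
    The first and last points can be sketched as follows. The helper require_no_nans is hypothetical, shown only to illustrate an early check, and is not a NumPy function:

```python
import numpy as np

arr = np.array([1.5, np.nan, 3.0])

wrong_mask = arr == np.nan   # every element False, even at the NaN position
right_mask = np.isnan(arr)   # True only where the value is NaN


def require_no_nans(values):
    """Hypothetical helper: fail fast if the input contains any NaN."""
    values = np.asarray(values, dtype=float)
    if np.isnan(values).any():
        raise ValueError("input contains NaN values")
    return values


print(wrong_mask)
print(right_mask)
```

    Calling a check like this at the start of a pipeline surfaces the problem where the data enters, rather than several steps later.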

    Quick Recap

    Let’s bring it all together:

    • Use np.isnan() to create a mask of NaNs
    • Use np.sum() to count them
    • Use np.where() to locate them

    Next Steps

    Now that you can confidently detect NaNs, the next logical step is to handle or replace them. Continue to the next tutorial where we'll explore np.nan_to_num(), np.isnan() with masking, and clean-up strategies that prepare your data for analysis.


