In the world of data science and analytics, encountering missing data is more a rule than an exception. Missing values can skew analysis, lead to incorrect conclusions, and generally disrupt the flow of data processing. Addressing these gaps is crucial for maintaining the integrity of your analysis. This article aims to equip you with different ways of identifying NaN (Not a Number) values in Python.
The Short Answer: Use either NumPy’s isnan() function or the Pandas .isna() method
When dealing with missing values in Python, the approach largely depends on the data structure you're working with.
For Single Values or Arrays: Use NumPy
NumPy's isnan() function is ideal for identifying NaNs in numeric arrays or single values, offering a straightforward and efficient solution. Here it is in action!
import numpy as np
# Single value check
my_missing_value = np.nan
print(np.isnan(my_missing_value)) # Output: True
# Array check
my_missing_array = np.array([1, np.nan, 3])
nan_array = np.isnan(my_missing_array)
print(nan_array) # Output: [False True False]
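Because np.isnan() returns a Boolean array, the result can also serve as a mask. The following is a minimal sketch that reuses the array from above; the names nan_mask, nan_count, and clean_values are illustrative and not part of NumPy.

```python
import numpy as np

my_missing_array = np.array([1, np.nan, 3])

# Boolean mask: True where the element is NaN
nan_mask = np.isnan(my_missing_array)

# True counts as 1, so summing the mask counts the NaNs
nan_count = int(nan_mask.sum())
print(nan_count)  # Output: 1

# Invert the mask to keep only the non-NaN elements
clean_values = my_missing_array[~nan_mask]
print(clean_values)  # Output: [1. 3.]
```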
For DataFrames: Use Pandas
Pandas provides methods like .isna() and .isnull() to detect missing values across DataFrame or Series objects, integrating directly with data analysis workflows.
import pandas as pd
import numpy as np
my_dataframe = pd.DataFrame({
'Column1': ["I", "Love", np.nan],
'Column2': ["Python", np.nan, "The Best"]
})
print(my_dataframe.isna())
When you run this code, the output is a DataFrame of Boolean values in which True marks each NaN, as shown below:
Column1 Column2
0 False False
1 False True
2 True False
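A Boolean DataFrame like this is often summarized rather than read cell by cell. As a rough sketch on the same example data, the snippet below counts missing values per column and selects the rows that contain at least one; the names missing_per_column and rows_with_missing are illustrative.

```python
import numpy as np
import pandas as pd

my_dataframe = pd.DataFrame({
    'Column1': ["I", "Love", np.nan],
    'Column2': ["Python", np.nan, "The Best"]
})

# Number of missing values in each column
missing_per_column = my_dataframe.isna().sum()
print(missing_per_column)

# Rows that contain at least one missing value
rows_with_missing = my_dataframe[my_dataframe.isna().any(axis=1)]
print(rows_with_missing)
```

In this example each column has one missing value, and rows 1 and 2 are the ones selected.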
The Difference Between NaN and None
Understanding the distinction between NaN and None is crucial in Python. NaN is a floating-point value meaning "Not a Number," used primarily in numerical computations. None, on the other hand, is Python's object representing the absence of a value, akin to null in other languages. While NaN is used in mathematical or scientific computations, None is more general-purpose, indicating the lack of data.
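The difference shows up in how the two values behave. The short sketch below is illustrative only; the variable names value and numeric_series are made up for the example.

```python
import numpy as np
import pandas as pd

# NaN is a float, and it never compares equal to anything, itself included
print(type(np.nan))      # Output: <class 'float'>
print(np.nan == np.nan)  # Output: False

# None is a singleton object; test for it with `is`
value = None
print(value is None)     # Output: True

# In a numeric Series, pandas stores None as NaN
numeric_series = pd.Series([1.0, None])
print(numeric_series.dtype)            # Output: float64
print(numeric_series.isna().tolist())  # Output: [False, True]
```

Because NaN is not equal to itself, an equality test such as x == np.nan is always False, which is why dedicated functions are needed to find it.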
4 Ways to Check for NaN in Python
Navigating through datasets to identify missing values is a critical step in data preprocessing. Let's explore four practical methods to check for NaN values in Python, continuing with the examples we've already used.
1. Checking for NaN using np.isnan()
As we saw earlier, NumPy provides a straightforward approach to identifying NaN values in both single values and arrays, which is essential for numerical data analysis.
import numpy as np
# Checking a single value
print(np.isnan(np.nan)) # Output: True
# Checking an array
my_array = np.array([1, 5, np.nan])
print(np.isnan(my_array)) # Output: [False False True]
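One caveat is that np.isnan() only works on numeric input. The sketch below, using a made-up object-dtype array called mixed_array, shows the TypeError that an array containing strings triggers, and that pd.isna() accepts the same array.

```python
import numpy as np
import pandas as pd

# An object-dtype array, e.g. a string mixed with a missing value
mixed_array = np.array(["Python", np.nan], dtype=object)

try:
    np.isnan(mixed_array)
    raised_type_error = False
except TypeError:
    # np.isnan() is not defined for object-dtype input
    raised_type_error = True

print(raised_type_error)  # Output: True

# pd.isna() handles the same array without raising
pandas_result = pd.isna(mixed_array).tolist()
print(pandas_result)  # Output: [False, True]
```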
2. Checking for NaN using pd.isna()
Pandas simplifies detecting NaN values in data structures, from scalars to complex DataFrames, making it invaluable for data manipulation tasks.
import numpy as np
import pandas as pd
# Checking a single value
print(pd.isna(np.nan)) # Output: True
# Checking a pandas Series
my_series = pd.Series(["Python", np.nan, "The Best"])
print(my_series.isna())
# Output:
# 0    False
# 1     True
# 2    False
# dtype: bool
# Checking a pandas DataFrame
my_dataframe = pd.DataFrame({
'Column1': ["I", "Love", np.nan],
'Column2': ["Python", np.nan, "The Best"]
})
print(pd.isna(my_dataframe)) # Output a DataFrame with True for missing values
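pd.isna() recognizes more than the float NaN. As a small illustration (the values checked here are chosen for the example), it also treats None and pd.NaT as missing, while an empty string or a zero is an ordinary value.

```python
import pandas as pd

# Values pandas considers missing
print(pd.isna(None))    # Output: True
print(pd.isna(pd.NaT))  # Output: True

# Values that look "empty" but are not missing
print(pd.isna(""))      # Output: False
print(pd.isna(0))       # Output: False
```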
3. Checking for NaN in DataFrames using Pandas .isna() or .isnull() methods
Pandas DataFrames also offer the .isna() and .isnull() methods to pinpoint missing values across datasets, providing a clear overview of data completeness.
import numpy as np
import pandas as pd
# Create a dataframe with missing values
my_dataframe = pd.DataFrame({
'Column1': ["I", "Love", np.nan],
'Column2': ["Python", np.nan, "The Best"]
})
print(my_dataframe.isna())
# Output:
# Column1 Column2
# 0 False False
# 1 False True
# 2 True False
print(my_dataframe.isnull())
# Output:
# Column1 Column2
# 0 False False
# 1 False True
# 2 True False
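The two outputs match because .isnull() is an alias of .isna(). The sketch below checks this on the same example data and also shows .notna(), the element-wise opposite; the names same_result and opposite_result are illustrative.

```python
import numpy as np
import pandas as pd

my_dataframe = pd.DataFrame({
    'Column1': ["I", "Love", np.nan],
    'Column2': ["Python", np.nan, "The Best"]
})

# .isnull() is an alias of .isna(), so the two results are identical
same_result = my_dataframe.isna().equals(my_dataframe.isnull())
print(same_result)  # Output: True

# .notna() is the element-wise opposite of .isna()
opposite_result = my_dataframe.notna().equals(~my_dataframe.isna())
print(opposite_result)  # Output: True
```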
4. Checking for NaN in single values using math.isnan()
For individual number checks, the math.isnan() function offers a simple yet effective solution, especially when dealing with pure Python data types.
import math
# Assuming my_number is a float or can be converted to one
my_number = float('nan')
print(math.isnan(my_number)) # Output: True
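Note that math.isnan() raises a TypeError when given a non-numeric value such as a string or None. One possible way to guard against that, sketched here with a hypothetical helper named is_nan_value (not part of the math module), is to check the type first.

```python
import math

def is_nan_value(value):
    """Hypothetical helper: True only for a float NaN, False for anything else."""
    # The isinstance check short-circuits, so math.isnan() never sees a non-float
    return isinstance(value, float) and math.isnan(value)

print(is_nan_value(float('nan')))  # Output: True
print(is_nan_value(3.14))          # Output: False
print(is_nan_value("Python"))      # Output: False
print(is_nan_value(None))          # Output: False
```

The trade-off of this design is that anything that is not a float, including None, is reported as "not NaN", so a separate check for None is still needed if both should count as missing.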
Final Thoughts and Additional Resources
Identifying and managing NaN values is a fundamental step in cleaning and preparing your data for analysis. Whether you're working with arrays, Series, or DataFrames, understanding the tools and methods available in Python to deal with missing data is essential.

Adel is a Data Science educator, speaker, and VP of Media at DataCamp. Adel has released various courses and live training on data analysis, machine learning, and data engineering. He is passionate about spreading data skills and data literacy throughout organizations and the intersection of technology and society. He has an MSc in Data Science and Business Analytics. In his free time, you can find him hanging out with his cat Louis.