DataFrames are a widely used data type in Python scripts. Knowing the size of a DataFrame is crucial for many purposes, including estimating how much memory the DataFrame will need and ensuring your script does not try to access an element outside the bounds of the DataFrame. Fortunately, there are several ways to find the size of a DataFrame in Python, so you can pick the method that best fits your coding style and situation.
Let’s discuss how to find the size of a DataFrame in Python.
Understanding Python DataFrames
DataFrames are a way of organizing information in Python that is very common in data science. There are a few key components that make DataFrames exceptionally useful in data projects.
Firstly, the information in DataFrames is organized like a table, which is easy to read and understand. Secondly, the information is mutable, which means elements in the DataFrame can be changed after creation. You can easily add new elements or update or remove existing elements within a DataFrame.
DataFrames are also useful for their ordering. Elements are kept in the DataFrame in the same order that they are added unless explicitly changed, such as by sorting.
Lastly, DataFrames contain an index, which by default starts from 0 and allows you to select an individual row or element based on its position within the DataFrame.
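The properties described above can be seen in a short sketch. The sample data here is hypothetical and only meant to illustrate mutability, ordering, and position-based selection; it is one possible way to demonstrate them, not the only one.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22]})

# Mutability: add a new column, then update a single value
df['City'] = ['New York', 'San Francisco', 'Los Angeles']
df.loc[1, 'Age'] = 31

# Ordering: rows keep their insertion order until explicitly sorted
sorted_df = df.sort_values('Age')

# Position-based selection with iloc (first row, first column)
first_name = df.iloc[0, 0]

print(first_name)
print(list(sorted_df['Name']))
```

Note that sort_values returns a new DataFrame here, so the original df keeps its insertion order.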
You can learn more about DataFrames in DataCamp’s data manipulation with pandas course or this Python pandas tutorial.
Python DataFrame Size: Using df.shape in Pandas for general use
pandas is a Python library that allows analysts to easily work with DataFrames. Every pandas DataFrame has a straightforward shape attribute that returns its size as a (rows, columns) tuple.
import pandas as pd
# Creating a sample DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})
# Using shape to get the size
rows, columns = df.shape
print(f"Number of rows: {rows}, Number of columns: {columns}")
Output: Number of rows: 3, Number of columns: 3
The df.shape attribute provides the number of rows and columns in a DataFrame quickly and easily. Because it is an attribute rather than a method, it is accessed without parentheses.
Key takeaway: df.shape is your go-to attribute for finding the size of a DataFrame.
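As a rough sketch of how df.shape can help with the bounds problem mentioned in the introduction, the example below indexes the shape tuple directly and uses it to guard a position-based lookup. The safe_get helper is a hypothetical name introduced for illustration, not part of pandas; df.size (the total number of cells) is a standard pandas attribute.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# shape is a tuple, so each dimension can be indexed on its own
n_rows = df.shape[0]
n_cols = df.shape[1]

# size is the total number of cells (rows * columns)
total_cells = df.size


def safe_get(frame, row, col):
    """Hypothetical helper: return the value at (row, col) by position,
    or None if the position is outside the DataFrame."""
    rows, cols = frame.shape
    if 0 <= row < rows and 0 <= col < cols:
        return frame.iloc[row, col]
    return None


print(n_rows, n_cols, total_cells)
print(safe_get(df, 2, 1))  # inside the bounds
print(safe_get(df, 5, 0))  # outside the bounds, so None
```

Returning None is just one design choice; raising a descriptive error may suit some scripts better.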
Using len() for row number only
One of the simplest and most commonly used ways to find the length of a list, the built-in len() function can also be used to find the number of rows in a DataFrame. This method is concise and efficient. However, it provides less information than the df.shape attribute, because it does not report the number of columns.
import pandas as pd
# Creating a sample DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})
# Using len to get the number of rows
num_rows = len(df)
print(f"Number of rows: {num_rows}")
Output: Number of rows: 3
When it comes to checking the size of a DataFrame, len() is used less often than df.shape. However, it can be a quick way to get the number of rows when that is all you need. Keep in mind that len() is built into Python, but creating the DataFrame itself still requires the pandas library.
Key takeaway: len() is a built-in function you can use as an alternative to df.shape when you only need the number of rows.
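Although len(df) only returns the row count, len() can also be applied to the DataFrame's columns and index attributes. The following is a minimal sketch using the same sample DataFrame; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# len(df) counts rows, and is equivalent to len(df.index)
num_rows = len(df)
num_rows_from_index = len(df.index)

# len(df.columns) counts columns
num_cols = len(df.columns)

print(f"Rows: {num_rows}, Columns: {num_cols}")
```

Together, these two values match the tuple returned by df.shape.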
df.info() for more detailed information
For situations where a more detailed measure of size is required, try pandas’ df.info() method. This approach provides you with the number of rows and columns in the DataFrame, as well as the data type of each column, the number of non-null values in each column, and an estimate of memory usage.
import pandas as pd
# Creating a sample DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})
# Using info to get information about the DataFrame
df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Name    3 non-null      object
 1   Age     3 non-null      int64
 2   City    3 non-null      object
dtypes: int64(1), object(2)
memory usage: 204.0+ bytes
With this method, the number of rows is listed on the RangeIndex line. In the example above, it shows that there are three rows (called entries here) and that the index starts at 0 and ends at 2. The number of columns is listed on the next line. Following these, each column’s name is listed along with the number of non-null entries in that column and its data type.
Key takeaway: df.info() can provide more detailed information about a DataFrame.
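Since the introduction mentions memory as one reason to care about size, here is a hedged sketch of measuring it directly with the DataFrame.memory_usage() method. The "+" after the memory figure in the df.info() output indicates a lower bound, because the contents of object columns are not inspected by default; passing deep=True asks pandas to inspect them. Exact byte counts vary by pandas version and platform, so none are shown here.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# Bytes used per column (plus the index), without inspecting object contents
shallow_bytes = df.memory_usage().sum()

# deep=True also measures the Python objects (such as strings) in each column
per_column = df.memory_usage(deep=True)
total_bytes = per_column.sum()

print(per_column)
print(f"Shallow estimate: {shallow_bytes} bytes")
print(f"Deep estimate: {total_bytes} bytes")
```

The same deep measurement can be shown in the summary itself by calling df.info(memory_usage="deep").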
Python DataFrame Size Best Practices and Tips
When finding the size of a DataFrame in Python, there are a few best practices to keep in mind.
- Choose the method that works best for your DataFrame. Remember, df.shape quickly gives the number of rows and columns, while df.info() gives extra information you may or may not need for your purpose.
- Make sure you have installed and imported any libraries you need. The pandas library is a staple when working with DataFrames.
- Document your work well. Make sure to use descriptive comments so future coders can decipher what you did and why.
Conclusion
There are multiple ways to find the size of a DataFrame in Python, depending on your preferences and code requirements. Whether you need simplicity or detailed insights, there's an approach that suits your specific needs.
Always consider the nature of your data and the insights you aim to derive when determining which approach to use. To learn more ways to use Python DataFrames, check out DataCamp’s Introduction to Python course or the Intermediate Python for Finance course. Or try out DataCamp’s data scientist in Python career track.
You can also check out polars, a newer competitor to pandas for high-performance DataFrame analysis. You can read more about the difference between pandas and polars or discover an introduction to using polars.

I am a PhD with 13 years of experience working with data in a biological research environment. I create software in several programming languages including Python, MATLAB, and R. I am passionate about sharing my love of learning with the world.