
Finding the Size of a DataFrame in Python

There are several ways to find the size of a DataFrame in Python, each suited to different coding needs. This tutorial is a quick primer on the most common ones, so you can pick the approach that fits your use case.
Feb 2024  · 5 min read

DataFrames are a widely used data type in Python scripts. Understanding the size of a DataFrame in Python is crucial for many purposes, including estimating how much memory the DataFrame will need and ensuring your script does not try to access an element outside the bounds of the DataFrame. Fortunately, there are several ways to find the size of a DataFrame in Python, so you can choose the one that suits your coding style and situation.

Let’s discuss how to find the size of a DataFrame in Python.

Understanding Python DataFrames

DataFrames are a way of organizing information in Python that is very common in data science. There are a few key components that make DataFrames exceptionally useful in data projects.

Firstly, the information in DataFrames is organized like a table, which is easy to read and understand. Secondly, the information is mutable, which means elements in the DataFrame can be changed after creation. You can easily add new elements or update or remove existing elements within a DataFrame.

DataFrames are also useful for their ordering. Elements are kept in the DataFrame in the same order that they are added unless explicitly changed, such as by sorting.

Lastly, DataFrames contain an index, which starts from 0 by default and allows you to select an individual element based on its position within the DataFrame.
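The properties described above can be seen in a short sketch. The sample data and variable names here are illustrative assumptions, not part of any particular dataset.

```python
import pandas as pd

# Illustrative sample data (assumed for this sketch)
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22]})

# Mutable: update an existing value and add a new column
df.loc[1, 'Age'] = 31
df['City'] = ['New York', 'San Francisco', 'Los Angeles']

# Ordered: rows keep their insertion order until explicitly sorted
names_in_order = df['Name'].tolist()
sorted_by_age = df.sort_values('Age')

# Indexed: the default index starts at 0, and iloc selects by position
first_name = df.iloc[0, 0]

print(names_in_order)
print(sorted_by_age['Name'].tolist())
print(first_name)
```

Note that sort_values returns a new DataFrame here, so the original row order in df is left unchanged.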

You can learn more about DataFrames in DataCamp’s data manipulation with pandas course or this Python pandas tutorial.

Python DataFrame Size: Using df.shape in Pandas for general use

Python pandas is a library that allows analysts to easily work with DataFrames. Every pandas DataFrame has a straightforward shape attribute that can be used to find its size.

import pandas as pd

# Creating a sample DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# Using shape to get the size
rows, columns = df.shape
print(f"Number of rows: {rows}, Number of columns: {columns}")
Output: Number of rows: 3, Number of columns: 3

The df.shape attribute returns a tuple containing the number of rows and columns in a DataFrame. Because it is an attribute rather than a method, it is accessed without parentheses.

Key takeaway: df.shape is your go-to attribute for finding the size of a DataFrame.
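As a rough sketch of how df.shape can guard against out-of-bounds access, the example below indexes the tuple directly and wraps a positional lookup in a bounds check. The helper name safe_cell is hypothetical and introduced only for illustration; df.size is a standard pandas attribute that returns the total number of cells.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# shape is a (rows, columns) tuple, so it can be indexed directly
n_rows = df.shape[0]
n_cols = df.shape[1]

# size is the total number of cells (rows * columns)
n_cells = df.size


def safe_cell(frame, row, col, default=None):
    """Hypothetical helper: return frame.iloc[row, col] if in bounds, else default."""
    rows, cols = frame.shape
    if 0 <= row < rows and 0 <= col < cols:
        return frame.iloc[row, col]
    return default


print(n_rows, n_cols, n_cells)
print(safe_cell(df, 2, 0))
print(safe_cell(df, 5, 0, default='out of bounds'))
```

This sketch deliberately treats negative positions as out of bounds, even though iloc itself accepts them; adjust the check if you rely on negative indexing.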

Using len() for row number only

One of the simplest and most commonly used ways to find the length of a list, the built-in len() function can also be used to find the number of rows in a DataFrame. This method is concise and efficient. However, it provides less information than the df.shape attribute, since it reports only the row count.

import pandas as pd

# Creating a sample DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# Using len to get the number of rows
num_rows = len(df)
print(f"Number of rows: {num_rows}")
Output: Number of rows: 3

When it comes to checking the size of a DataFrame, len() is used less often than df.shape. However, it is a quick way to get the number of rows using only a built-in Python function rather than a pandas-specific attribute.

Key takeaway: len() is a built-in function that returns the number of rows in a DataFrame.
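Although len() on a DataFrame reports only rows, it can also be applied to the DataFrame's index and columns attributes. The following is a minimal sketch, using a two-column sample (an assumption for this example) so the row and column counts differ.

```python
import pandas as pd

# Two columns and three rows, so the counts are easy to tell apart
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22]})

num_rows = len(df)                   # rows, same as df.shape[0]
num_rows_from_index = len(df.index)  # rows, counted from the index
num_columns = len(df.columns)        # columns, same as df.shape[1]

print(f"Rows: {num_rows}, Columns: {num_columns}")
```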

df.info() for more detailed information

For situations where a more detailed measure of size is required, try pandas’ df.info() method. This approach provides you with the number of rows and columns in the DataFrame, as well as the data type of each column and the number of non-null values it contains.

import pandas as pd

# Creating a sample DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# Using info to get information about the DataFrame
df.info()
Output: 

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    3 non-null      object
 1   Age     3 non-null      int64 
 2   City    3 non-null      object

dtypes: int64(1), object(2)
memory usage: 204.0+ bytes

With this method, the number of rows is listed under RangeIndex. In the example above, it shows that there are three rows (called entries here) and that the index starts at 0 and ends at 2. The number of columns is listed underneath. Following these, each column’s name is listed along with the number of non-null entries in each column and its data type. The final two lines summarize the data types and the approximate memory usage of the DataFrame.

Key takeaway: df.info() can provide more detailed information about a DataFrame.
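The "+" after the memory figure in the output above indicates a lower bound: by default, pandas does not inspect the contents of object columns such as strings. If memory footprint is the measure of size you care about, a sketch along these lines may help. It uses the memory_usage method with deep=True; the exact byte counts depend on your pandas version and platform, so none are shown here.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 22],
                   'City': ['New York', 'San Francisco', 'Los Angeles']})

# Bytes used per column (plus the index); deep=True also measures string contents
per_column_bytes = df.memory_usage(deep=True)

# Total footprint of the DataFrame in bytes
total_bytes = int(per_column_bytes.sum())

# Shallow estimate for comparison (what df.info() reports by default)
shallow_bytes = int(df.memory_usage().sum())

print(per_column_bytes)
print(f"Total: {total_bytes} bytes (shallow estimate: {shallow_bytes} bytes)")
```

Calling df.info(memory_usage='deep') reports the same deep total within the usual info summary.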

Python DataFrame Size Best Practices and Tips

When finding the size of a DataFrame in Python, there are a few best practices to keep in mind.

  1. Choose the method that works best for your DataFrame. Remember, df.shape quickly gives the number of rows and columns, len() gives the number of rows only, and df.info() gives extra information you may or may not need for your purpose.
  2. Make sure you have installed and imported any libraries you need. The pandas library is a staple when working with DataFrames.
  3. Document your work well. Make sure to use descriptive comments so future coders can decipher what you did and why.

Conclusion

There are multiple ways to find the size of a DataFrame in Python, depending on your preferences and code requirements. Whether you need simplicity or detailed insights, there's an approach that suits your specific needs.

Always consider the nature of your data and the insights you aim to derive when determining which approach to use. To learn more ways to use Python DataFrames, check out DataCamp’s Introduction to Python course or the Intermediate Python for Finance course. Or try out DataCamp’s data scientist in Python career track.

You can also check out polars, a newer competitor to pandas for high-performance DataFrame analysis. You can read more about the difference between pandas and polars or discover an introduction to using polars.


Author
Amberle McKee

I am a PhD with 13 years of experience working with data in a biological research environment. I create software in several programming languages including Python, MATLAB, and R. I am passionate about sharing my love of learning with the world.
