Iloc vs Loc in Pandas: A Guide With Examples

.loc selects data using row and column names (labels), while .iloc uses numerical indices (positions). Learn how to use both with examples.

Nov 21, 2024 · 8 min read

One of those annoying things that we’re all trying to figure out when we learn Pandas is the distinction between .loc and .iloc.

Let’s put an end to this confusion and clarify the difference between these two methods. I’ll give plenty of examples, and I hope the distinction will be much clearer by the end of this blog.

What Are .loc and .iloc in Pandas?

Both .loc and .iloc are essential attributes of Pandas DataFrames, and both are used for selecting specific subsets of data. Their purpose is to access and enable manipulating a specific part of the DataFrame instead of the whole DataFrame.

Feature	.loc	.iloc
Syntax	df.loc[row_indexer, column_indexer]	df.iloc[row_indexer, column_indexer]
Indexing Method	Label-based indexing	Position-based indexing
Used for Reference	Row and column labels (names)	Numerical indices of rows and columns (starting from 0)

As we can see from the table, the syntax looks very similar. The difference lies in how we use the row_indexer and column_indexer arguments. This is because the two methods offer different approaches to indexing the data: while .loc indexes based on label names, .iloc takes the numerical position index of rows and columns as arguments.

Let’s examine each of the two methods in detail, starting with .loc.

Using .loc: Selection by Labels

To illustrate the concepts, let's consider a hypothetical customer database represented by this DataFrame called df, with the Customer ID representing the row index:

Customer ID	Name	Country	Region	Age
C123	John Doe	United States	North America	67
C234	Petra Müller	Germany	Europe	51
C345	Ali Khan	Pakistan	Asia	19
C456	Maria Gonzalez	Mexico	North America	26
C567	David Lee	China	Asia	40
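To follow along, the table above can be rebuilt in code. This is a minimal sketch: the article does not show how df was created, so the constructor call below is an assumption about one reasonable way to do it, with Customer ID set as the row index.

```python
import pandas as pd

# Rebuild the example customer table; 'Customer ID' becomes the row index.
df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

print(df)
print(df.index.tolist())    # row labels that .loc will look up
print(df.columns.tolist())  # column labels that .loc will look up
```

The row labels are strings here, which makes it easy to tell a label such as 'C345' apart from a position such as 2.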

There are four primary ways to select rows with .loc. These include:

Selecting a single row
Selecting multiple rows
Selecting a slice of rows
Conditional row selection

Selecting a single row using .loc

To select a single row, we use the label of the row we want to retrieve as row_indexer. Accordingly, the syntax looks like this: df.loc['row_label']. Let’s use this to display all the information on our customer Ali Khan:

df.loc['C345']

C345
Name	Ali Khan
Country	Pakistan
Region	Asia
Age	19

Selecting multiple rows using .loc

If we want to select multiple rows that do not necessarily follow each other in order, we have to pass a list of their row labels as the row_indexer argument. This means we need to use not one but two pairs of square brackets: one for the regular .loc syntax and one for the label list.

The line df.loc[['row_label_1', 'row_label_2']] will return the two rows of the df DataFrame specified in the list. Let’s say we wanted to see the information not only on Ali Khan but also on David Lee:

df.loc[['C345', 'C567']]

Customer ID	Name	Country	Region	Age
C345	Ali Khan	Pakistan	Asia	19
C567	David Lee	China	Asia	40
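One detail worth seeing in code is the type of the result: a single label gives back a Series, while a list of labels gives back a DataFrame, even if the list holds only one label. The sketch below assumes the example DataFrame is rebuilt as shown; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# A single label returns the row as a Series (column names become its index).
ali = df.loc['C345']
print(type(ali).__name__)  # Series

# A list of labels returns a DataFrame, in the order given in the list.
ali_and_david = df.loc[['C345', 'C567']]
print(type(ali_and_david).__name__)  # DataFrame

# A one-element list still returns a DataFrame with a single row.
ali_frame = df.loc[['C345']]
print(ali_frame.shape)  # (1, 4)
```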

Selecting a slice of rows using .loc

We can select a range of rows by passing the first and last row labels with a colon in between: df.loc['row_label_start':'row_label_end']. We could display the first four rows of our DataFrame like this:

df.loc['C123' : 'C456']

Customer ID	Name	Country	Region	Age
C123	John Doe	United States	North America	67
C234	Petra Müller	Germany	Europe	51
C345	Ali Khan	Pakistan	Asia	19
C456	Maria Gonzalez	Mexico	North America	26

There are two things to keep in mind here:

The output includes the row specified in row_label_end. This is different in .iloc, which we’ll cover later.
We only use one pair of square brackets, even though we want to retrieve multiple rows. We do not use a list to specify the various rows, so wrapping the slice in a second pair of square brackets would raise a SyntaxError.
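To see the inclusive end label in action, here is a minimal sketch. It assumes the example DataFrame is rebuilt as shown; first_four and from_ali are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# Label slices include BOTH endpoints.
first_four = df.loc['C123':'C456']
print(first_four.index.tolist())  # ['C123', 'C234', 'C345', 'C456']

# Leaving out one side of the colon runs the slice to that end of the index.
from_ali = df.loc['C345':]
print(from_ali.index.tolist())  # ['C345', 'C456', 'C567']
```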

Conditional selection of rows using .loc

We can also return rows based on a conditional expression. We can filter all rows by whether or not they fulfill a certain condition and only display the ones that do.

The corresponding syntax is df.loc[conditional_expression], with the conditional_expression being a statement about the allowed values in a specific column.

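For columns with non-numeric data (like Name or Country), the condition typically uses the equality (==) or inequality (!=) operator, because an ordering of such values is rarely meaningful. Comparison operators such as < do work on strings, but they compare alphabetically. We could, for instance, return all rows of customers who are not from Asia: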

df.loc[df['Region'] != 'Asia']

Customer ID	Name	Country	Region	Age
C123	John Doe	United States	North America	67
C234	Petra Müller	Germany	Europe	51
C456	Maria Gonzalez	Mexico	North America	26
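It can help to look at what the conditional expression actually produces: a boolean Series with one True/False value per row, which .loc then uses as a filter. The sketch below assumes the example DataFrame is rebuilt as shown. The second filter, which combines two conditions with &, is an illustration that goes slightly beyond the example above.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# The condition evaluates to a boolean Series aligned with the row index.
mask = df['Region'] != 'Asia'
print(mask.tolist())  # [True, True, False, True, False]

# .loc keeps only the rows where the mask is True.
not_asia = df.loc[mask]
print(not_asia.index.tolist())  # ['C123', 'C234', 'C456']

# Conditions can be combined with & (and) or | (or); each needs its own parentheses.
not_asia_under_60 = df.loc[(df['Region'] != 'Asia') & (df['Age'] < 60)]
print(not_asia_under_60.index.tolist())  # ['C234', 'C456']
```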

Selecting a single column using .loc

To select columns, we need to specify the column_indexer argument, which comes after the row_indexer argument. If we want to only specify the column_indexer, we need to somehow mark that we want to return all rows and only filter on the columns. Let’s see how we can do it!

Selecting a single column can be done by specifying the column_indexer with the label of the respective column. To retrieve all rows, we need to specify the row_indexer with a simple colon. We arrive at a syntax that looks like this: df.loc[:, 'column_name'].

Let’s display the Name of each customer:

df.loc[:, 'Name']

Customer ID	Name
C123	John Doe
C234	Petra Müller
C345	Ali Khan
C456	Maria Gonzalez
C567	David Lee

Selecting multiple columns using .loc

Similar to selecting multiple rows, we need to pass a list of column labels if we want to return multiple columns of a DataFrame that do not necessarily follow each other in order: df.loc[:, ['col_label_1', 'col_label_2']].

Assuming we wanted to add all customers’ Age to our last output, it would work like this:

df.loc[:, ['Name', 'Age']]

Customer ID	Name	Age
C123	John Doe	67
C234	Petra Müller	51
C345	Ali Khan	19
C456	Maria Gonzalez	26
C567	David Lee	40

Selecting a slice of columns using .loc

Using a colon between the labels of two columns will select every column located between them, in the order the columns appear in the DataFrame. The slice is inclusive of the end column, meaning the column named col_end will also be selected. The standard syntax is the following: df.loc[:, 'col_start':'col_end'].

If we were interested in the Name, Country, and Region of our customers, our code line could be:

df.loc[:, 'Name':'Region']

Customer ID	Name	Country	Region
C123	John Doe	United States	North America
C234	Petra Müller	Germany	Europe
C345	Ali Khan	Pakistan	Asia
C456	Maria Gonzalez	Mexico	North America
C567	David Lee	China	Asia
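The three column selections above can be run side by side. This is a minimal sketch that assumes the example DataFrame is rebuilt as shown; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# Single column label: the result is a Series.
names = df.loc[:, 'Name']
print(type(names).__name__)  # Series

# List of column labels: the result is a DataFrame with those columns only.
name_and_age = df.loc[:, ['Name', 'Age']]
print(name_and_age.columns.tolist())  # ['Name', 'Age']

# Label slice: the end column 'Region' is included.
name_to_region = df.loc[:, 'Name':'Region']
print(name_to_region.columns.tolist())  # ['Name', 'Country', 'Region']
```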

Combined row and column selection using .loc

It’s also possible to specify both the row_indexer and the column_indexer. This could be used to retrieve a single piece of information, meaning one cell from the DataFrame. To do this, we specify one row and one column using the syntax df.loc['row_label', 'column_name'].

The more useful case is to return a sub-DataFrame that focuses on exactly the set of rows and columns we are interested in. It is possible to specify both indexers as lists using the square brackets, or as a slice using the colon, and even to combine it with a conditional expression for the row selection.

Here is one example of returning the Name, Country, and Region of each customer with an Age of over 30:

df.loc[df['Age'] > 30, 'Name':'Region']

Customer ID	Name	Country	Region
C123	John Doe	United States	North America
C234	Petra Müller	Germany	Europe
C567	David Lee	China	Asia
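Here is a short sketch of the combinations described in this section: a single cell, a condition plus a column slice, and two lists. It assumes the example DataFrame is rebuilt as shown; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# One row label and one column label: a single cell value.
ali_age = df.loc['C345', 'Age']
print(ali_age)  # 19

# Condition for the rows, label slice for the columns.
over_30 = df.loc[df['Age'] > 30, 'Name':'Region']
print(over_30.index.tolist())  # ['C123', 'C234', 'C567']

# Lists for both indexers.
picked = df.loc[['C123', 'C567'], ['Name', 'Age']]
print(picked.shape)  # (2, 2)
```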

Using .iloc: Selection by Integer Position

.iloc selects by position instead of label. This is the standard syntax of using .iloc: df.iloc[row_indexer, column_indexer]. There are two special things to look out for:

Counting starting at 0: The first row and column have the index 0, the second one index 1, etc.
Exclusivity of range end value: When using a slice, the row or column specified behind the colon is not included in the selection.

Selecting a single row using .iloc

A single row can be selected by using the integer representing the row index number as the row_indexer. We don’t need quotation marks since we are entering an integer number and not a label string as we did with .loc. To return the first row of a DataFrame called df, enter df.iloc[0].

In our example DataFrame, this very code line returns the information of John Doe:

df.iloc[0]

C123
Name	John Doe
Country	United States
Region	North America
Age	67

Selecting multiple rows using .iloc

Selecting multiple rows works in .iloc as it does in .loc: we enter the row index integers in a list with square brackets. The syntax looks like this: df.iloc[[0, 3, 4]].

The respective output in our customer table can be seen below:

df.iloc[[0, 3, 4]]

Customer ID	Name	Country	Region	Age
C123	John Doe	United States	North America	67
C456	Maria Gonzalez	Mexico	North America	26
C567	David Lee	China	Asia	40

Selecting a slice of rows using .iloc

For selecting a slice of rows, we use a colon between two specified row index integers. Now, we have to pay attention to the exclusivity mentioned earlier.

We can take the line df.iloc[1:4] as an example to illustrate this concept. Index number 1 means the second row, so our slice starts there. The index integer 4 represents the fifth row – but since .iloc is not inclusive for slice selection, our output will include all rows up until the last before this one. Therefore, it will return the second, third, and fourth row.

Let’s prove that the line works as it should:

df.iloc[1:4]

Customer ID	Name	Country	Region	Age
C234	Petra Müller	Germany	Europe	51
C345	Ali Khan	Pakistan	Asia	19
C456	Maria Gonzalez	Mexico	North America	26
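The row selections with .iloc can be checked in one go, including the difference between the exclusive position slice and the inclusive label slice. The sketch assumes the example DataFrame is rebuilt as shown; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# Position 0 is the first row; the result is a Series.
first_row = df.iloc[0]
print(first_row['Name'])  # John Doe

# A list of positions returns those rows as a DataFrame.
some_rows = df.iloc[[0, 3, 4]]
print(some_rows.index.tolist())  # ['C123', 'C456', 'C567']

# Position slice 1:4 covers positions 1, 2 and 3; position 4 is excluded.
middle = df.iloc[1:4]
print(middle.index.tolist())  # ['C234', 'C345', 'C456']

# The matching label slice has to name the last row it wants to keep.
same_rows_by_label = df.loc['C234':'C456']
print(middle.equals(same_rows_by_label))  # True
```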

Selecting a single column using .iloc

The logic of selecting columns using .iloc follows what we have learned so far. Let’s see how it works for single columns, multiple columns and column slices.

Just like with .loc, it is important to specify the row_indexer before we can proceed to the column_indexer. To retrieve the values of the third column of df for every row, we enter df.iloc[:, 2].

Because Region is the third column in our DataFrame, it will be retrieved as a consequence of that code line:

df.iloc[:, 2]

Customer ID	Region
C123	North America
C234	Europe
C345	Asia
C456	North America
C567	Asia

Selecting multiple columns using .iloc

To select multiple columns that are not necessarily subsequent, we can again enter a list containing integers as the column_indexer. The line df.iloc[:, [0, 3]] returns both the first and fourth columns.

In our case, the information displayed is the Name as well as the Age of each customer:

df.iloc[:, [0, 3]]

Customer ID	Name	Age
C123	John Doe	67
C234	Petra Müller	51
C345	Ali Khan	19
C456	Maria Gonzalez	26
C567	David Lee	40

Selecting a slice of columns using .iloc

For slice selection using .iloc, the logic of the column_indexer follows that of the row_indexer. The column represented by the integer after the colon is not included in the output. To retrieve the second and third columns, the code line should look like this: df.iloc[:, 1:3].

This line below returns all the geographical information we have about our customers:

df.iloc[:, 1:3]

Customer ID	Country	Region
C123	United States	North America
C234	Germany	Europe
C345	Pakistan	Asia
C456	Mexico	North America
C567	China	Asia
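The same three column selections, this time by position, can be sketched as follows. Note that the Customer ID index is not counted as a column, so position 0 is Name. The sketch assumes the example DataFrame is rebuilt as shown; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# Column position 2 is the third column.
third_column = df.iloc[:, 2]
print(third_column.name)  # Region

# A list of positions picks non-adjacent columns.
first_and_fourth = df.iloc[:, [0, 3]]
print(first_and_fourth.columns.tolist())  # ['Name', 'Age']

# Position slice 1:3 covers positions 1 and 2; position 3 is excluded.
geo = df.iloc[:, 1:3]
print(geo.columns.tolist())  # ['Country', 'Region']
```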

Combined row and column selection using .iloc

We can put together what we learned about .iloc to combine row and column selection. Again, it is possible to either return a single cell or a sub-DataFrame. To return the single cell at the intersection of row 3 and column 4, we enter df.iloc[2, 3].

Just like with .loc, we can specify both indexers as lists using the square brackets, or as a slice using the colon. Selecting rows with conditional expressions is technically possible with .iloc as well, but .iloc expects a plain boolean array or list rather than a label-indexed boolean Series, and the approach is not recommended. Using the label names and .loc is usually more intuitive and less prone to errors.

This last example displays Country, Region and Age for the first, second and fifth row in our DataFrame:

df.iloc[[0,1,4], 1:4]

Customer ID	Country	Region	Age
C123	United States	North America	67
C234	Germany	Europe	51
C567	China	Asia	40
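The combined selections can be sketched as below, assuming the example DataFrame is rebuilt as shown. The last part illustrates the conditional case: as far as I am aware, .iloc rejects a boolean Series, so the sketch converts the mask to a NumPy array with .to_numpy() first. Treat that part as an illustration of why .loc is the more natural choice there.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# Third row, fourth column: Ali Khan's Age.
cell = df.iloc[2, 3]
print(cell)  # 19

# List of row positions combined with a column position slice.
block = df.iloc[[0, 1, 4], 1:4]
print(block.columns.tolist())  # ['Country', 'Region', 'Age']

# Conditional selection with .iloc needs a plain boolean array, not a Series.
mask = (df['Age'] > 30).to_numpy()
over_30_by_position = df.iloc[mask, 0:3]

# The .loc version expresses the same selection with labels only.
over_30_by_label = df.loc[df['Age'] > 30, 'Name':'Region']
print(over_30_by_position.equals(over_30_by_label))  # True
```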

.iloc vs .loc: When to Use Which

Generally, there is one simple rule of thumb where the method choice depends on your knowledge of the DataFrame:

Use .loc when you know the labels (names) of the rows/columns.
Use .iloc when you know the integer positions of the rows/columns.

Some scenarios favor either .loc or .iloc by their nature. For example, iterating over rows or columns is easier and more intuitive using integers than labels. As we already mentioned, filtering rows based on conditions on column values is less prone to errors using the column label names.

Scenarios Favoring .loc	Scenarios Favoring .iloc
Your DataFrame has meaningful index/column names.	You're iterating over rows/columns by their position.
You need to filter based on conditions on column values.	The index/column names are not relevant to your task.
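As a rough illustration of the two scenarios, the sketch below loops over rows by position with .iloc and then shows one way to translate between labels and positions. It assumes the example DataFrame from earlier; get_loc is a method of the pandas Index that this article does not cover, so take it as an optional extra rather than part of the main comparison.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Country': ['United States', 'Germany', 'Pakistan', 'Mexico', 'China'],
        'Region': ['North America', 'Europe', 'Asia', 'North America', 'Asia'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# Iterating by position: the labels never need to be known.
names_in_order = []
for row_pos in range(len(df)):
    names_in_order.append(df.iloc[row_pos, 0])  # column position 0 is Name
print(names_in_order)

# Translating a label into its position, and back again.
row_pos = df.index.get_loc('C345')    # position of the row labelled 'C345'
col_pos = df.columns.get_loc('Age')   # position of the column labelled 'Age'
print(row_pos, col_pos)               # 2 3
print(df.index[row_pos])              # C345

# Both routes reach the same cell.
print(df.iloc[row_pos, col_pos] == df.loc['C345', 'Age'])  # True
```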

KeyError, NameError, and IndexError With .loc and .iloc

Let’s take a look at possible problems. A common pitfall when using .loc is encountering a KeyError. This error occurs when we attempt to access a row or column label that doesn't exist within our DataFrame. To avoid it, make sure the labels you use exactly match the existing labels in your DataFrame, and double-check for typos.

Additionally, it is important to always use quotation marks for string labels specified using .loc. Without them, Python treats the label as a variable name, which raises a NameError if no variable of that name exists.

An IndexError can occur when using .iloc if we specify an integer position that is outside the valid range of our DataFrame. This happens when the position we are trying to access doesn't exist because it lies beyond the number of rows or columns. Negative positions are valid as long as they stay within range: they count from the end, so df.iloc[-1] returns the last row. Out-of-range bounds in a slice do not raise an error either; the slice is simply cut off at the end of the DataFrame. To prevent an IndexError, check the dimensions of your DataFrame (for example with df.shape) and use positions within the valid range.
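The two most common errors can be reproduced and guarded against as sketched below. The example uses a trimmed-down version of the customer table, and the label 'C999' and position 10 are made-up values chosen because they do not exist in it.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['John Doe', 'Petra Müller', 'Ali Khan', 'Maria Gonzalez', 'David Lee'],
        'Age': [67, 51, 19, 26, 40],
    },
    index=pd.Index(['C123', 'C234', 'C345', 'C456', 'C567'], name='Customer ID'),
)

# .loc with a label that is not in the index raises a KeyError.
key_error_raised = False
try:
    df.loc['C999']
except KeyError:
    key_error_raised = True
print(key_error_raised)  # True

# .iloc with a position beyond the last row raises an IndexError.
index_error_raised = False
try:
    df.iloc[10]
except IndexError:
    index_error_raised = True
print(index_error_raised)  # True

# Checking first avoids both errors.
label_exists = 'C999' in df.index   # membership test against the row labels
position_exists = 10 < len(df)      # compare against the number of rows
print(label_exists, position_exists)  # False False
```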

Conclusion

I hope this blog has been helpful and the distinction between .loc and .iloc is clear by now.

Author

Tom Farnschläder
