Finding interesting bits of data in a DataFrame is often easier if you change the rows' order. You can sort the rows by passing a column name to .sort_values().
In cases where rows have the same value (this is common if you sort on a categorical variable), you may wish to break the ties by sorting on another column. You can sort on multiple columns in this way by passing a list of column names.

Using Pandas to Sort by Columns
You can change the rows' order by sorting them so that the most interesting data is at the top of the DataFrame.
Sort rows by a single variable
For example, when we apply sort_values() on the weight_kg column of the dogs DataFrame, we get the lightest dog at the top, Stella the Chihuahua, and the heaviest dog at the bottom, Bernie the Saint Bernard.
dogs.sort_values("weight_kg")
      name        breed  color  height_cm  weight_kg  date_of_birth
5   Stella    Chihuahua    Tan         18          2     2015-04-20
3   Cooper    Schnauzer   Gray         49         17     2011-12-11
0    Bella     Labrador  Brown         56         24     2013-07-01
1  Charlie       Poodle  Black         43         24     2016-09-16
2     Lucy    Chow Chow  Brown         46         24     2014-08-25
4      Max     Labrador  Black         59         29     2017-01-20
6   Bernie  St. Bernard  White         77         74     2018-02-27
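To run this example yourself, you need a dogs DataFrame. The following is a minimal sketch that rebuilds it from the rows printed in this article; the variable name by_weight is our own, and the original course data may be loaded differently.

```python
import pandas as pd

# Rebuild the dogs DataFrame from the rows shown in this article.
# Row labels 0-6 match the index column in the printed output.
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
        "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
                  "Labrador", "Chihuahua", "St. Bernard"],
        "color": ["Brown", "Black", "Brown", "Gray", "Black", "Tan", "White"],
        "height_cm": [56, 43, 46, 49, 59, 18, 77],
        "weight_kg": [24, 24, 24, 17, 29, 2, 74],
        "date_of_birth": pd.to_datetime(
            ["2013-07-01", "2016-09-16", "2014-08-25", "2011-12-11",
             "2017-01-20", "2015-04-20", "2018-02-27"]
        ),
    }
)

# sort_values() returns a new, sorted DataFrame; dogs itself is unchanged.
by_weight = dogs.sort_values("weight_kg")
print(by_weight)
```

Note that three dogs share a weight of 24 kg. With a single sort column, their relative order is not something to rely on unless you add a second column to break the tie.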
Setting the ascending argument to False will sort the data the other way round, from heaviest to lightest dog.
dogs.sort_values("weight_kg", ascending=False)
      name        breed  color  height_cm  weight_kg  date_of_birth
6   Bernie  St. Bernard  White         77         74     2018-02-27
4      Max     Labrador  Black         59         29     2017-01-20
0    Bella     Labrador  Brown         56         24     2013-07-01
1  Charlie       Poodle  Black         43         24     2016-09-16
2     Lucy    Chow Chow  Brown         46         24     2014-08-25
3   Cooper    Schnauzer   Gray         49         17     2011-12-11
5   Stella    Chihuahua    Tan         18          2     2015-04-20
Sort rows by multiple variables
We can sort by multiple variables by passing a list of column names to sort_values(). Here, we sort first by weight, then by height. Now, Charlie, Lucy, and Bella are ordered from shortest to tallest, even though they all weigh the same.
dogs.sort_values(["weight_kg", "height_cm"])
      name        breed  color  height_cm  weight_kg  date_of_birth
5   Stella    Chihuahua    Tan         18          2     2015-04-20
3   Cooper    Schnauzer   Gray         49         17     2011-12-11
1  Charlie       Poodle  Black         43         24     2016-09-16
2     Lucy    Chow Chow  Brown         46         24     2014-08-25
0    Bella     Labrador  Brown         56         24     2013-07-01
4      Max     Labrador  Black         59         29     2017-01-20
6   Bernie  St. Bernard  White         77         74     2018-02-27
To change the direction in which values are sorted, pass a list to the ascending argument, with one entry per column, to specify the direction for each variable. Now, Bella, Lucy, and Charlie are ordered from tallest to shortest.
dogs.sort_values(["weight_kg", "height_cm"], ascending=[True, False])
      name        breed  color  height_cm  weight_kg  date_of_birth
5   Stella    Chihuahua    Tan         18          2     2015-04-20
3   Cooper    Schnauzer   Gray         49         17     2011-12-11
0    Bella     Labrador  Brown         56         24     2013-07-01
2     Lucy    Chow Chow  Brown         46         24     2014-08-25
1  Charlie       Poodle  Black         43         24     2016-09-16
4      Max     Labrador  Black         59         29     2017-01-20
6   Bernie  St. Bernard  White         77         74     2018-02-27
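The two multi-column sorts above can be checked side by side. This is a small sketch that uses only the name, height_cm, and weight_kg columns from the tables in this article; the variable names short_first and tall_first are our own.

```python
import pandas as pd

# A reduced copy of the dogs data shown above (three columns only).
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
        "height_cm": [56, 43, 46, 49, 59, 18, 77],
        "weight_kg": [24, 24, 24, 17, 29, 2, 74],
    }
)

# Weight ascending; ties on weight are broken by height ascending.
short_first = dogs.sort_values(["weight_kg", "height_cm"])

# Weight ascending; ties on weight are broken by height descending.
tall_first = dogs.sort_values(["weight_kg", "height_cm"], ascending=[True, False])

print(short_first["name"].tolist())
print(tall_first["name"].tolist())
```

Only the three 24 kg dogs change places between the two results, because the second column matters only where the first column has ties.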
Using Pandas to Sort by Rows
Sometimes you may want to reorder rows based on their row labels (i.e., the DataFrame’s index) rather than by specific columns. If that is the case, you can use the sort_index() method instead of sort_values(). Remember that, by default, sort_index() will sort your rows in ascending order by their index:
# Sort rows by their index (ascending)
dogs_sorted = dogs.sort_index()
print(dogs_sorted)
If you need to sort the rows in descending order, just pass ascending=False:
# Sort rows by their index (descending)
dogs_sorted_desc = dogs.sort_index(ascending=False)
print(dogs_sorted_desc)
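One common use of sort_index() is to get back the original row order after sorting by values. Here is a minimal sketch using a reduced copy of the dogs data from this article; the variable names shuffled, restored, and reversed_rows are our own.

```python
import pandas as pd

# A reduced copy of the dogs data shown above, with the default 0-6 index.
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
        "weight_kg": [24, 24, 24, 17, 29, 2, 74],
    }
)

# Sorting by values moves the rows, and each row keeps its original label.
shuffled = dogs.sort_values("weight_kg", ascending=False)

# Sorting by index puts the rows back in label order 0, 1, ..., 6.
restored = shuffled.sort_index()

# Descending index order gives labels 6, 5, ..., 0.
reversed_rows = shuffled.sort_index(ascending=False)

print(restored)
print(reversed_rows)
```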
Similarly, if you have a multi-level (hierarchical) index, sort_index() can sort on several levels at once. Pass a list to the level parameter and, if the levels need different directions, a list of the same length to the ascending parameter (just as we passed lists to sort_values() earlier):
# Sort rows by multiple levels of a multi-level index
# (assumes dogs has at least two index levels; the dogs DataFrame shown earlier has only one)
dogs_sorted_multi = dogs.sort_index(level=[0, 1], ascending=[True, False])
print(dogs_sorted_multi)
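The dogs DataFrame printed earlier has a single-level index, so the multi-level call needs a DataFrame with at least two index levels. As an illustration only, the sketch below builds one by moving breed and color into the index with set_index(); this setup and the name dogs_multi are our assumptions, not part of the original example.

```python
import pandas as pd

# A reduced copy of the dogs data shown in this article.
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
        "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
                  "Labrador", "Chihuahua", "St. Bernard"],
        "color": ["Brown", "Black", "Brown", "Gray", "Black", "Tan", "White"],
        "weight_kg": [24, 24, 24, 17, 29, 2, 74],
    }
)

# Hypothetical setup: use breed (level 0) and color (level 1) as the row index.
dogs_multi = dogs.set_index(["breed", "color"])

# Sort level 0 (breed) ascending and level 1 (color) descending.
dogs_sorted_multi = dogs_multi.sort_index(level=[0, 1], ascending=[True, False])
print(dogs_sorted_multi)
```

The two Labradors show the effect of the second level: with color sorted in descending order, the Brown Labrador (Bella) comes before the Black one (Max).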
Pandas Sort Values Interactive Example
In the following example, you will sort homelessness by the number of homeless individuals, from smallest to largest, and save this as homelessness_ind. Finally, you will print the head of the sorted DataFrame.
# Sort homelessness by individuals
homelessness_ind = homelessness.sort_values("individuals")
# Print the top few rows
print(homelessness_ind.head())
When we run the above code, it produces the following result:
                region         state  individuals  family_members  state_pop
50            Mountain       Wyoming        434.0           205.0     577601
34  West North Central  North Dakota        467.0            75.0     758080
 7      South Atlantic      Delaware        708.0           374.0     965479
39         New England  Rhode Island        747.0           354.0    1058287
45         New England       Vermont        780.0           511.0     624358
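The full homelessness dataset is not included in this article. If you want to try the same call without it, the sketch below uses a stand-in built from only the five rows printed above, placed in row-label order before sorting. It is not the complete dataset.

```python
import pandas as pd

# Stand-in data: only the five rows printed above, NOT the full dataset.
homelessness = pd.DataFrame(
    {
        "region": ["South Atlantic", "West North Central", "New England",
                   "New England", "Mountain"],
        "state": ["Delaware", "North Dakota", "Rhode Island", "Vermont", "Wyoming"],
        "individuals": [708.0, 467.0, 747.0, 780.0, 434.0],
        "family_members": [374.0, 75.0, 354.0, 511.0, 205.0],
        "state_pop": [965479, 758080, 1058287, 624358, 577601],
    },
    index=[7, 34, 39, 45, 50],  # row labels as shown in the printed output
)

# Sort homelessness by individuals, smallest first
homelessness_ind = homelessness.sort_values("individuals")

# Print the top few rows
print(homelessness_ind.head())
```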
To learn more about sorting and subsetting data, please see our course Data Manipulation with pandas.
This content is taken from DataCamp’s Data Manipulation with pandas course by Maggie Matsui and Richie Cotton.
Further Learning
We have learned in this article, among other things, when to use sort_index() vs. sort_values(): Use sort_values() when you want to reorder rows based on column values; use sort_index() when you want to reorder rows based on the row labels (the DataFrame’s index).
We have many other useful pandas tutorials so you can keep learning and practicing, including The ultimate Guide to Pandas for Beginners. We also have specific how-tos for common tasks, including How to Import CSV Data into Pandas and How to Join DataFrames in Pandas. Also, remember to take our Python Programming skill track to keep improving your skills.