Skip to main content

Pandas Sort Values Tutorial

Learn how to sort rows of data in a pandas Dataframe using the .sort_values() function.
Sep 25, 2020  · 4 min read

Finding interesting bits of data in a DataFrame is often easier if you change the rows' order. You can sort the rows by passing a column name to .sort_values().

In cases where rows have the same value (this is common if you sort on a categorical variable), you may wish to break the ties by sorting on another column. You can sort on multiple columns in this way by passing a list of column names.

column names

Modifying the Order of Columns

You can change the rows' order by sorting them so that the most interesting data is at the top of the dataframe.

For example, when we apply sort_values() on the weight_kg column of the dogs dataframe, we get the lightest dog at the top, Stella the Chihuahua, and the heaviest dog at the bottom, Bernie the Saint Bernard.

dogs.sort_values("weight_kg")
      name        breed  color  height_cm  weight_kg date_of_birth
5   Stella    Chihuahua    Tan         18          2    2015-04-20
3   Cooper    Schnauzer   Gray         49         17    2011-12-11
0    Bella     Labrador  Brown         56         24    2013-07-01
1  Charlie       Poodle  Black         43         24    2016-09-16
2     Lucy    Chow Chow  Brown         46         24    2014-08-25
4      Max     Labrador  Black         59         29    2017-01-20
6   Bernie  St. Bernard  White         77         74    2018-02-27

Setting the ascending argument to False will sort the data the other way round, from heaviest to lightest dog.

dogs.sort_values("weight_kg", ascending=False)
      name        breed  color  height_cm  weight_kg date_of_birth
6   Bernie  St. Bernard  White         77         74    2018-02-27
4      Max     Labrador  Black         59         29    2017-01-20
0    Bella     Labrador  Brown         56         24    2013-07-01
1  Charlie       Poodle  Black         43         24    2016-09-16
2     Lucy    Chow Chow  Brown         46         24    2014-08-25
3   Cooper    Schnauzer   Gray         49         17    2011-12-11
5   Stella    Chihuahua    Tan         18          2    2015-04-20

Sorting by Multiple Variables

We can sort by multiple variables by passing a list of column names to sort_values. Here, we sort first by weight, then by height. Now, Charlie, Lucy, and Bella are ordered from shortest to tallest, even though they all weigh the same.

dogs.sort_values(["weight_kg", "height_cm"])
      name        breed  color  height_cm  weight_kg date_of_birth
5   Stella    Chihuahua    Tan         18          2    2015-04-20
3   Cooper    Schnauzer   Gray         49         17    2011-12-11
1  Charlie       Poodle  Black         43         24    2016-09-16
2     Lucy    Chow Chow  Brown         46         24    2014-08-25
0    Bella     Labrador  Brown         56         24    2013-07-01
4      Max     Labrador  Black         59         29    2017-01-20
6   Bernie  St. Bernard  White         77         74    2018-02-27

To change the direction values are sorted in, pass a list to the ascending argument to specify which direction sorting should be done for each variable. Now, Charlie, Lucy, and Bella are ordered from tallest to shortest.

dogs.sort_values(["weight_kg", "height_cm"], ascending=[True, False])
      name        breed  color  height_cm  weight_kg date_of_birth
5   Stella    Chihuahua    Tan         18          2    2015-04-20
3   Cooper    Schnauzer   Gray         49         17    2011-12-11
0    Bella     Labrador  Brown         56         24    2013-07-01
2     Lucy    Chow Chow  Brown         46         24    2014-08-25
1  Charlie       Poodle  Black         43         24    2016-09-16
4      Max     Labrador  Black         59         29    2017-01-20
6   Bernie  St. Bernard  White         77         74    2018-02-27

Interactive Example

In the following example, you will sort homelessness by the number of homeless individuals, from smallest to largest, and save this as homelessness_ind. Finally, you will print the head of the sorted DataFrame.

# Sort homelessness by individuals
homelessness_ind = homelessness.sort_values("individuals")

# Print the top few rows
print(homelessness_ind.head())

When we run the above code, it produces the following result:

                region         state  individuals  family_members  state_pop
50            Mountain       Wyoming        434.0           205.0     577601
34  West North Central  North Dakota        467.0            75.0     758080
7       South Atlantic      Delaware        708.0           374.0     965479
39         New England  Rhode Island        747.0           354.0    1058287
45         New England       Vermont        780.0           511.0     624358

Try it for yourself.

To learn more about sorting and subsetting the data, please see this video from our course Data Manipulation with pandas.

This content is taken from DataCamp’s Data Manipulation with pandas course by Maggie Matsui and Richie Cotton.

We have many other useful pandas tutorials including:

Topics

Python courses

course

Introduction to Python

4 hr
5.7M
Master the basics of data analysis with Python in just four hours. This online course will introduce the Python interface and explore popular packages.
See DetailsRight Arrow
Start Course
See MoreRight Arrow
Related

tutorial

Sorting in Python Tutorial

Discover what the sort method is and how to do it.
DataCamp Team's photo

DataCamp Team

1 min

tutorial

Pandas Drop Duplicates Tutorial

Learn how to drop duplicates in Python using pandas.
DataCamp Team's photo

DataCamp Team

4 min

tutorial

How to Drop Columns in Pandas Tutorial

Learn how to drop columns in a pandas DataFrame.
DataCamp Team's photo

DataCamp Team

3 min

tutorial

How to Sort a Dictionary by Value in Python

Learn efficient methods to sort a dictionary by values in Python. Discover ascending and descending order sorting and bonus tips for key sorting.
Neetika Khandelwal's photo

Neetika Khandelwal

5 min

tutorial

Pandas Tutorial: DataFrames in Python

Explore data analysis with Python. Pandas DataFrames make manipulating your data easy, from selecting or replacing columns and indices to reshaping your data.
Karlijn Willems's photo

Karlijn Willems

20 min

tutorial

pandas read_csv() Tutorial: Importing Data

Importing data is the first step in any data science project. Learn why today's data scientists prefer the pandas read_csv() function to do this.
Kurtis Pykes 's photo

Kurtis Pykes

9 min

See MoreSee More