Subsetting in R Tutorial

Find out how to access your dataframe's data with subsetting. Follow our tutorial and learn how to use R's subset() function today!
Jun 2020  · 4 min read

Subsetting in R is a useful indexing feature for accessing object elements. It can be used to select and filter variables and observations. You can use brackets to select rows and columns from your dataframe.

Selecting Rows

debt[3:6, ]
      name  payment
3      Dan      150
4      Rob       50
5      Rob       75
6      Rob      100

Here we selected rows 3 through 6 of debt. One thing to look at is the simplification that happens when you select a single column.
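The tutorial does not show how debt is created, so here is a minimal sketch that rebuilds it. The values are an assumption inferred from the outputs printed in this article, not the original data source.

```r
# Hypothetical reconstruction of the debt data frame, inferred from the
# outputs printed in this tutorial (the original data source is not shown).
debt <- data.frame(
  name    = c("Dan", "Dan", "Dan", "Rob", "Rob", "Rob"),
  payment = c(100, 200, 150, 50, 75, 100)
)

# Rows 3 through 6; the empty slot after the comma means "all columns"
rows_3_to_6 <- debt[3:6, ]
print(rows_3_to_6)

# Selecting rows this way keeps the data frame structure
class(rows_3_to_6)
```

Leaving the column position empty is what keeps every column in the result.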

Selecting Rows From a Specific Column

Selecting the first three rows of just the payment column simplifies the result into a vector.

debt[1:3, 2]
100 200 150

Dataframe Formatting

To keep the result as a dataframe, add drop = FALSE as shown below. R is case-sensitive, so the value must be written FALSE (or F), not False:

debt[1:3, 2, drop = FALSE]
  payment
1     100
2     200
3     150
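To see the difference drop makes, you can compare the two results side by side. This is a small sketch; the debt values are reconstructed from the outputs in this tutorial and are an assumption.

```r
# Hypothetical reconstruction of debt, inferred from the tutorial's outputs
debt <- data.frame(
  name    = c("Dan", "Dan", "Dan", "Rob", "Rob", "Rob"),
  payment = c(100, 200, 150, 50, 75, 100)
)

as_vector <- debt[1:3, 2]                # simplified to a numeric vector
as_frame  <- debt[1:3, 2, drop = FALSE]  # stays a one-column data frame

is.data.frame(as_vector)  # FALSE
is.data.frame(as_frame)   # TRUE
names(as_frame)           # "payment"
```

Keeping the data frame is handy when later code expects column names or a two-dimensional object.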

Selecting a Specific Column [Shortcut]

To select a specific column, you can also type in the name of the dataframe, followed by a $, and then the name of the column you are looking to select. In this example, we will be selecting the payment column of the dataframe. When running this script, R will simplify the result as a vector.

debt$payment
100 200 150 50 75 100
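As a quick sketch, the $ shortcut gives the same vector as double-bracket indexing by name, and the result can go straight into functions such as sum(). The debt values below are again an assumed reconstruction from this tutorial's outputs.

```r
# Hypothetical reconstruction of debt, inferred from the tutorial's outputs
debt <- data.frame(
  name    = c("Dan", "Dan", "Dan", "Rob", "Rob", "Rob"),
  payment = c(100, 200, 150, 50, 75, 100)
)

payments     <- debt$payment        # shortcut: dataframe$column
same_by_name <- debt[["payment"]]   # equivalent double-bracket form

identical(payments, same_by_name)   # TRUE
total <- sum(payments)
total                               # 675
```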

Using subset() for More Power

When you need more complex subsets, or a subset based on a condition, the next step up is the subset() function. For example, what if you wanted to look at the debt owed by someone named Dan? You could use brackets to select their rows by position and total them up, but that approach is not robust: if rows are added or reordered, the positions no longer point at Dan's debt.

# This works, but is not informative nor robust
debt[1:3, ]

subset() Function

A better way to do this is to use the subset() function to select the rows where the name column is equal to Dan. Notice that there needs to be a double equals sign (==), which is a relational operator; a single = would be treated as an argument assignment rather than a comparison.

# This works, but is not informative nor robust
debt[1:3, ]

# Much more informative!
subset(debt, name == "Dan")
      name     payment
1      Dan         100
2      Dan         200
3      Dan         150
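For comparison, here is a sketch of the same selection written with brackets and a logical condition, followed by the total of Dan's payments. The debt values are an assumption reconstructed from the outputs above.

```r
# Hypothetical reconstruction of debt, inferred from the tutorial's outputs
debt <- data.frame(
  name    = c("Dan", "Dan", "Dan", "Rob", "Rob", "Rob"),
  payment = c(100, 200, 150, 50, 75, 100)
)

# subset() evaluates the condition inside the data frame,
# so the column can be written as plain name
dan_subset <- subset(debt, name == "Dan")

# Bracket version: the column must be referenced as debt$name
dan_bracket <- debt[debt$name == "Dan", ]

all.equal(dan_subset, dan_bracket)

# Total of Dan's payments, independent of where his rows sit
dan_total <- sum(dan_subset$payment)
dan_total  # 450
```

One difference worth knowing: subset() drops rows where the condition evaluates to NA, while the bracket form returns them as rows filled with NA.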

Using a subset() Function on a Numeric Column

You can also subset on numeric columns. To see the rows where the payment equals $100, you would do the following:

subset(debt, payment == 100)
      name  payment
1      Dan      100
6      Rob      100
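Conditions can also be combined, and subset() accepts a select argument to choose columns. The sketch below shows both; the debt values are an assumed reconstruction from this tutorial's outputs, and the 75 threshold is picked purely for illustration.

```r
# Hypothetical reconstruction of debt, inferred from the tutorial's outputs
debt <- data.frame(
  name    = c("Dan", "Dan", "Dan", "Rob", "Rob", "Rob"),
  payment = c(100, 200, 150, 50, 75, 100)
)

# Combine two conditions with & (element-wise AND)
rob_large <- subset(debt, name == "Rob" & payment >= 75)
print(rob_large)   # rows 5 and 6

# Keep only the payment column for rows where the payment equals 100
only_payment <- subset(debt, payment == 100, select = payment)
print(only_payment)
```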

Accessing and Subsetting Dataframes

Moving to this next example, what if you are only interested in the cash flows from company A?

subset(cash, company == "A")
      company  cash_flow  year
1           A       1000     1
2           A       4000     3
3           A        550     4


  • The first argument you pass to subset() is the name of your dataframe, cash.
  • Notice that you shouldn't put company in quotes!
  • The == is the equality operator. It tests to find where two things are equal and returns a logical vector.
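To make the logical vector visible, the sketch below evaluates the comparison on its own before using it to pick rows. The cash data frame is an assumption rebuilt from the outputs printed in this tutorial.

```r
# Hypothetical reconstruction of cash, inferred from the tutorial's outputs
cash <- data.frame(
  company   = c("A", "A", "A", "B", "B", "B", "B"),
  cash_flow = c(1000, 4000, 550, 1500, 1100, 750, 6000),
  year      = c(1, 3, 4, 1, 2, 4, 5)
)

# The comparison alone returns one TRUE/FALSE per row
is_a <- cash$company == "A"
print(is_a)  # TRUE TRUE TRUE FALSE FALSE FALSE FALSE

# subset() keeps the rows where the condition is TRUE
company_a <- subset(cash, company == "A")
print(company_a)
```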

Interactive Example of the subset() Function

In the example below, you will use the subset() function to select only the rows of cash that correspond to company B. Then you will use subset() again to select the rows that have cash flows due in 1 year.

# Rows about company B
subset(cash, company == "B")

# Rows with cash flows due in 1 year
subset(cash, year == 1)

When you run the above code, it produces the following result:

  company cash_flow year
4       B      1500    1
5       B      1100    2
6       B       750    4
7       B      6000    5
  company cash_flow year
1       A      1000    1
4       B      1500    1
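The two conditions can also be applied in a single call. This is a sketch using an assumed reconstruction of cash based on the outputs above.

```r
# Hypothetical reconstruction of cash, inferred from the tutorial's outputs
cash <- data.frame(
  company   = c("A", "A", "A", "B", "B", "B", "B"),
  cash_flow = c(1000, 4000, 550, 1500, 1100, 750, 6000),
  year      = c(1, 3, 4, 1, 2, 4, 5)
)

# Company B rows that are also due in 1 year
b_year_1 <- subset(cash, company == "B" & year == 1)
print(b_year_1)   # only row 4 matches both conditions

# Total cash flow for company B across all years
b_total <- sum(subset(cash, company == "B")$cash_flow)
b_total           # 9350
```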

To learn more about accessing and subsetting dataframes in R, please see this video from our course Introduction to R for Finance.

This content is taken from DataCamp’s Introduction to R for Finance course by Lore Dirick.
