Data Frames in R

This tutorial uses course material from DataCamp's Introduction to R for Finance course and lets you practice working with data frames.
Oct 2018  · 4 min read

Take our free intro to R course to further advance your R skills, or read our Introduction to Data Frames in R to get started.

Accessing and subsetting data frames (1)

Even more often than with vectors, you will want to subset your data frame or access certain columns. Again, one way to do this is with `[ ]`. The notation is just like the one for matrices: the row index goes before the comma and the column index after it. Here are some examples:

Select the first row: `cash[1, ]`

Select the first column: `cash[ ,1]`

Select the first column by name: `cash[ ,"company"]`
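To try these selections yourself, you need a `cash` data frame. The sketch below rebuilds one from the values printed later in this tutorial. That is an assumption: the course may construct `cash` differently, and `stringsAsFactors = FALSE` is set here only so that `company` stays a plain character column in every R version.

```r
# Rebuild `cash` from the values printed in this tutorial (assumed construction)
cash <- data.frame(
  company   = c("A", "A", "A", "B", "B", "B", "B"),
  cash_flow = c(1000, 4000, 550, 1500, 1100, 750, 6000),
  year      = c(1, 3, 4, 1, 2, 4, 5),
  stringsAsFactors = FALSE
)

first_row    <- cash[1, ]             # first row, returned as a one-row data frame
first_col    <- cash[, 1]             # first column, returned as a vector
company_col  <- cash[, "company"]     # the same column, selected by name
single_value <- cash[2, "cash_flow"]  # one cell: row 2 of cash_flow, which is 4000

print(first_row)
print(single_value)
```

Note that selecting a row gives you back a data frame, while selecting a single column gives you back a vector.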

Instructions

• Select the third row and second column of cash.
• Select the fifth row of the "year" column of cash.

If that makes sense, keep going to the next exercise!

Accessing and subsetting data frames (2)

As you might imagine, selecting a specific column from a data frame is a common manipulation. So common, in fact, that it was given its own shortcut, the `$`. The following return the same answer:

``````
cash$cash_flow

[1] 1000 4000  550 1500 1100  750 6000

cash[,"cash_flow"]

[1] 1000 4000  550 1500 1100  750 6000
``````

Useful right? Try it out!

Instructions

• Select the `"year"` column from `cash` using `$`.
• Select the `"cash_flow"` column from `cash` using `$` and multiply it by 2.
• You can delete a column by assigning it `NULL`. Run the code that deletes `"company"`.
• Now print out `cash` again.
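Deleting a column by assigning `NULL` can be sketched as follows. The `cash` data frame is rebuilt here from the values printed in this tutorial (an assumed construction), and `cash_copy` is a hypothetical name used so that the original stays untouched. The example drops `year` rather than `company`, so the exercise is still yours to do.

```r
# Rebuild `cash` from the values printed in this tutorial (assumed construction)
cash <- data.frame(
  company   = c("A", "A", "A", "B", "B", "B", "B"),
  cash_flow = c(1000, 4000, 550, 1500, 1100, 750, 6000),
  year      = c(1, 3, 4, 1, 2, 4, 5),
  stringsAsFactors = FALSE
)

# `$` and `[ , "name"]` return the same vector
same_column <- identical(cash$cash_flow, cash[, "cash_flow"])

# Work on a copy (hypothetical name) so `cash` itself is not modified
cash_copy <- cash
cash_copy$year <- NULL   # assigning NULL removes the column

print(same_column)       # TRUE
print(names(cash_copy))  # "company" "cash_flow"
```

Because R copies the data frame on modification, `cash` still has all three columns after this runs.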

Accessing and subsetting data frames (3)

Often, simply selecting a column from a data frame is not all you want to do. What if you are only interested in the cash flows from company A? For more flexibility, try `subset()`!

``````
subset(cash, company == "A")

  company cash_flow year
1       A      1000    1
2       A      4000    3
3       A       550    4
``````

There are a few important things happening here:

• The first argument you pass to `subset()` is the name of your data frame, `cash`.
• Notice that you shouldn't put `company` in quotes!
• The `==` is the equality operator. It tests to find where two things are equal, and returns a logical vector. There is a lot more to learn about these relational operators, and you can learn all about them in the second finance course, Intermediate R for Finance!
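To make the logical vector visible, here is a minimal sketch that computes it separately and then uses it for row selection with `[ ]`. The `cash` data frame is rebuilt from the values printed in this tutorial (an assumed construction), and `is_company_a` is a hypothetical name.

```r
# Rebuild `cash` from the values printed in this tutorial (assumed construction)
cash <- data.frame(
  company   = c("A", "A", "A", "B", "B", "B", "B"),
  cash_flow = c(1000, 4000, 550, 1500, 1100, 750, 6000),
  year      = c(1, 3, 4, 1, 2, 4, 5),
  stringsAsFactors = FALSE
)

# `==` compares every element of the column to "A" and returns TRUE or FALSE for each row
is_company_a <- cash$company == "A"
print(is_company_a)

# subset() keeps the rows where the condition is TRUE
company_a <- subset(cash, company == "A")

# Plain bracket subsetting with the logical vector selects the same rows
company_a_brackets <- cash[is_company_a, ]

print(company_a)
```

Inside `subset()` you can write `company` without quotes or `cash$` because the condition is evaluated using the columns of the data frame you passed in.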

Instructions

• Use `subset()` to select only the rows of `cash` corresponding to company B.
• Now `subset()` rows that have cash flows due in 1 year.

Adding new columns

In a perfect world, you could be 100% certain that you will receive all of your cash flows. But since these are predictions about the future, there is always a chance that someone won't be able to pay! You decide to analyze a worst-case scenario in which you receive only half of your expected cash flow. To save this scenario for later analysis, you add it as a new column to the data frame!

``````
cash$half_cash <- cash$cash_flow * .5

cash

  company cash_flow year half_cash
1       A      1000    1       500
2       A      4000    3      2000
3       A       550    4       275
4       B      1500    1       750
5       B      1100    2       550
6       B       750    4       375
7       B      6000    5      3000
``````

And that's it! Creating new columns in your data frame is as simple as assigning the new information to `data_frame$new_column`. Often, the newly created column is some transformation of existing columns, so the `$` operator really comes in handy here!
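A new column can also combine more than one existing column. The sketch below rebuilds `cash` from the values printed in this tutorial (an assumed construction) and adds a hypothetical column, `cash_per_year`, purely to illustrate the mechanics. It is not a measure used in the course.

```r
# Rebuild `cash` from the values printed in this tutorial (assumed construction)
cash <- data.frame(
  company   = c("A", "A", "A", "B", "B", "B", "B"),
  cash_flow = c(1000, 4000, 550, 1500, 1100, 750, 6000),
  year      = c(1, 3, 4, 1, 2, 4, 5),
  stringsAsFactors = FALSE
)

# Arithmetic on columns works element by element, so each row gets its own result
# (`cash_per_year` is a hypothetical column name used only for illustration)
cash$cash_per_year <- cash$cash_flow / cash$year

print(names(cash))            # the new column is appended at the end
print(cash$cash_per_year[3])  # 550 / 4 = 137.5
```

The right-hand side produces one value per row, which is why it can be assigned directly as a column.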

Instructions

• Create a new worst case scenario where you only receive 25% of your expected cash flow, add it to the data frame as `quarter_cash`.
• What if it took twice as long (in terms of `year`) to receive your money? Add a new column `double_year` with this scenario.
