# Introduction to Data Frames in R

This tutorial takes course material from DataCamp's Introduction to R course and lets you practice working with data frames.
Oct 2018  · 5 min read

## What's a data frame?

You may remember from the chapter about matrices that all the elements that you put in a matrix should be of the same type. Back then, your data set on Star Wars only contained numeric elements.

When doing a market research survey, however, you often have questions such as:

• 'Are you married?' or 'yes/no' questions (`logical`)
• 'How old are you?' (`numeric`)
• 'What is your opinion on this product?' or other 'open-ended' questions (`character`)
• ...

The output, namely the respondents' answers to the questions formulated above, is a data set of different data types. You will often find yourself working with data sets that contain different data types instead of only one.

A data frame has the variables of a data set as columns and the observations as rows. This will be a familiar concept for those coming from other statistical software packages such as SAS or SPSS.
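
As a small sketch of this idea, the hypothetical survey answers below (invented purely for illustration, not taken from the course) are stored in one data frame, with a logical, a numeric, and a character column side by side:

```r
# Hypothetical survey answers, invented for illustration only
survey <- data.frame(
  married = c(TRUE, FALSE, TRUE),              # 'yes/no' question (logical)
  age     = c(34, 27, 51),                     # 'How old are you?' (numeric)
  opinion = c("Great", "Too pricey", "Fine"),  # open-ended question (character)
  stringsAsFactors = FALSE                     # keep text as character in R < 4.0 too
)

# Each row is one respondent, each column one variable
survey

# Each column keeps its own data type
sapply(survey, class)
```

A matrix could not hold these three columns without converting them all to a single type, which is why a data frame is the natural fit here.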

### Instructions

Click 'Submit Answer'. The data from the built-in example data frame `mtcars` will be printed to the console.
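
Outside the interactive exercise, the same result can be reproduced in any R console. A minimal sketch, relying only on the fact that `mtcars` ships with base R:

```r
# mtcars is part of the 'datasets' package that loads with base R,
# so nothing needs to be installed or imported
mtcars

# Number of rows (observations) and columns (variables)
dim(mtcars)
```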

## Quick, have a look at your data set

Wow, that is a lot of cars!

Working with large data sets is not uncommon in data analysis. When you work with (extremely) large data sets and data frames, your first task as a data analyst is to develop a clear understanding of their structure and main elements. Therefore, it is often useful to show only a small part of the entire data set.

So how do you do this in R? Well, the function `head()` enables you to show the first observations of a data frame (six by default). Similarly, the function `tail()` prints out the last observations in your data set.

Both `head()` and `tail()` print a top line called the 'header', which contains the names of the different variables in your data set.

### Instructions

Call `head()` on the `mtcars` data set to have a look at the header and the first observations.
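
One way to work through this instruction is sketched below. The `tail()` call and the `n` argument go slightly beyond what the exercise asks for and are included only for comparison:

```r
# First six observations (the default), preceded by the header
head(mtcars)

# Last six observations
tail(mtcars)

# The optional argument n controls how many rows are shown
head(mtcars, n = 3)
```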

## Have a look at the structure

Another method that is often used to get a rapid overview of your data is the function `str()`. The function `str()` shows you the structure of your data set. For a data frame it tells you:

• The total number of observations (e.g. 32 car types)
• The total number of variables (e.g. 11 car features)
• A full list of the variable names (e.g. `mpg`, `cyl` ... )
• The data type of each variable (e.g. `num`)
• The first observations

Applying the `str()` function will often be the first thing that you do when receiving a new data set or data frame. It is a great way to get more insight into your data set before diving into the real analysis.

### Instructions

Investigate the structure of `mtcars`. Make sure that you see the same numbers, variables, and data types as mentioned above.
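
A possible way to carry out this check is sketched below. The individual helper calls after `str()` are not part of the exercise; they are added here only to show where each piece of the `str()` summary comes from:

```r
# Compact overview: dimensions, variable names, types, first values
str(mtcars)

# The same facts can be checked one at a time
nrow(mtcars)           # number of observations
ncol(mtcars)           # number of variables
names(mtcars)          # variable names
sapply(mtcars, class)  # data type of each variable
```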

## Creating a data frame

Since using built-in data sets is not even half the fun of creating your own data sets, the rest of this chapter is based on your personally developed data set. Put your jet pack on because it is time for some space exploration!

As a first goal, you want to construct a data frame that describes the main characteristics of the eight planets in our solar system. According to your good friend Buzz, the main features of a planet are:

• The type of planet (Terrestrial or Gas Giant).
• The planet's diameter relative to the diameter of the Earth.
• The planet's rotation period relative to that of the Earth.
• If the planet has rings or not (TRUE or FALSE).

After doing some high-quality research on Wikipedia, you feel confident enough to create the necessary vectors: `name`, `type`, `diameter`, `rotation` and `rings`; these vectors have already been coded up on the right. The first element in each of these vectors corresponds to the first observation.

You construct a data frame with the `data.frame()` function. As arguments, you pass the vectors from before: they will become the different columns of your data frame. Because every column has the same length, the vectors you pass should also have the same length. But don't forget that it is possible (and likely) that they contain different types of data.
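
The length requirement can be illustrated with a short sketch. The vectors `x`, `y` and `z` are hypothetical and exist only for this example:

```r
# Hypothetical vectors, invented for illustration
x <- c(1, 2, 3)
y <- c("a", "b", "c")
z <- c(TRUE, FALSE)   # one element short

# Equal lengths: one row per element, one column per vector
ok <- data.frame(x, y)
dim(ok)

# Lengths that cannot be recycled evenly raise an error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch(
  data.frame(x, z),
  error = function(e) conditionMessage(e)
)
msg
```

Note that `data.frame()` silently recycles a shorter vector when its length divides the longer one evenly, so checking lengths up front is usually safer than relying on the error.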

### Instructions

Use the function `data.frame()` to construct a data frame. Pass the vectors `name`, `type`, `diameter`, `rotation` and `rings` as arguments to `data.frame()`, in this order. Call the resulting data frame `planets_df`.
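
A sketch of one possible solution follows. The vector definitions are an assumption: the code panel that defines them is not reproduced on this page, so the values below are approximate stand-ins and may differ from the ones used in the course.

```r
# Assumed vector definitions (stand-ins for the missing code panel)
name <- c("Mercury", "Venus", "Earth", "Mars",
          "Jupiter", "Saturn", "Uranus", "Neptune")
type <- c(rep("Terrestrial planet", 4), rep("Gas giant", 4))
diameter <- c(0.382, 0.949, 1, 0.532, 11.209, 9.449, 4.007, 3.883)  # Earth = 1
rotation <- c(58.64, -243.02, 1, 1.03, 0.41, 0.43, -0.72, 0.67)     # Earth = 1
rings <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)

# Each vector becomes one column, in the order the arguments are passed
planets_df <- data.frame(name, type, diameter, rotation, rings)

# Print the result and inspect its structure
planets_df
str(planets_df)
```

The column names are taken from the names of the vectors, so no extra naming step is needed.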
