
Mastering Data Structures in the R Programming Language

Read our comprehensive guide on how to work with data structures in R programming: vectors, lists, arrays, matrices, factors, and data frames.
May 2024  · 6 min read

R, a popular statistical programming language tailored for data wrangling, analysis, and visualization, is equipped with a range of data structures that are optimized for handling various types of data tasks effectively. Mastering R's diverse array of data structures is a gateway to unlocking its full potential and transforming data into compelling insights.

As we start, consider taking our Introduction to R Programming course, which lets you practice these concepts with real datasets. Let's dive in. 

What are Data Structures in the R Language?

Data structures in R help organize data for analysis. They can be simple, holding only one type of data, or complex, supporting diverse data types. These data structures are tailored to the specific needs that arise during data-driven projects. The primary data structures in R are vectors, lists, matrices, arrays, factors, and data frames.

Data structures in R

The Different Kinds of Data Structures in R

Let's take a little time to get to know the data structures in R. In the process, we will also meet some common R functions.

Vectors in R

Vectors are the simplest form of data structure in R. They are a collection of elements of the same type, such as numeric, character, or logical.
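
Because a vector holds a single type, c() converts mixed inputs to one common type. The following is a minimal sketch of that behavior; the variable name mixed_vector is only illustrative.

```r
# Mixing types in c() coerces every element to one common type
mixed_vector <- c(1, "apple", TRUE)

print(mixed_vector)         # all elements are now character strings
print(class(mixed_vector))  # "character"

# Logical and numeric values combine into a numeric vector
print(class(c(1.5, TRUE)))  # "numeric"
```

This is worth keeping in mind when a numeric column unexpectedly turns into text after a single stray string is added.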

R is built around vectors and vectorized operations. This means you can apply a function or operator to a whole vector without looping through its elements explicitly. For example, adding two vectors adds their corresponding elements, which is faster and more concise than writing an element-wise loop.
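
As a small illustration of the point above, the sketch below adds two vectors once with the vectorized + operator and once with an explicit loop. The names a, b, vectorized_sum, and loop_sum are made up for this example.

```r
a <- c(1, 2, 3)
b <- c(10, 20, 30)

# Vectorized addition: corresponding elements are added in one step
vectorized_sum <- a + b

# Equivalent explicit loop, shown only for comparison
loop_sum <- numeric(length(a))
for (i in seq_along(a)) {
  loop_sum[i] <- a[i] + b[i]
}

print(vectorized_sum)
print(identical(vectorized_sum, loop_sum))
```

Both approaches give the same result, but the vectorized form is shorter and is usually the idiomatic choice in R.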

Here we create three vectors for numerical, character, and logical types using the function c().

numeric_vector <- c(10, 20, 30)
character_vector <- c("apple", "banana", "cherry")
logical_vector <- c(TRUE, FALSE, TRUE)

print(numeric_vector)
print(character_vector)
print(logical_vector)

Example vectors in R


You can access the elements of the vector using square brackets.

# Access the first element
print(numeric_vector[1])

# Access multiple elements
print(character_vector[c(1, 3)])

Accessing vector elements in R


You can also perform mathematical and logical operations with vectors.

# Adding a scalar value to the vector
print(numeric_vector + 2)

# Multiplying elements by a scalar
print(numeric_vector * 10)

# Perform logical operations - Check which elements are greater than 15
print(numeric_vector > 15)

Vector operations in R
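
The logical comparison shown above is often used to filter a vector. Here is a minimal sketch, assuming the same values as numeric_vector; the names is_large and large_values are illustrative.

```r
numeric_vector <- c(10, 20, 30)

# The comparison returns a logical vector: FALSE TRUE TRUE
is_large <- numeric_vector > 15

# Using it as an index keeps only the elements where the condition is TRUE
large_values <- numeric_vector[is_large]
print(large_values)

# which() returns the positions of the TRUE values instead
print(which(is_large))
```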

Other important operations include summing, finding the mean, and finding the minimum and maximum values. 

# Summation
print(sum(numeric_vector))

# Mean
print(mean(numeric_vector))

# Max and min
print(max(numeric_vector))
print(min(numeric_vector))

Output from basic R functions

Matrices in R

Matrices are two-dimensional arrays that store data of a single type. They are particularly useful for mathematical computations. In R, you can create a matrix using the matrix() function.

my_matrix <- matrix(1:9, nrow=3, ncol=3)
print(my_matrix)

Creating a matrix in R
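
Note that matrix() fills values column by column by default. The sketch below contrasts this with the byrow = TRUE argument; the names by_column and by_row are illustrative.

```r
# Default: values fill the matrix column by column
by_column <- matrix(1:6, nrow = 2, ncol = 3)
print(by_column)

# byrow = TRUE fills row by row instead
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(by_row)

# dim() reports the number of rows and columns
print(dim(by_row))
```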

You can access the elements, rows, columns, or subsets of the matrix using indices.

# Access the element in the first row and second column 
print(my_matrix[1,2]) 

# Access the second row 
print(my_matrix[2,]) 

# Access the third column 
print(my_matrix[,3])

Accessing the elements of a matrix in R

You can also perform mathematical operations on matrices.

another_matrix <- matrix(9:1, nrow=3, ncol=3)

# Element-wise addition
print(my_matrix + another_matrix)

# Element-wise subtraction
print(my_matrix - another_matrix)

# Element-wise multiplication
print(my_matrix * another_matrix)

# Element-wise division
print(my_matrix / another_matrix)

Mathematical operations on matrices in R
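
The * operator above multiplies matrices element by element. For the matrix product from linear algebra, base R provides the %*% operator. The sketch below compares the two on small example matrices; m1 and m2 are illustrative names.

```r
m1 <- matrix(1:4, nrow = 2, ncol = 2)
m2 <- matrix(5:8, nrow = 2, ncol = 2)

# Element-wise product: each entry is multiplied by its counterpart
elementwise <- m1 * m2

# Matrix product: rows of m1 combined with columns of m2
matrix_product <- m1 %*% m2

print(elementwise)
print(matrix_product)

# t() transposes a matrix
print(t(m1))
```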


Arrays in R

Arrays extend matrices to more than two dimensions, providing a way to store multidimensional data efficiently. You can create an array in R using the array() function.

my_array <- array(1:8, dim = c(2, 2, 2))
print(my_array)

The dim = c(2, 2, 2) argument sets the dimensions of the array. Here it specifies a three-dimensional array with 2 rows, 2 columns, and 2 layers (or depth). The output of the code above is shown below. You can explore more on arrays in our Arrays in R tutorial.

Creating an array in R
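
Array elements are accessed with one index per dimension, in the order row, column, layer. Here is a minimal sketch using the same array as above; second_layer is an illustrative name.

```r
my_array <- array(1:8, dim = c(2, 2, 2))

# Single element: row 1, column 2, layer 2
print(my_array[1, 2, 2])

# Leaving the row and column indices empty returns the whole second layer
second_layer <- my_array[, , 2]
print(second_layer)

# dim() confirms the shape of the array
print(dim(my_array))
```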

Lists in R

Lists are versatile data structures in R that can hold a mix of objects of different types and sizes. You can create a list using the list() function.

my_list <- list(name="DataCamp", year=2024, scores=c(80, 90, 85), active=TRUE)

print(my_list)

The contents of a list in R

Once you’ve created the list, you can access its elements by index or by name.

print(my_list[[3]])
print(my_list$name)
print(my_list$scores)

Accessing the elements of a list in R
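
One detail that often trips people up is the difference between single and double brackets on a list. The sketch below uses the same list as above; the names sub_list and scores, and the added topic element, are illustrative.

```r
my_list <- list(name = "DataCamp", year = 2024, scores = c(80, 90, 85), active = TRUE)

# Single brackets return a smaller list containing the selected element
sub_list <- my_list[3]
print(class(sub_list))

# Double brackets (or $) return the element itself
scores <- my_list[["scores"]]
print(class(scores))

# Assigning to a new name adds an element to the list
my_list$topic <- "R"
print(names(my_list))
```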

Factors in R

Unlike vectors, matrices, or lists, factors do not define a new way of storing data; they describe how the values in a vector should be treated. Internally, a factor is an integer vector with a set of labels called levels, so it can be thought of as a data type akin to integer or character types.

I included factors in the list because they are crucial for handling categorical data within R’s ecosystem, especially in statistical techniques where the distinction between categorical and continuous variables is significant. This differs from many Python machine learning workflows, where categorical features are typically converted to numeric ones through dummy or one-hot encoding.

You can create factors using the factor() function as shown below:

gender <- factor(c("male", "female", "female", "male"))
print(gender)

Factors in R

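A factor stores its distinct values as levels, which can be ordered or unordered. By default the levels are sorted alphabetically, so here they are "female" and "male"; assigning to levels() renames them in that order.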

levels(gender) <- c("Female", "Male") 
print(gender)

Factor levels in R
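
The gender factor is unordered. For categories with a natural order, you can pass levels and ordered = TRUE to factor(). The sketch below uses made-up data; the name sizes and its values are illustrative only.

```r
sizes <- factor(
  c("medium", "low", "high", "low"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

print(sizes)

# Ordered factors support comparisons based on level order
print(sizes < "high")

# Internally, each value is stored as an integer code pointing to a level
print(as.integer(sizes))
```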

If you want to look at the distribution of the factor variable, you can use the summary() function. The output shows that there are two entries for Female and two for Male.

summary(gender)

Summary in R

Data frames in R

Data frames are among the most widely used data structures in R. They are especially convenient because different columns can hold different types of data. You can think of them as similar to database tables or CSV files, since they store and manage data in a two-dimensional, rectangular format of rows and columns.

You can create a data frame with the data.frame() command.

data_frame <- data.frame(
  Names = c("Kiran", "Ajey", "Carol"),
  Age = c(25, 30, 35),
  Gender = c("Female", "Male", "Female")
)

# Printing the data frame
print(data_frame)

A data frame in R


You can access the contents of the data frame in several ways, either by columns or by rows.

# Prints all names - use the $ sign 
print(data_frame$Names) 

# Access the first row 
print(data_frame[1, ]) 

# Access the second column 
print(data_frame[, 2])

The elements of a data frame in R


Data frames are super useful in data wrangling and analysis because they support a range of operations like subsetting and sorting, and they provide an easy way to print descriptive statistics.

Subsetting a data frame in R

subset_female <- data_frame[data_frame$Gender == 'Female', ]
print(subset_female)

A subset of a data frame in R
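
Subsetting can also combine several conditions and pick specific columns. Here is a minimal sketch on the same example data; older_females is an illustrative name, and subset() is a base R alternative to bracket indexing.

```r
data_frame <- data.frame(
  Names = c("Kiran", "Ajey", "Carol"),
  Age = c(25, 30, 35),
  Gender = c("Female", "Male", "Female")
)

# Combine conditions with & (and) or | (or); select columns by name
keep_rows <- data_frame$Gender == "Female" & data_frame$Age > 30
older_females <- data_frame[keep_rows, c("Names", "Age")]
print(older_females)

# subset() from base R expresses the same filter more compactly
print(subset(data_frame, Gender == "Female" & Age > 30, select = c(Names, Age)))
```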


Sorting a data frame in R

# Sorting
data_frame <- data_frame[order(data_frame$Age), ]
print(data_frame)

A sorted data frame in R
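
The order() function sorts in ascending order by default. The sketch below shows two common variations on the same example data: descending order and sorting by more than one column. The names sorted_desc and sorted_multi are illustrative.

```r
data_frame <- data.frame(
  Names = c("Kiran", "Ajey", "Carol"),
  Age = c(25, 30, 35),
  Gender = c("Female", "Male", "Female")
)

# Descending order by Age
sorted_desc <- data_frame[order(data_frame$Age, decreasing = TRUE), ]
print(sorted_desc)

# Sort by Gender first, then by Age within each gender
sorted_multi <- data_frame[order(data_frame$Gender, data_frame$Age), ]
print(sorted_multi)
```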


Summarizing a data frame in R

# Statistical summary
summary(data_frame)

Descriptive or statistical summary in R


You can see from the above outputs how easy it is to perform data frame operations in R. As you delve deeper into data analysis, you'll find that data frames are incredibly versatile and powerful for data manipulation, analysis, and visualization. If you are interested in exploring more functionalities, check out our Data Frames in R tutorial. 

Continuing with R

This tutorial has provided you with insights into the various data structures in R and their application in real-world data analysis situations. Mastering these structures will improve your analytical skills, enabling you to effectively manage and analyze data.

As you continue your journey, you will discover that R is a unique programming language renowned for its robust statistical capabilities and extensive libraries. It is a worthwhile investment for anyone curious about the world of data analysis. 

If you are interested in learning more, you can read through our R Programming Interview Questions & Answers, which is a great resource whether you are preparing for an interview or just learning. The article offers a wide range of questions and answers that will help you identify any gaps in your knowledge as you continue your journey.


Author
Vikash Singh

Seasoned professional in data science, artificial intelligence, analytics, and data strategy.
