Using Functions in R Tutorial

Discover what R functions are, the different types of functions in R, and how to create your own functions in R.

Updated Mar 2023 · 11 min read

If you are considering starting a career in data science, the sooner you start coding, the better. No matter the programming language you’ve picked to start your learning journey, at some point you will have a date with functions.

Functions are a central concept in nearly every modern programming language, including popular programming languages in data science, such as Python, Julia, and, obviously, R.

In this tutorial, we will explore what R functions are and how you can use them. By covering the purpose, syntax, and typology of functions available in R, you will get everything you need to master this critical concept in programming. And what’s more, you will be introduced to the art of function creation.

What is a Function in R?

In programming, functions are sets of instructions organized together to carry out a specific task. The rationale behind functions is to create self-contained pieces of code that can be called whenever they are needed.

With functions, programmers no longer need to rewrite the same instructions every time they are needed, thereby avoiding repetition and improving code robustness and readability. That’s why, as a rule of thumb, it’s good practice to create a function whenever you expect to run a particular set of instructions more than twice in your code.

Functions can be used for endless purposes and can take various forms. Generically, the vast majority of functions will take input data, process it, and return a result. The data on which the function operates is specified by the so-called arguments, which can also be used to control or alter the way the function carries out the tasks.
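As a small illustration of this last point, the sketch below uses two base R functions, round() and paste(), whose optional arguments (digits and sep) change how the input is processed. The input values are made up for the example.

```r
# The first argument is the data; `digits` controls how the task is carried out
round(3.14159)              # default: round to 0 decimal places -> 3
round(3.14159, digits = 2)  # keep two decimal places -> 3.14

# `sep` changes the separator that paste() puts between its inputs
paste("data", "science")             # "data science"
paste("data", "science", sep = "-")  # "data-science"
```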

Depending on the origin of the function, we can distinguish three main types of functions in R:

Built-in functions
Functions available in R packages
User-Defined functions (UDF)

In the following sections, we will explain the particularities of the different types of functions available in R.

Built-in Functions in R

R is a powerful programming language that comes with a wide catalog of built-in functions that can be called anytime. As a math-oriented language, R comes with a good number of functions to perform numeric operations. Below you can find a list of some of the most useful:

print(). Displays an R object on the R console
min(), max(). Calculate the minimum and maximum of a numeric vector
sum(). Calculates the sum of a numeric vector
mean(). Calculates the mean of a numeric vector
range(). Returns both the minimum and the maximum value of a numeric vector
str(). Displays the structure of an R object
ncol(). Returns the number of columns of a matrix or a dataframe
length(). Returns the number of items in an R object, such as a vector, a list, or a matrix

In the code below, you can see how simple it is to use these functions to calculate certain statistics from a vector:

> v <- c(1, 3, 0.2, 1.5, 1.7)
> print(v)
[1] 1.0 3.0 0.2 1.5 1.7
> sum(v)
[1] 7.4
> mean(v)
[1] 1.48
> length(v)
[1] 5
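The other functions from the list can be tried in the same way. The sketch below reuses the vector v and adds a small data frame, df, whose column names and values are invented purely for illustration.

```r
v <- c(1, 3, 0.2, 1.5, 1.7)

min(v)    # smallest value: 0.2
max(v)    # largest value: 3
range(v)  # both at once: 0.2 3.0

# A small data frame (made-up values) to illustrate str() and ncol()
df <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.0))
str(df)   # prints the structure: 3 observations of 2 variables
ncol(df)  # 2
```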

You can cover R functions and more in our comprehensive R skill track, which will help you learn to code like a programmer.

Functions in R Packages

Although numerous and diverse, built-in functions are not enough to do all the cool stuff you can do with R, from plotting compelling data visualizations to training powerful machine learning models.

The great majority of functions to perform these tasks are available in external packages or libraries. Packages are collections of R functions, data, and compiled code in a well-defined format, created to add specific functionality. Most of these packages can be used for free and can be found in popular package repositories, such as CRAN, which currently features nearly 20,000 contributed packages.

To use the functions available in a package, you will first need to install it. For example, if you want to install stringr, a popular package for working with strings and regular expressions, you can use the following statement:

install.packages('stringr')

Once you have installed it, use the library statement to load it into your R environment:

library(stringr)

Now you’re ready to use all the functions available in the stringr package. For example, let’s try the str_detect() function, which returns a logical vector with TRUE for each element of the input character vector that matches the pattern and FALSE otherwise.

str_detect('DataCamp', "Data")
[1] TRUE
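Because str_detect() is vectorized, it can check several strings at once. The following is a minimal sketch that assumes stringr is already installed; the courses vector is made-up input for the example.

```r
library(stringr)  # assumes stringr has already been installed

# Hypothetical input: three strings, two of which contain "Data"
courses <- c("DataCamp", "R Programming", "Data Science")

# One logical value is returned per element of the input vector
str_detect(courses, "Data")
# [1]  TRUE FALSE  TRUE
```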

If you’re interested in knowing more about R packages and how to use them, check this DataCamp R packages tutorial.

User-Defined Functions

The best way to understand how functions in R work is by creating your own functions. The so-called User-Defined functions (UDF) are designed by programmers to carry out a specific task.

R functions normally adopt the following syntax:

function_name <- function(argument_1, argument_2) {
  # function body: the instructions that use the arguments
  return(output)
}

We can distinguish four main elements:

Function name. To create a UDF, first you have to assign it a name and save it as a new object. You just have to call the name whenever you want to use the function.
Arguments. The function arguments (also known as parameters) are provided within the parentheses. Arguments are key for the function to know what data to take as input and/or how to modify the behavior of the function.
Function body. Within curly brackets comes the body of the function, that is, the instructions to solve a specific task based on the information provided by the arguments.
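Return statement. The return statement specifies the result that the function hands back to the caller, so that it can be saved in a variable. In R it is optional, since a function returns the value of its last evaluated expression, but including it makes the output explicit.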

For example, if you want to create a function that calculates the mean of two numbers:

mean_two_numbers <- function(num_1, num_2) {
  mean <- (num_1 + num_2) / 2
  return (mean)
}

Now, if you want to calculate the mean of 10 and 20, just call the function as follows:

> mean_two_numbers(10, 20)
[1] 15

Types of arguments in R functions

Arguments are vital elements in every function. While it’s perfectly possible to write a function with no parameters (see the example below), most functions do have arguments. That makes sense, as arguments tell the function what data to take as input. Equally, if we want to equip a function with various ways of performing a task, arguments will do the job.

hello <- function() {
  print('hello, my friend')
}

> hello()
[1] "hello, my friend"

There is no limit on the number of arguments in R functions; you can add them in the parentheses separated by commas. Generally, functions with more arguments tend to be more complex.

Once you have created a function with parameters, every time you want to use it, you will have to provide the values of the different parameters; otherwise, R will throw an error as soon as the function tries to use a missing value. For example, if you call mean_two_numbers(10) without the second number, our function won’t work.
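To see this error without stopping a script, the call can be wrapped in base R's tryCatch(). This is only a sketch: the function is redefined here so that the snippet is self-contained, and the result variable is an illustrative name.

```r
mean_two_numbers <- function(num_1, num_2) {
  mean <- (num_1 + num_2) / 2
  return(mean)
}

# tryCatch() captures the error so that the script keeps running;
# the handler returns the error message as a character string
result <- tryCatch(
  mean_two_numbers(10),
  error = function(e) conditionMessage(e)
)
print(result)  # the message reports that the argument num_2 is missing
```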

However, you could avoid this error by using default arguments at the time of defining a function. Default arguments provide a default value that will be used if you call the function without providing that argument. Let’s go back again to our function to calculate the mean of two numbers. This time, we will define it to add a default argument for the second number.

mean_two_numbers <- function(num_1, num_2 = 30) {
  mean <- (num_1 + num_2) / 2
  return (mean)
}

If we now call the function without providing the value of the num_2 parameter, R will automatically take the default value (i.e., 30):

> mean_two_numbers(num_1 = 10)
[1] 20
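A default value is only a fallback: it can still be overridden in any call. The short sketch below repeats the definition so that it runs on its own and shows the default being replaced, both by position and by name.

```r
mean_two_numbers <- function(num_1, num_2 = 30) {
  mean <- (num_1 + num_2) / 2
  return(mean)
}

mean_two_numbers(10)             # uses the default num_2 = 30 -> 20
mean_two_numbers(10, 50)         # overrides the default by position -> 30
mean_two_numbers(10, num_2 = 0)  # overrides the default by name -> 5
```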

Understanding Return Values in R Functions

Functions normally take some data as input and give a result as an output. In some programming languages, to save the result of a function as a variable, you need to explicitly include the return statement at the end of the body of the function. Otherwise, the function will only display a value that exists solely within the scope of the function.

This is not the case in R: a function always returns the value of the last expression it evaluates, and that value can be stored in a variable. However, for the sake of readability, many programmers consider it good practice to include an explicit return statement when defining a function.

mean_two_numbers <- function(num_1, num_2) {
  # Function with return 
  mean <- (num_1 + num_2) / 2
  return (mean)
}
mean_two_numbers_2 <- function(num_1, num_2) {
  # Function without return
  mean <- (num_1 + num_2) / 2
  mean
}

> mean_two_numbers(10, 50)
[1] 30
> mean_two_numbers_2(10, 50)
[1] 30
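In both cases the result can be assigned to a variable. One situation where an explicit return() does change behavior is leaving a function early. The sketch below illustrates both ideas; safe_mean is a hypothetical helper invented for this example, not a function from the tutorial.

```r
mean_two_numbers_2 <- function(num_1, num_2) {
  (num_1 + num_2) / 2  # the value of the last expression is returned
}

# The returned value can be stored and reused
result <- mean_two_numbers_2(10, 50)
result * 2  # 60

# Hypothetical helper: return() exits early when an input is not numeric
safe_mean <- function(num_1, num_2) {
  if (!is.numeric(num_1) || !is.numeric(num_2)) {
    return(NA)  # leave the function immediately
  }
  (num_1 + num_2) / 2
}

safe_mean(10, 50)   # 30
safe_mean("a", 50)  # NA
```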

Finally, if you want your function to return multiple values, you will have to store the different results in a list and include it in the return statement:

mean_sum <- function(num_1, num_2) {
  mean <- (num_1 + num_2) / 2
  sum <- num_1 + num_2
  return (list(mean, sum))
}

> mean_sum(10, 20)
[[1]]
[1] 15

[[2]]
[1] 30
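Positions such as [[1]] and [[2]] are easy to mix up. One possible variation, shown here as a sketch rather than as the only way to do it, is to name the elements of the list so that each result can be retrieved with the $ operator.

```r
mean_sum <- function(num_1, num_2) {
  # Naming the list elements makes the results self-describing
  return(list(
    mean = (num_1 + num_2) / 2,
    sum = num_1 + num_2
  ))
}

result <- mean_sum(10, 20)
result$mean  # 15
result$sum   # 30
```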

Calling Functions in R

In the previous sections, we have already seen various examples on how to call a function. However, it’s important to clarify how R works under the hood when we pass the arguments.

R admits two ways of passing arguments: by position and by name. If we follow the first strategy, we will have to write the values following the same order of arguments as defined in the function.

If we pass the arguments by name, we will need to explicitly specify the names of the arguments and their associated values. Since we have matched arguments and values, the order doesn’t matter.

Finally, it’s also possible to mix the two strategies. In this case, the named arguments are extracted from the list of arguments and are matched first, while the rest of the arguments are matched by position.

hello <- function(name, surname) {
  print(paste('Hello,', name, surname))
}

# Calling arguments by position
> hello('Greta','Thunberg')
[1] "Hello, Greta Thunberg"
# Calling arguments by name
> hello(surname='Thunberg', name='Greta')
[1] "Hello, Greta Thunberg"
# Calling arguments by position and by name
> hello(surname='Thunberg', 'Greta')
[1] "Hello, Greta Thunberg"
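The difference between the strategies is easier to see with a function whose result depends on the order of its arguments. In the sketch below, divide is a hypothetical function defined only for this illustration.

```r
# Hypothetical function: swapping the arguments changes the result
divide <- function(numerator, denominator) {
  return(numerator / denominator)
}

divide(10, 2)                            # by position -> 5
divide(2, 10)                            # by position, order swapped -> 0.2
divide(denominator = 2, numerator = 10)  # by name, order is irrelevant -> 5
divide(denominator = 2, 10)              # mixed: 10 fills numerator -> 5
```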

Documenting Functions in R

A good practice when creating functions is to provide documentation on how to use them, especially when functions are complex. An informal way of doing it is by adding comments in the body of functions. You can then read this documentation by typing the name of the function without parentheses:

hello <- function(name, surname) {
  # Say hello to a person with name and surname
  print(paste('Hello,', name, surname))
}

> hello
function(name, surname) {
  # Say hello to a person with name and surname
  print(paste('Hello,', name, surname))
}
However, if your function is part of a bigger package and you want to document it, you should write formal documentation in a separate .Rd document. You can see the result of this documentation when you look at the help file for a given function, e.g. ?mean.
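In practice, .Rd files are often generated rather than written by hand. One common approach, which is an assumption about your workflow and is not covered in this tutorial, is the roxygen2 package: special #' comments are placed above the function and later converted into an .Rd file. A minimal sketch of what such comments might look like for our hello function:

```r
#' Say hello to a person
#'
#' @param name First name, as a character string.
#' @param surname Surname, as a character string.
#' @return The greeting as a character string (also printed to the console).
hello <- function(name, surname) {
  print(paste('Hello,', name, surname))
}
```

On their own, these lines are ordinary comments and do not change how the function runs; they only become a help page once a tool such as roxygen2 processes them inside a package.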

Conclusion

You made it to the end of the tutorial. Congratulations! Like in many other languages, functions are vital elements in R. Whether built-in, developed as part of external packages, or even created by you, functions are worth mastering, and doing so is an important milestone in your programming journey. If you want to keep developing your R function skills, check out the following resources!

R courses

Introduction to Writing Functions in R (4 hr). Take your R skills up a notch by learning to write efficient, reusable functions.
Introduction to R (4 hr). Master the basics of data analysis in R, including vectors, lists, and data frames, and practice R with real data sets.
Intermediate R (6 hr). Continue your journey to becoming an R ninja by learning about conditional statements, loops, and vector functions.