Skip to main content

R median() Function: Find the Middle Value

Learn how to quickly find the middle value of your data using the R median() function. Discover tips for handling missing values and grouping data by categories.
Jun 20, 2025  · 4 min read

Finding the median is a fundamental part of data analysis, especially when you're dealing with skewed distributions or outliers. 

In R, the median() function offers a simple, built-in way to calculate a very important (non-parametric!) measure of central tendency. I’ll show you the ropes:

What Does R median() Do?

The median() function examines your numeric data and returns (one interpretation of) the central value.

median(numeric_vector)

Here, numeric_vector refers to a numeric vector or a similar object, such as a column within a data frame.

A Simple Example Using R median()

Let’s see this in action. Suppose you have a small vector of numbers:

sales_numbers <- c(3, 5, 8, 2, 7)

median(sales_numbers)

R automatically sorts the numbers (2, 3, 5, 7, 8), then returns 5. This is the value sitting right in the center.

But what if your vector has an even number of elements, like in the next example?

quarterly_sales <- c(10, 4, 7, 2)

median(quarterly_sales)

Here, R sorts to (2, 4, 7, 10) and then averages the two middle numbers (4 and 7), yielding 5.5.

If your dataset contains an odd number of values, it simply picks the one in the center. For an even number of values, median() calculates the average of the two middle numbers.

So under the hood, median() does the sorting for you, which is very convenient.

Handling NA Values with median()

As your datasets grow, it’s common to encounter missing values. These can trip up your calculations if you’re not careful. By default, median() will return NA if any missing values are present in your vector. (Any missing values at all.) Let’s see how this plays out:

monthly_sales <- c(6, 3, NA, 9)

median(monthly_sales)

Notice that the result is NA. To sidestep this, add the na.rm = TRUE argument. This tells R to remove missing values before calculating the median:

median(monthly_sales, na.rm = TRUE)

This time, you’ll get 6 (the median of the remaining values (3, 6, and 9)). Keep an eye out for those NAs, as they can sneak into your data and throw off your analysis if you forget them.

Finding the Median by Group in R

Let's try something new: comparing medians across different groups.

A common question: What if we want to see the median income by region? Just like we did with simple vectors, R makes this easy, but now we’ll need to group our data first.

One quick way in base R is with tapply():

income_values <- c(40000, 50000, 45000, 35000, 60000, 30000)
region_labels <- c("East", "West", "West", "East", "East", "West")

tapply(income_values, region_labels, median)

This computes the median income for each region.

If you prefer working with the dplyr package, you can achieve the same result using pipe-friendly syntax:

library(dplyr)

income_data <- data.frame(income = income_values, region = region_labels)

income_data %>%
   group_by(region) %>%
   summarise(median_income = median(income))

This approach produces a tidy summary table, listing the median income for each region.

Grouped summaries like this are especially useful when you want to compare central tendencies across categories in your data.

Calculating the Median for Data Frame Columns in R

Sometimes, you might want to get the median for every column in a data frame at once—maybe for a quick scan of your dataset’s central values. Here I will use sapply().

test_scores <- data.frame(math = c(1, 2, 3), science = c(7, 8, 9))

sapply(test_scores, median)

This command returns the median of each column, so you can spot trends or outliers at a glance. This technique is especially handy when you’re exploring new data and want a fast overview.

Things to Watch Out For

Before we wrap up, let’s highlight a few common pitfalls and best practices to ensure smooth sailing with median():

  • Check that you’re only passing numeric data to median(). If you try a factor or character vector, R will throw an error.

  • Remember to use na.rm = TRUE if you suspect missing data.

  • No need to sort your data manually. median() takes care of sorting internally, so just pass your vector as-is.

Conclusion

The median() function is a great way to understand the true center of your data, especially when your dataset contains outliers that can distort the mean. The median() function is also flexible enough to handle complexities like missing values or grouped summaries.

For next steps, I also wrote a short article on the R mean() function, if you want to take a look. And remember to take our Exploratory Data Analysis in R course to really build all the important useful job skills.


Josef Waples's photo
Author
Josef Waples

I'm a data science writer and editor with contributions to research articles in scientific journals. I'm especially interested in linear algebra, statistics, R, and the like. I also play a fair amount of chess! 

Topics

Learn R with DataCamp

Course

Introduction to R

4 hr
2.9M
Master the basics of data analysis in R, including vectors, lists, and data frames, and practice R with real data sets.
See DetailsRight Arrow
Start Course
See MoreRight Arrow
Related

Tutorial

R mean() Function: Get Started with Averages

Calculate the average of numeric, logical, and weighted data using R’s built-in mean functions. Understand how to handle missing values and apply the function to vectors and data frames.
Josef Waples's photo

Josef Waples

4 min

Tutorial

Mean vs. Median: Knowing the Difference

Explore the differences between mean and median, learn their applications in data analysis, and know how to choose the right measure for different scenarios.
Samuel Shaibu's photo

Samuel Shaibu

8 min

Tutorial

Rank Formula in Excel: A Comprehensive Guide With Examples

Learn how to rank data in Excel with RANK(), RANK.EQ(), and RANK.AVG() functions. Understand their differences, applications, and tips for accurate data analysis.
Laiba Siddiqui's photo

Laiba Siddiqui

11 min

Tutorial

R Formula Tutorial

Discover the R formula and how you can use it in modeling- and graphical functions of well-known packages such as stats, and ggplot2.
Karlijn Willems's photo

Karlijn Willems

15 min

Tutorial

How to Use na.rm to Handle Missing Values in R

We set na.rm = TRUE in common R functions to exclude missing (NA) values. This helps us compute accurate statistics and enhances the reliability of our results.
Vidhi Chugh's photo

Vidhi Chugh

8 min

Tutorial

T-tests in R Tutorial: Learn How to Conduct T-Tests

Determine if there is a significant difference between the means of the two groups using t.test() in R.
Abid Ali Awan's photo

Abid Ali Awan

10 min

See MoreSee More