Gender Bias in Graduate Admissions

Analyze admissions data from UC Berkeley and find out if the university was biased against women.

  • 9 tasks
  • 694 participants
  • 1,500 XP

Project Description

If you were a man applying to Berkeley's graduate school in 1973, you were almost twice as likely to be admitted as your female peers. On the surface, this seems to have been a flagrant case of gender discrimination. However, a closer inspection of the data reveals that women were more likely to apply to departments with lower overall admission rates, and this difference in application patterns accounts for the gap between the sexes.
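The comparison described above can be sketched in a few lines of base R. This is only an illustration, not the project's own solution: it assumes that R's built-in `UCBAdmissions` table, which covers the six largest departments, is a reasonable stand-in for the data used in the project.

```r
# UCBAdmissions ships with base R (datasets package) as a 3-way table with
# dimensions Admit (1), Gender (2) and Dept (3).
# Assumption: this six-department subset stands in for the project's data.

# Collapse over department: counts by admission outcome and gender.
by_gender <- margin.table(UCBAdmissions, c(1, 2))

# Share admitted within each gender (each column sums to 1).
overall_rate <- prop.table(by_gender, margin = 2)["Admitted", ]
print(round(overall_rate, 3))

# Share admitted within each gender-by-department cell.
dept_rate <- prop.table(UCBAdmissions, margin = c(2, 3))["Admitted", , ]
print(round(t(dept_rate), 3))

# Where each gender's applications went: share of applicants per department.
applicants <- margin.table(UCBAdmissions, c(2, 3))
app_share <- prop.table(applicants, margin = 1)
print(round(t(app_share), 3))
```

On this subset the pooled admission rate is higher for men, yet women have the higher rate in four of the six departments. The last table shows why: men's applications are concentrated in departments A and B, which admit a large share of applicants, while women's are concentrated in the more selective departments.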

The Berkeley problem is a classic example of Simpson's paradox – an important concept in statistics where an effect disappears or even reverses when you control for other factors. Knowledge of this concept can prove critical in areas such as education policy, human resources, or any other field where bias or discrimination is a concern.
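One way to see the effect of "controlling for other factors" is to fit a logistic regression with and without department as a predictor. The sketch below again assumes the built-in `UCBAdmissions` table as a stand-in for the project's data, and the model names `fit_gender` and `fit_dept` are illustrative rather than taken from the project.

```r
# Convert the built-in 3-way table to one row per Admit/Gender/Dept cell,
# with the cell count in the Freq column.
admissions <- as.data.frame(UCBAdmissions)

# glm() treats the first factor level as "failure", so make "Rejected" the
# reference level; the model then predicts the log-odds of admission.
admissions$Admit <- relevel(admissions$Admit, ref = "Rejected")

# Model 1: gender only. Freq is used as a frequency weight for each cell.
fit_gender <- glm(Admit ~ Gender, family = binomial,
                  weights = Freq, data = admissions)

# Model 2: gender plus department, i.e. comparing men and women who
# applied to the same department.
fit_dept <- glm(Admit ~ Gender + Dept, family = binomial,
                weights = Freq, data = admissions)

# "Male" is the reference level of Gender, so the coefficient is named
# GenderFemale. exp() turns the log-odds coefficient into an odds ratio.
odds_ratio_gender <- exp(coef(fit_gender)["GenderFemale"])
odds_ratio_dept <- exp(coef(fit_dept)["GenderFemale"])
print(round(c(gender_only = unname(odds_ratio_gender),
              with_dept = unname(odds_ratio_dept)), 2))
```

In the gender-only model the odds ratio for women is below 1, meaning lower odds of admission than men. Once department is included, it moves above 1, which is the reversal that characterizes Simpson's paradox.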

Students should be familiar with common data structures in R, as taught in DataCamp's Introduction to R course, and have some understanding of logistic regression, as taught in Multiple and Logistic Regression. They should also have experience with the tidyverse suite of packages, particularly dplyr and ggplot2, which can be gained in Introduction to the Tidyverse.

Project Tasks

  1. Welcome to Berkeley
  2. Acceptance rate for men and women
  3. Visualizing the discrepancy
  4. Acceptance rate by department
  5. Alternative explanations
  6. Binary logistic regression: part I
  7. Binary logistic regression: part II
  8. Behold Simpson's paradox
  9. Bias or discrimination?
Joshua Feldman

Data Scientist at BBC

Joshua Feldman is a data scientist at the BBC, where he uses a host of machine learning techniques to answer business problems and help the organization better understand its audiences. He mainly codes in R and SQL, taking a specialist interest in computational text analysis and data visualization. He holds an MSc in quantitative research methodology from the London School of Economics.


  • R
  • Topics

    Data Manipulation, Data Visualization, Probability & Statistics, Importing & Cleaning Data