Project Description

Apply data-wrangling and visualization tools from the tidyverse to musical data. Find the most common chords and chord progressions in a sample of pop/rock music from the 1950s-1990s, and compare the styles of different artists. This project assumes familiarity with standard tidyverse tools for R, in particular the tibble data structure and the dplyr and ggplot2 packages. No specific musical knowledge is required, though some musical background may give you ideas for further exploration of the dataset after completing the project.

This project uses a parsed and cleaned version of the McGill Billboard Dataset, version 2.0 (CC0 license).

Project Tasks

  • 1 Introduction
  • 2 The most common chords
  • 3 Visualizing the most common chords
  • 4 Chord "bigrams"
  • 5 Visualizing the most common chord progressions
  • 6 Finding the most common artists
  • 7 Tagging the corpus
  • 8 Comparing chords in piano-driven and guitar-driven songs
  • 9 Comparing chord bigrams in piano-driven and guitar-driven songs
  • 10 Conclusion
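
As a rough illustration of the counting tasks listed above, the following is a minimal sketch rather than the project's own solution. It assumes a tibble with one row per chord in song order; the `songs` table, its `title` and `chord` columns, and the chord labels are hypothetical stand-ins, not the actual layout of the parsed McGill Billboard data.

```r
library(dplyr)

# Toy stand-in for the parsed corpus: one row per chord, in song order.
# The column names (title, chord) and values are assumptions for illustration.
songs <- tibble(
  title = c(rep("Song A", 5), rep("Song B", 4)),
  chord = c("C:maj", "G:maj", "A:min", "F:maj", "C:maj",
            "G:maj", "D:maj", "G:maj", "D:maj")
)

# Most common chords: count occurrences and sort by frequency.
chord_counts <- songs %>%
  count(chord, sort = TRUE)

# Chord "bigrams": pair each chord with the one that follows it
# within the same song, then count the pairs.
bigram_counts <- songs %>%
  group_by(title) %>%
  mutate(next_chord = lead(chord)) %>%
  ungroup() %>%
  filter(!is.na(next_chord)) %>%   # the last chord of a song has no successor
  mutate(bigram = paste(chord, next_chord)) %>%
  count(bigram, sort = TRUE)
```

Grouping by song before calling `lead()` keeps the last chord of one song from being paired with the first chord of the next. The resulting count tables could then be passed to ggplot2 for the visualization tasks.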
Kris Shaffer

Data scientist and instructional technology specialist

Kris Shaffer, Ph.D., is a data scientist and instructional technology specialist at the University of Mary Washington. He also does freelance work in web intelligence and analytics, and has an academic background in music theory and the digital humanities. You can find him on the web at
