Classify Song Genres from Audio Data

Rock or rap? Apply machine learning methods in Python to classify songs into genres.

Using a dataset of songs from two music genres (Hip-Hop and Rock), you will train a classifier to distinguish between the two genres based only on track information derived from The Echo Nest (now part of Spotify). You will first use the pandas and seaborn packages in Python to subset the data, aggregate information, and create plots, exploring the data for obvious trends or factors you should be aware of before applying machine learning.
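As a rough illustration of that exploration step, here is a minimal pandas sketch. The project's data is not reproduced here, so the frame below is randomly generated, and the column names (`genre_top`, `danceability`, and so on) are assumptions chosen to mirror the features named on this page.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the track data. The column names and values are
# assumptions for illustration, not the project's real dataset.
rng = np.random.default_rng(0)
n_tracks = 200
feature_cols = ["danceability", "energy", "acousticness", "tempo"]
tracks = pd.DataFrame({
    "genre_top": rng.choice(["Rock", "Hip-Hop"], size=n_tracks, p=[0.8, 0.2]),
    "danceability": rng.uniform(0, 1, n_tracks),
    "energy": rng.uniform(0, 1, n_tracks),
    "acousticness": rng.uniform(0, 1, n_tracks),
    "tempo": rng.normal(120, 25, n_tracks),
})

# Aggregate: how many tracks per genre, and the mean of each feature per genre.
genre_counts = tracks["genre_top"].value_counts()
genre_means = tracks.groupby("genre_top")[feature_cols].mean()

# Pairwise Pearson correlations between the continuous features.
corr = tracks[feature_cols].corr()

print(genre_counts)
print(genre_means.round(2))
print(corr.round(2))
```

A correlation matrix like `corr` is the kind of table that could then be passed to a seaborn heatmap to spot strongly related features at a glance.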

Next, you will use the scikit-learn package to test whether a song's genre can be correctly classified from features such as danceability, energy, acousticness, and tempo. You will work through implementations of common algorithms such as PCA, logistic regression, and decision trees.
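The modeling workflow described above can be sketched as follows. This is not the project's own solution: the data comes from scikit-learn's `make_classification` as a stand-in for the track features, and the number of PCA components, the class weights, and the random seeds are all arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data standing in for the audio features (assumption).
# weights=[0.8, 0.2] mimics one genre being much more common than the other.
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=5, n_redundant=2,
    weights=[0.8, 0.2], random_state=10,
)

# Hold out a test set, keeping the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=10
)

# Each pipeline scales the features, reduces them with PCA, then classifies.
# n_components=6 is an arbitrary choice for this sketch.
models = {
    "decision tree": make_pipeline(
        StandardScaler(), PCA(n_components=6),
        DecisionTreeClassifier(random_state=10),
    ),
    "logistic regression": make_pipeline(
        StandardScaler(), PCA(n_components=6),
        LogisticRegression(random_state=10),
    ),
}

# Fit on the training split only and report accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Wrapping the scaler and PCA in a pipeline means both are fitted on the training split only, which avoids leaking information from the test set into the preprocessing.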

  1. Preparing our dataset
  2. Pairwise relationships between continuous variables
  3. Splitting our data
  4. Normalizing the feature data
  5. Principal Component Analysis on our scaled data
  6. Further visualization of PCA
  7. Projecting on to our features
  8. Train a decision tree to classify genre
  9. Compare our decision tree to a logistic regression
  10. Balance our data for greater performance
  11. Does balancing our dataset improve model bias?
  12. Using cross-validation to evaluate our models
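
The last three tasks (balancing the classes, then evaluating with cross-validation) might look something like the sketch below. The data is again synthetic, the column names are assumptions, and downsampling the larger class is only one possible way to balance a dataset; the project itself may take a different approach.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, deliberately imbalanced data (assumption): 800 "Rock" tracks
# and 200 "Hip-Hop" tracks with slightly shifted feature distributions.
rng = np.random.default_rng(10)
features = ["danceability", "energy", "acousticness", "tempo"]
rock = pd.DataFrame(rng.normal(0.0, 1.0, (800, 4)), columns=features)
hiphop = pd.DataFrame(rng.normal(0.7, 1.0, (200, 4)), columns=features)
rock["genre_top"] = "Rock"
hiphop["genre_top"] = "Hip-Hop"

# Balance the classes by downsampling the larger one to the smaller one's size.
rock_sampled = rock.sample(n=len(hiphop), random_state=10)
balanced = pd.concat([rock_sampled, hiphop], ignore_index=True)

X = balanced[features]
y = balanced["genre_top"]

# 10-fold cross-validation; shuffling matters because the rows are
# currently ordered by genre.
kf = KFold(n_splits=10, shuffle=True, random_state=10)
model = make_pipeline(StandardScaler(), LogisticRegression())
cv_scores = cross_val_score(model, X, y, cv=kf)

print(balanced["genre_top"].value_counts())
print(f"mean CV accuracy: {cv_scores.mean():.3f}")
```

Averaging accuracy over several folds gives a steadier estimate than a single train/test split, since no one lucky or unlucky split dominates the result.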


Lina Tran

PhD Candidate at University of Toronto

Lina studies learning and memory in the Frankland Lab at the Hospital for Sick Children/University of Toronto.
Joel Östblom

PhD Candidate at University of Toronto

Joel is a PhD student in Biomedical Engineering at the University of Toronto, where he uses computational and experimental approaches to better understand fundamental stem cell decisions. Outside school, he enjoys playing ice hockey, eating and making food, being in nature, and figuring out how he can maximize the time he spends inside vim.
Ahmed Hasan

PhD Candidate at University of Toronto

Ahmed Hasan is a PhD student in the Department of Cell and Systems Biology at the University of Toronto. An active user of both R and Python, his research focuses on understanding how genetic recombination affects how genomes evolve.
