Premium Project

What Makes a Pokémon Legendary?

Use tree-based machine learning methods to identify the characteristics of legendary Pokémon.

  • 11 tasks
  • 1,081 participants
  • 1,500 XP

Project Description

Not all Pokémon are created equal. Some are consigned to mediocrity, useless in battle until they reach their more evolved states. Others – like Zapdos, Articuno and Moltres – are so unique and powerful that they have officially been classified as legendary.

But what exactly makes a Pokémon the stuff of legend? In this project, we will answer that question with the help of a dataset that includes the base stats, height, weight and type of 801 Pokémon from all seven generations. Using the random forest algorithm, we will predict whether each Pokémon is legendary based on these characteristics, then rank the characteristics by how much they contribute to that classification.
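The approach described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the project's own solution: it assumes the randomForest package is installed, and the data frame `toy`, its column names and its values are synthetic stand-ins rather than the real Pokémon dataset.

```r
# Minimal sketch: fit a random forest classifier and rank variable importance.
# Assumption: the randomForest package is installed.
library(randomForest)

set.seed(42)
n <- 200

# Hypothetical stand-in for the Pokémon data: made-up columns and values.
toy <- data.frame(
  attack    = round(runif(n, 20, 180)),
  defense   = round(runif(n, 20, 180)),
  height_m  = round(runif(n, 0.2, 6), 1),
  weight_kg = round(runif(n, 1, 400), 1)
)

# Synthetic label: rows with high combined attack and defense count as legendary.
toy$is_legendary <- factor(toy$attack + toy$defense > 250)

# 60/40 training/test split.
train_idx <- sample(seq_len(n), size = 0.6 * n)
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# Fit the forest; importance = TRUE also stores permutation-based importance.
model <- randomForest(is_legendary ~ ., data = train,
                      ntree = 500, importance = TRUE)

# Share of test rows classified correctly.
pred <- predict(model, newdata = test)
accuracy <- mean(pred == test$is_legendary)
print(accuracy)

# Rank the predictors by mean decrease in Gini impurity.
imp <- importance(model)
print(imp[order(imp[, "MeanDecreaseGini"], decreasing = TRUE), ])
```

Because the synthetic label depends only on attack and defense, those two columns should rank above height and weight here. With real data the ranking is what the project sets out to discover.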

Students should be familiar with the tidyverse suite of packages, particularly ggplot2 for data visualization and dplyr for data manipulation. They should also have experience with classification problems and tree-based methods, as taught in two courses: "Supervised Learning in R: Classification" and "Machine Learning with Tree-Based Models in R".

This project uses a subset of The Complete Pokemon Dataset published on Kaggle.

Project Tasks

  1. Introduction
  2. How many Pokémon are legendary?
  3. Legendary Pokémon by height and weight
  4. Legendary Pokémon by type
  5. Legendary Pokémon by fighter stats
  6. Create a training/test split
  7. Fit a decision tree
  8. Fit a random forest
  9. Assess model fit
  10. Analyze variable importance
  11. Conclusion

Instructor

Joshua Feldman

Data Scientist at BBC

Joshua Feldman is a data scientist at the BBC, where he uses a range of machine learning techniques to solve business problems and help the organization better understand its audiences. He mainly codes in R and SQL, with a specialist interest in computational text analysis and data visualization. He holds an MSc in quantitative research methodology from the London School of Economics.

Technology

  • R

Topics

  • Data Manipulation
  • Data Visualization
  • Machine Learning
  • Importing & Cleaning Data