Premium Project

Planning Public Policy in Argentina

Apply unsupervised learning techniques to help plan an education program in Argentina.

  • 12 tasks
  • 883 participants
  • 1,500 XP

Project Description

As statistics and machine learning methods become more pervasive, policymakers increasingly use them when deciding how to allocate public resources. In this project, you will analyze economic and social development indicators for 22 provinces of Argentina to help plan an education program.

This project assumes you can manipulate data frames using dplyr, make plots using ggplot2, and understand principal component analysis (PCA) and k-means clustering. You can learn these skills from DataCamp's Introduction to the Tidyverse and Unsupervised Learning in R.

The data for this project was published by INDEC and includes indicators such as poverty, population, and GDP for each province.
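
As a rough illustration of the kind of workflow the description refers to (PCA followed by k-means clustering), here is a minimal sketch in base R. It does not use the project's INDEC data: the data frame `provinces`, its column names (`gdp`, `poverty`, `population`), the placeholder province names, and the choice of 4 clusters are all assumptions made up for this example, and the values are randomly generated.

```r
# Synthetic stand-in data: these are NOT the INDEC figures used in the project.
# Province names, column names, and distributions are hypothetical.
set.seed(42)
n <- 22
provinces <- data.frame(
  province   = sprintf("province_%02d", seq_len(n)),
  gdp        = rlnorm(n, meanlog = 10, sdlog = 1),
  poverty    = runif(n, min = 5, max = 40),
  population = rlnorm(n, meanlog = 13, sdlog = 1)
)

# Build a numeric matrix for PCA, using province names as row names.
indicators <- as.matrix(provinces[, c("gdp", "poverty", "population")])
rownames(indicators) <- provinces$province

# PCA on centered and scaled indicators, so that variables measured
# on very different scales contribute comparably.
pca <- prcomp(indicators, center = TRUE, scale. = TRUE)

# Keep the first two principal components as a reduced representation.
components <- pca$x[, 1:2]

# k-means on the reduced data; 4 clusters is an arbitrary choice here.
# nstart = 20 reruns the algorithm from several random starts.
clusters <- kmeans(components, centers = 4, nstart = 20)

# Combine province names, component scores, and cluster labels.
result <- data.frame(
  province = provinces$province,
  components,
  cluster  = factor(clusters$cluster)
)
head(result)
```

With real indicators, `summary(pca)` shows how much variance each component explains, which can help decide how many components to keep before clustering.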

Project Tasks

  1. Provinces of Argentina
  2. Most populous, richest provinces
  3. A matrix for PCA
  4. Reducing dimensions
  5. PCA: Variables & Components
  6. Plotting the components
  7. Cluster using K means
  8. Components with colors
  9. Buenos Aires, in a league of its own
  10. The rich provinces
  11. The poor provinces
  12. Planning for public policy

Rafael La Buonora

Data Scientist at Transforma Uruguay

With a background in economics, Rafael works as a data scientist at Transforma Uruguay, assessing the impact of public policy on the welfare of Uruguay's citizens.

  • Language: R
  • Topics: Data Manipulation, Data Visualization, Machine Learning, Importing & Cleaning Data