Competition

Cleaning data and the skies
You are a data analyst at an environmental company. Your task is to evaluate ozone pollution across various regions. You’ve obtained data from the U.S. Environmental Protection Agency (EPA), containing daily ozone measurements at monitoring stations across California. However, like many real-world datasets, it’s far from clean: there are missing values, inconsistent formats, potential duplicates, and outliers. Before you can provide meaningful insights, you must clean and validate the data. Only then can you analyze it to uncover trends, identify high-risk regions, and assess where policy interventions are most urgently needed.
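As a rough illustration of the cleaning steps described above, here is a minimal, dependency-free sketch. The column names `site_id`, `date`, and `ozone_ppm`, the two date layouts, and the 1.5 × IQR outlier rule are all assumptions made for the example; the actual dataset may use different names and formats, and a different outlier rule may suit it better.

```python
import math
from datetime import datetime
from statistics import quantiles

# Hypothetical date layouts; extend to match what the real file contains.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return a date for the first matching layout, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def parse_ozone(text):
    """Return the reading as a float, or None if missing or not numeric."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(value) else value


def clean_rows(rows):
    """Drop unusable and duplicate readings, then flag IQR outliers."""
    seen = set()
    cleaned = []
    for row in rows:
        day = parse_date(row.get("date") or "")
        ozone = parse_ozone(row.get("ozone_ppm"))
        if day is None or ozone is None:
            continue  # missing or unparseable date or reading
        key = (row.get("site_id"), day)
        if key in seen:
            continue  # keep only the first reading per site and day
        seen.add(key)
        cleaned.append(
            {"site_id": row.get("site_id"), "date": day, "ozone_ppm": ozone}
        )

    if len(cleaned) >= 4:
        q1, _, q3 = quantiles([r["ozone_ppm"] for r in cleaned], n=4)
        low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    else:
        low, high = -math.inf, math.inf  # too few points for quartiles
    for r in cleaned:
        r["is_outlier"] = not (low <= r["ozone_ppm"] <= high)
    return cleaned
```

The sketch flags outliers instead of deleting them, since an unusually high reading may be a genuine pollution episode and not a measurement error. Rows could be fed in from `csv.DictReader`, for example.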
Prize
$500 GIFT CARD
Your challenge
- Your EDA and data cleaning process.
- How does daily maximum 8-hour ozone concentration vary over time and regions?
- Are there any areas that consistently show high ozone concentrations? Do different methods report different ozone levels?
- Consider whether urban activity (weekend vs. weekday) has any effect on ozone levels.
- Bonus: plot a geospatial heatmap showing any high ozone concentrations.
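One possible starting point for the weekend vs. weekday question is sketched below. It assumes the readings have already been cleaned into `(date, value)` pairs; the function name `mean_by_day_type` is hypothetical, and a plain comparison of means is only a first look, not a statistical test.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def mean_by_day_type(readings):
    """Average ozone for weekdays and weekends from (date, value) pairs."""
    groups = defaultdict(list)
    for day, ozone in readings:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        label = "weekend" if day.weekday() >= 5 else "weekday"
        groups[label].append(ozone)
    return {label: mean(values) for label, values in groups.items()}


# Illustrative values only, not taken from the competition data.
sample = [
    (date(2024, 1, 1), 0.030),  # Monday
    (date(2024, 1, 2), 0.040),  # Tuesday
    (date(2024, 1, 6), 0.050),  # Saturday
    (date(2024, 1, 7), 0.070),  # Sunday
]
summary = mean_by_day_type(sample)
```

The same grouping idea extends to regions or months by changing how the label is built from each record.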
Prizes
1st
$500 or a donation to a charitable cause of your choice
2nd
$400 or a donation to a charitable cause of your choice
3rd
$300 or a donation to a charitable cause of your choice
4th
$200 or a donation to a charitable cause of your choice
5th
$200 or a donation to a charitable cause of your choice
How to get started
Create your most insightful analysis using DataLab, our in-browser tool to write, run, and publish data analyses. Once you’ve finished your work, you’ll need to publish it for review.
Judging criteria
Recommendations (35%)
- Clarity of recommendations - how clear and well presented the recommendation is.
- Quality of recommendations - are appropriate analytical techniques used & are the conclusions valid?
- Quality of the executive summary.
Storytelling (35%)
- How well the data and insights are connected to the recommendation.
- How well the narrative and the report as a whole fit together.
- How well the report balances depth with concision.
Visualizations (20%), if applicable
- Appropriateness of visualization used.
- Clarity of insight from visualization.
Public upvotes (10%)
- Upvoting - most upvoted entries get the most points.
Rules
- Entries to the competition take the form of a workbook publication. Make sure the competition publication is publicly visible in order to be entered into the competition.
- Your publication should be focused on data provided within the competition.
- The competition is open and free to registered DataCamp users.
- Only one entry per user. You may update your entry up to the deadline.
- Make sure your competition workbook is published by the competition deadline in order for it to be valid.
- You can check the time left to submit on the counter at the top of this page.
Note: Please make sure you're 18+ years old and are allowed to take part in a skill-based competition from your country.