
project

Exploring the Evolution of Linux

Beginner
Updated 06/2024
Find out about the evolution of the Linux operating system by exploring its version control system.

Included with Premium or Teams

9 Tasks | 1,500 XP | 10,793


Project Description

Version control repositories like CVS, Subversion or Git store rich information about how a software project evolved. In this project, you'll read in, clean up and visualize a real-world Git repository dataset of the Linux kernel. With almost 700k commits and thousands of contributors (find out the exact number in this project ;-) ), you'll run into a few small data cleaning and wrangling challenges. You'll also gain insights into the development activity over the last 13 years.

For this project, you need to be familiar with pandas DataFrames, especially the read_csv and groupby functions, as well as with time series data.
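
As a rough idea of what these prerequisites look like in practice, here is a minimal sketch on a tiny in-memory log. The column names `timestamp` and `author`, the `#` separator, and the sample rows are all assumptions made for illustration; they are not taken from the project's actual dataset.

```python
import io

import pandas as pd

# Hypothetical sample: one commit per line as "<unix timestamp>#<author>".
raw = io.StringIO(
    "1502382966#Alice\n"
    "1501368308#Bob\n"
    "1501625560#Alice\n"
    "1433864542#Carol\n"
)

# The file has no header row, so the column names are supplied by hand.
git_log = pd.read_csv(raw, sep="#", header=None, names=["timestamp", "author"])

# Convert Unix epoch seconds into datetimes for time series work.
git_log["timestamp"] = pd.to_datetime(git_log["timestamp"], unit="s")

# Count commits per author with groupby, most active author first.
commits_per_author = (
    git_log.groupby("author")["timestamp"].count().sort_values(ascending=False)
)
top_contributors = commits_per_author.head(10)
print(top_contributors)
```

With a real log the same pattern applies; only the source passed to `read_csv` and the separator would change.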

Project Tasks

  1. Introduction
  2. Reading in the dataset
  3. Getting an overview
  4. Finding the TOP 10 contributors
  5. Wrangling the data
  6. Treating wrong timestamps
  7. Grouping commits per year
  8. Visualizing the history of Linux
  9. Conclusion
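
The timestamp and grouping tasks listed above might look roughly like the sketch below. It is not the project's solution: the sample rows, the variable names, and the plausibility window (`first_valid`, `last_valid`) are hypothetical values chosen only to show the pattern of filtering out implausible dates before counting commits per year.

```python
import pandas as pd

# Hypothetical commit log; the first and last rows carry implausible dates,
# as can happen when a committer's system clock was set wrongly.
git_log = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "1970-01-01 00:00:01",
                "2005-04-16 22:20:36",
                "2005-06-01 10:00:00",
                "2016-03-01 12:00:00",
                "2037-04-25 08:08:26",
            ]
        ),
        "author": ["A", "B", "C", "D", "E"],
    }
)

# Assumed window of believable commit dates (illustrative values only).
first_valid = pd.Timestamp("2005-04-16")
last_valid = pd.Timestamp("2017-12-31")

# Keep only the commits whose timestamp falls inside the window (inclusive).
in_window = git_log["timestamp"].between(first_valid, last_valid)
corrected_log = git_log[in_window]

# Group by calendar year and count the commits in each year.
commits_per_year = corrected_log.groupby(
    corrected_log["timestamp"].dt.year
)["author"].count()
print(commits_per_year)
```

If matplotlib is installed, `commits_per_year.plot(kind="bar")` would turn the yearly counts into a simple bar chart.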

Technologies

Python

Prerequisites

Intermediate Python
Manipulating Time Series Data in Python
Data Manipulation with pandas

Markus Harrer

Software Development Analyst
