Live-Coding Recap: Kaggle Competition with Machine Learning

Recap post outlining all the resources from the second live-coding session!

We had an amazing turnout for the second Facebook live-coding event! Thanks to everyone who joined Hugo Bowne-Anderson in December in making several submissions to Kaggle's infamous Titanic Machine Learning Competition. For those who missed it, this recap collects the useful links and notebooks to take you from zero to one with machine learning in Python.

This live code-along session covers how to build an algorithm that predicts whether a given passenger on the Titanic survived, based on data such as the fare they paid, where they embarked and their age. Hugo shows you how to do so using the Python programming language, Jupyter notebooks and state-of-the-art packages such as pandas, scikit-learn and seaborn. Dive into this rich dataset and build your chops in exploratory data analysis, data munging and cleaning, and machine learning. No previous experience with machine learning is necessary, just a will to learn!
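To give a feel for the workflow described above, here is a minimal sketch of the clean, encode, fit, predict loop with pandas and scikit-learn. It is not Hugo's notebook code: the rows are hypothetical toy values that only borrow Titanic-style column names, the `prepare` helper and `FEATURES` list are made up for illustration, and the decision tree is just one reasonable choice of model.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy rows with Titanic-style column names (not the real Kaggle data).
train = pd.DataFrame({
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 8.46, 51.86, 21.08],
    "Age": [22.0, 38.0, 26.0, 35.0, None, 30.0, 54.0, 2.0],
    "Embarked": ["S", "C", "S", "S", "S", "Q", "S", None],
    "Sex": ["male", "female", "female", "female", "male", "male", "male", "male"],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 0],
})
test = pd.DataFrame({
    "Fare": [7.83, None, 9.69],
    "Age": [34.5, 47.0, None],
    "Embarked": ["Q", "S", "Q"],
    "Sex": ["male", "female", "male"],
})

# Illustrative feature list; the real session may use different columns.
FEATURES = ["Fare", "Age", "is_female", "Embarked_C", "Embarked_Q", "Embarked_S"]


def prepare(df, medians, embarked_mode):
    """Fill missing values and turn text columns into numeric features."""
    out = df.copy()
    out["Age"] = out["Age"].fillna(medians["Age"])
    out["Fare"] = out["Fare"].fillna(medians["Fare"])
    out["Embarked"] = out["Embarked"].fillna(embarked_mode)
    out["is_female"] = (out["Sex"] == "female").astype(int)
    for port in ["C", "Q", "S"]:
        out["Embarked_" + port] = (out["Embarked"] == port).astype(int)
    return out[FEATURES]


# Learn the fill values from the training data only, then reuse them on the test data.
medians = train[["Age", "Fare"]].median()
embarked_mode = train["Embarked"].mode()[0]

X_train = prepare(train, medians, embarked_mode)
y_train = train["Survived"]
X_test = prepare(test, medians, embarked_mode)

# A shallow tree keeps the model easy to inspect and limits overfitting.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions)
```

With the real competition files, the same steps would start from the CSVs in the repo instead of the hand-typed frames, and the predictions would be written out alongside each passenger's ID for submission.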

This code-along session is meant for beginners and intermediates alike, though some programming fundamentals and Python basics (e.g., variables, for loops) will help. Hugo uses Jupyter Notebooks and the terminal. If you’re not super familiar with these tools, never fear! We'll link all the resources you need to complete this project.

You'll learn how to put all of these tools together to produce informative figures.

Feel free to code along or just watch Hugo do his thing! If you do wish to code, please follow the setup instructions before starting the video. Among other things, you’ll need to clone the repo and download the Anaconda Distribution for Python 3.6 if you haven’t already.

Now you are ready to watch Hugo and follow along. The video lasts about 1 hour and 40 minutes. You can skip to the 10-minute mark, where Hugo comes on and the session actually starts.

You can find 3 detailed tutorials showing the various submissions with the solutions here:

Can you improve on Hugo's submissions? Let us know how high you can rank on Kaggle!

Help us make these sessions better by sharing feedback on Twitter to @DataCamp & @hugobowne.

See you in the next live coding session!