Why have a portfolio project?
Finding the time and motivation to work on personal projects can be challenging. However, whether you’re in a full-time job, self-employed, or looking for work, balancing your professional life with your passion for building data science projects can be rewarding. Below are some key reasons to invest your time and effort in building a data science portfolio. To read more about why portfolio projects matter, along with best practices and examples for creating them, make sure to read this article.
Building and honing skills
Learning to code, building models, improving model accuracy, and deploying models are all part of the data science workflow. These skills are in high demand, and creating portfolio projects can be a great way to hone your skills and strengthen your knowledge in your areas of interest. Moreover, portfolio projects allow you to build skills that don’t fully align with your background, job, or specialty. If you specialize in natural language processing applications, building computer vision side projects will bring your skills to the next level. The possibilities are endless.
Showcasing your experience to recruiters
Imagine that two junior data scientists meet a recruiter. The first one says, "I know Python, Machine Learning, and MLOps," but the second one says, "I know all that too, and I applied my knowledge in this project where I scraped data, applied a machine learning model to it, and deployed it as a web app." It’s clear which candidate will stand out. Portfolio projects can establish your legitimacy as a data scientist. The more varied your portfolio, the wider the range of technical skills you can discuss with recruiters and hiring managers.
Demonstrating your soft skills
Creating a data science project portfolio showcases consistency, persistence, attention to detail, and a willingness to learn and improve continuously. These soft skills are vital in many career fields, and data science is no exception. More importantly, if you supplement technical portfolio projects with content-based projects, you’ll be able to showcase your communication and data storytelling skills, which will further set you apart as a data scientist.
Taking your first step towards entrepreneurship
Only one step separates a side project from entrepreneurship: committing to it full time. Many side projects have grown into million-dollar startups. Portfolio projects are also a great way to get started as a freelance data scientist. To learn more, read part 1 and part 2 of our guide on becoming a freelance data scientist.
5 places to host your data science portfolio
Sharing your projects with the data science community can contribute to the general knowledge base, invite collaboration, help build your brand, and get you involved in a larger conversation. That's why it's good to communicate about your projects and make sure they are visible to the greatest number of people. There are plenty of choices for hosting your data science portfolio, but these are some of the best tools and platforms that will help you display your portfolio online.
1. DataCamp Workspace
DataCamp Workspace is a collaborative cloud-based notebook that allows you to analyze data, collaborate with others, and publish your analysis instantly. Workspace enables you to write code, analyze data, and share your data insights straight from your browser. It offers 20+ preloaded datasets for you to analyze, in addition to pre-written code examples through playbook templates. Workspace supports R, Python, and SQL, and is available on any operating system. It requires zero installation and zero downloads. After creating your projects, you can share the link to your DataCamp profile so that people can instantly access them.
2. Kaggle
Kaggle is an online community platform for data scientists and machine learning enthusiasts. It allows you to collaborate with other data scientists, find and publish datasets, publish notebooks, and compete with other data scientists to solve data science challenges. Many datasets are available for those who want to try out their algorithms. The advantage of this platform is that the data is relatively well structured and cleaned, which makes it a great place to get a feel for working on data science projects. After registering, you can browse through the different competitions in progress in several categories:
- Long-lasting competitions for beginners are a great resource to get you started. You can apply your knowledge and use them to practice what you learned.
- Time-limited competitions for swag or fame are one step above beginner difficulty.
- Time-limited competitions with prizes can be more challenging. They are usually sponsored by external organizers such as Netflix, Google, and more.
Joining these competitions is a great way to improve your technical skills. To showcase your work, you'll need a notebook (Kernel) that explains the ins and outs of your project in detail so that as many people as possible can understand it.
Because the platform has so many participants, winning prizes can be hard for a beginner. However, participating in competitions and publishing notebooks lets you develop your skills, accumulate points, and climb the ranks. Reaching the top rank of Grandmaster on Kaggle would likely open doors in your career as a data scientist. You can read this complete guide about Kaggle for more information.
3. GitHub
At a high level, GitHub is a website and cloud service that enables developers to store and manage their code repositories and to track changes to them. To understand what GitHub is, you need to know two related concepts: version control, which records changes to your projects over time so that you can recall specific versions later, and Git, a widely used version control system. You can check out this guide to learn more about Git. The platform allows users to collaborate on or publish open-source projects, fork and share code, and track issues. Setting up a GitHub account and hosting your portfolio using GitHub Pages is easy and free. Just follow these steps:
- Create a GitHub account.
- Learn how to use Git and GitHub. You can find detailed explanations of Git and GitHub by following this tutorial or the Introduction to Git course.
- Upload your site to GitHub Pages by following these steps:
  - Open GitHub and create a new public repository named username.github.io, where username is your username (or organization name) on GitHub.
  - If you are not familiar with Git commands, you can simply download GitHub Desktop to use Git and GitHub on macOS and Windows.
  - After you finish the installation, go to GitHub.com and refresh the page. Click the "Set up in Desktop" button. When the GitHub Desktop app opens, save the project.
  - Open your text editor, and create an index.html for your project.
  - Commit your changes and press publish.
  - Give your website a theme. A Bootstrap theme or an HTML/CSS template will work, but a WordPress theme will not.
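As a rough illustration of the index.html step above, the following Python sketch generates a bare-bones landing page that lists portfolio projects. The project titles, URLs, the owner name, and the `build_index_html` helper are all hypothetical placeholders rather than anything GitHub prescribes; GitHub Pages only needs an index.html file to exist in the repository.

```python
from html import escape
from pathlib import Path

# Hypothetical portfolio entries: swap in your own titles, links, and summaries.
PROJECTS = [
    {
        "title": "Example project one",
        "url": "https://example.com/project-one",
        "summary": "Scraped data, trained a model, and deployed it as a web app.",
    },
    {
        "title": "Example project two",
        "url": "https://example.com/project-two",
        "summary": "Explored a public dataset and wrote up the findings.",
    },
]


def build_index_html(owner, projects):
    """Return a minimal HTML page that lists each project as a link plus a summary."""
    items = "\n".join(
        f'      <li><a href="{escape(p["url"], quote=True)}">{escape(p["title"])}</a>: '
        f'{escape(p["summary"])}</li>'
        for p in projects
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "  <head>\n"
        '    <meta charset="utf-8">\n'
        f"    <title>{escape(owner)}: data science portfolio</title>\n"
        "  </head>\n"
        "  <body>\n"
        f"    <h1>{escape(owner)}</h1>\n"
        "    <ul>\n"
        f"{items}\n"
        "    </ul>\n"
        "  </body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Writes index.html into the current directory (your local repository folder).
    Path("index.html").write_text(build_index_html("Your Name", PROJECTS), encoding="utf-8")
```

Running the script inside your local username.github.io folder produces an index.html that you can then commit and publish as described in the steps. Escaping the text with `html.escape` keeps characters such as `&` in a project title from breaking the page.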
An effective way to share your projects is to use a platform like GitHub. After creating your GitHub account, you can start posting your projects there. On GitHub, each of your projects should have a README.md file that your users can easily read. Coders often forget this step, yet it is crucial: without a README.md, it's much harder for the reader to understand what the project is about.
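To make the README.md advice concrete, here is a small Python sketch that writes a starter README for a project. The section names (Data, Approach, How to run), the sample values, and the `build_readme` helper are assumptions chosen for illustration; GitHub does not require any particular README layout.

```python
from pathlib import Path


def build_readme(title, summary, data_source, steps, run_command):
    """Return README.md text with a summary, data source, approach, and run instructions."""
    lines = [f"# {title}", "", summary, "", "## Data", "", data_source, "", "## Approach", ""]
    lines += [f"{number}. {step}" for number, step in enumerate(steps, start=1)]
    # Markdown renders a line indented by four spaces as a code block.
    lines += ["", "## How to run", "", f"    {run_command}", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical project details: replace them with your own.
    readme = build_readme(
        title="Example project",
        summary="A one-paragraph description of the question this project answers.",
        data_source="Where the data came from and how to obtain it.",
        steps=["Collect the data", "Train the model", "Deploy the web app"],
        run_command="python main.py",
    )
    Path("README.md").write_text(readme, encoding="utf-8")
```

Even a short README along these lines tells a visitor what the project does, where the data came from, and how to run it.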
4. Personal Website
Having a blog or a personal website is also an excellent way to centralize your projects, especially since it’s relatively straightforward to set up a website without a large budget. If you decide to go this route, WordPress is a great place to start, though another CMS like Strikingly or Wix will do the job just fine. While it can be harder to get eyes on your project than when hosting it on a site like DataCamp Workspace or Kaggle, hosting your own site allows for more control and customization. Plus, if you work hard to optimize your SEO, you can appear quite high in Google searches.
5. Medium (and social networks)
It is important to communicate about your projects as much as possible. For content-based portfolio projects, there are blogging platforms you can use in addition to your own personal website. Medium is one of the best platforms to reach a wider audience with your projects. Moreover, posting on social networks such as Quora, LinkedIn, Twitter, and Reddit can help solidify your legitimacy as a data scientist and give your projects more visibility.
Having a solid data science portfolio can be a game-changer. It's a chance to learn new capabilities and sharpen existing ones. Pursuing portfolio projects can help you build new skills, gain recruiters’ attention, and possibly generate new sources of income by helping you start your freelance journey. Showcasing projects you've worked on will differentiate you from other data scientists, so spend some time honing your portfolio; the return on investment is worth the effort.