
Creating Robust Workflows in Python

Learn to develop a set of principles for your data science and software development projects.

4 Hours | 16 Videos | 47 Exercises | 4,578 Learners | 3900 XP


Course Description

The decisions we make in life are guided by our principles. No one is born with a life philosophy; instead, everyone creates their own over time. In this course, you will develop a set of principles for your data science and software development projects. These principles will save time, prevent frustration, and build your confidence as a data scientist and software developer. In addition to best practices in the Python programming language, you will learn to leverage hidden gems in the Python standard library and well-known tools from Python's excellent ecosystem, such as pandas and scikit-learn. The time you invest in this course will yield dividends for you and others throughout your career. Your colleagues, community members, and future self will thank you.

  1. Python Programming Principles


    In this chapter, we will discuss three principles that guide decisions made by Python programmers. You will put these principles into practice in the coding exercises and throughout the rest of the course!

    Don't repeat yourself (50 xp)
    Functions and iteration (100 xp)
    Find matches (100 xp)
    Dataset dimensions (100 xp)
    50 xp
    Extract words (100 xp)
    Most frequent words (100 xp)
    50 xp
    Instance method (100 xp)
    Class method (100 xp)
  2. Documentation and Tests

    Documentation and tests are often overlooked, despite being essential to the success of all projects. In this chapter, you will learn how to include documentation in your code and practice Test-Driven Development (TDD), a process that puts tests first!

  3. Shell superpowers

    Shell scripting is an essential part of any Python workflow. In this chapter, you will learn how to build command-line interfaces (CLIs) for Python programs and to automate common tasks related to version control, virtual environments, and Python packaging.

  4. Projects, pipelines, and parallelism

    In the final chapter of this course, you will learn how to facilitate and standardize project setup using project templates. You will also consider the benefits of zipped executable projects, Jupyter notebook parameterization, and parallel computing.

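To give a flavor of the techniques named in the chapter outline above, here is a minimal sketch that is not taken from the course materials. It combines a reusable function documented with a doctest, a class with an instance method and a class method, and a small command-line interface built with `argparse`. The names `most_frequent_words`, `TextFile`, `build_parser`, and `main` are hypothetical and chosen only for illustration.

```python
import argparse
from collections import Counter


def most_frequent_words(text, n=3):
    """Return the n most common lowercase words in text.

    >>> most_frequent_words("a b a c a b", n=2)
    [('a', 3), ('b', 2)]
    """
    # One function holds the counting logic so callers never repeat it.
    return Counter(text.lower().split()).most_common(n)


class TextFile:
    """Hypothetical wrapper around a piece of text (illustration only)."""

    def __init__(self, text):
        self.text = text

    @classmethod
    def from_path(cls, path):
        # Class method: an alternative constructor that reads a file.
        with open(path, encoding="utf-8") as handle:
            return cls(handle.read())

    def top_words(self, n=3):
        # Instance method: reuses the function above instead of copying it.
        return most_frequent_words(self.text, n)


def build_parser():
    # Command-line interface: one positional path and an optional count.
    parser = argparse.ArgumentParser(
        description="Print the most frequent words in a text file."
    )
    parser.add_argument("path", help="text file to analyze")
    parser.add_argument("-n", type=int, default=3, help="number of words")
    return parser


def main(argv=None):
    # argv=None makes argparse fall back to the real command-line arguments.
    args = build_parser().parse_args(argv)
    for word, count in TextFile.from_path(args.path).top_words(args.n):
        print(f"{word}\t{count}")
```

If this sketch were saved as a module, `python -m doctest` followed by the file name would check the example in the docstring, and calling `main()` with a list such as `["notes.txt", "-n", "5"]` (a hypothetical file name) would print the five most frequent words in that file.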


Chester Ismay, Sara Billen


Python Data Science Toolbox (Part 2)
Supervised Learning with scikit-learn

Martin Skarzynski

Co-Chair, Foundation for Advanced Education in the Sciences (FAES)

Martin Skarzynski is passionate about Bioinformatics, Data Science, Epidemiology, and Statistical Computing. Martin is Co-Chair of the Bioinformatics and Data Science Department at the Foundation for Advanced Education in the Sciences (FAES), where he has been an instructor since 2015 and currently teaches Introduction to Python (BIOF309) and Applied Machine Learning (BIOF509). Martin is also an instructor for Software and Data Carpentry, non-profit organizations that teach computational skills. As a National Cancer Institute (NCI) Cancer Prevention Fellow, Martin uses the Python and R programming languages and command line tools to explore, analyze, visualize, and present data. Martin has a strong interest in reproducibility, scientific publishing workflows, and open data/science best practices, and is excited to apply his computational skills in combination with his science background to the study and prevention of cancer.
