Paul Love has completed

Parallel Computing in R

4 hours
4,250 XP

Course Description

With more data and more complex algorithms available to scientists and practitioners today, parallel processing is often essential, and it is expected in packages that implement time-consuming methods. This course introduces the concepts and tools available in R for parallel computing and offers solutions to several important, non-trivial issues in parallel processing, such as reproducibility, random number generation, and load balancing.

  1. Can I Run My Application in Parallel?

    To take advantage of a parallel environment, an application needs to be split into pieces. In this introductory chapter, you will learn about different ways of partitioning a problem and how they fit different hardware configurations. You will also be introduced to various R packages that support parallel programming.

    Partitioning problems into independent pieces (50 XP)
    Partitioning demographic model (50 XP)
    Partitioning probabilistic demographic model (50 XP)
    Find the most frequent words in a text (100 XP)
    Models of parallel computing (50 XP)
    A simple embarrassingly parallel application (100 XP)
    Probabilistic projection of migration (setup) (50 XP)
    Probabilistic projection of migration (100 XP)
    R packages for parallel computing (50 XP)
    Passing arguments via clusterApply() (100 XP)
    Sum in parallel (100 XP)
    More tasks than workers (100 XP)
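
As a rough illustration of what splitting an application into pieces can look like, here is a minimal sketch that partitions a sum across two workers using the base `parallel` package. The data (`1:100`), the worker count, and the variable names are assumptions chosen for illustration; they are not taken from the course exercises.

```r
library(parallel)

# Illustrative data: the integers 1..100 (an assumption, not course data).
x <- 1:100

# Start a small local cluster; 2 workers is an arbitrary choice.
cl <- makeCluster(2)

# Split the index range into one contiguous chunk per worker.
chunks <- splitIndices(length(x), length(cl))

# Each worker sums its own chunk; v is passed to every call as an extra argument.
partial <- clusterApply(cl, chunks, function(idx, v) sum(v[idx]), v = x)

stopCluster(cl)

# Combine the partial sums on the master process.
total <- Reduce(`+`, partial)
print(total)
```

The pieces here are independent of one another, which is what makes the problem straightforward to run in parallel: only the final combination step happens sequentially.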
  2. The parallel Package

    This chapter dives deeper into the parallel package. You'll learn about the various backends and how they differ, and you'll gain a thorough understanding of the package's workhorse, the clusterApply() function. The chapter also discusses strategies for task segmentation and their pitfalls.
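
For orientation, a call to `clusterApply()` might look like the sketch below. It assumes a local socket cluster with two workers; the functions being applied are made up for illustration and do not come from the course.

```r
library(parallel)

# Start a local socket (PSOCK) cluster; 2 workers is an arbitrary choice.
cl <- makeCluster(2)

# clusterApply() hands the elements of x to the workers in turn and
# returns the results as a list, in the same order as x.
squares <- clusterApply(cl, x = 1:4, fun = function(i) i^2)

# Arguments given after fun are passed on to every call of fun.
powers <- clusterApply(cl, x = 1:4, fun = function(i, p) i^p, p = 3)

# Always shut the workers down when finished.
stopCluster(cl)

print(unlist(squares))
print(unlist(powers))
```

With four tasks and two workers, each worker ends up processing two tasks, so the number of tasks does not have to match the number of workers.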

  3. foreach, future.apply and Load Balancing

    In this chapter, you will look at two user-contributed packages, namely foreach and future.apply, which make parallel programming in R even easier. They are built on top of the parallel and future packages. In the last lesson of this chapter, you will learn about the advantages and pitfalls of load balancing and scheduling.

  4. Random Numbers and Reproducibility

    Now you might ask: can I reproduce my results if the application uses random numbers? Can I generate the same results regardless of whether the code runs sequentially or in parallel? This chapter answers these questions. You will learn about a random number generator that is well suited to a parallel environment and how the various packages make use of it.
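
One way to get repeatable random numbers from a cluster is sketched below. It assumes the generator in question is L'Ecuyer-CMRG, which is the one the base `parallel` package uses for its per-worker streams; the helper `run_once()` and the seed value are hypothetical and only serve the illustration.

```r
library(parallel)

# Hypothetical helper: start a cluster, seed it, draw random numbers, stop it.
run_once <- function(seed) {
  cl <- makeCluster(2)          # 2 workers is an arbitrary choice
  on.exit(stopCluster(cl))      # shut the workers down on exit

  # Give every worker its own L'Ecuyer-CMRG stream derived from one seed.
  clusterSetRNGStream(cl, iseed = seed)

  # One task per worker, so the task-to-worker assignment is fixed.
  unlist(clusterApply(cl, x = c(3, 3), fun = function(n) rnorm(n)))
}

first  <- run_once(1234)
second <- run_once(1234)

# TRUE when both runs produced exactly the same numbers.
print(identical(first, second))
```

Note that this sketch only reproduces results across runs that use the same number of workers and the same assignment of tasks to workers; it does not by itself make a parallel run match a sequential one.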

Datasets

Words (Jane Austen's 6 books)
US migration
SOTU (2016)

Collaborators

Benjamin Feder
Hana Sevcikova

Senior Research Scientist, University of Washington
