Intermediate Regular Expressions in R

Skill Level: Intermediate
Rating: 4.9 (31 reviews)
Updated 11/2024
Manipulate text data, analyze it and more by mastering regular expressions and string distances in R.
R | Programming | 4 hr | 14 videos | 48 Exercises | 3,650 XP | 4,697 | Statement of Accomplishment


Course Description

Analyzing data that comes in tables is fun. But what if the most interesting information is not available as a neatly organized dataset but only as plain text? Do not despair: in this course, you'll learn what you need to create powerful regular expressions that help you extract the information for your analyses from a blob of text. That's not all. Using the concept of string distances, you will also learn to work with text that contains typos or scanning errors by matching it to its correct counterpart in other data sources (record linkage). As learning material, we will analyze real documents about box office figures in Swiss cinemas.

Prerequisites

Introduction to the Tidyverse
String Manipulation with stringr in R
1

Regular Expressions: Writing Custom Patterns

Regular expressions can be pretty intimidating at first because they contain a large number of special characters. In this chapter, you'll learn to decipher these and write your own patterns to find exactly what you're looking for.
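To give a feel for what a custom pattern looks like, here is a minimal sketch in base R. The sample lines and variable names are hypothetical and are not taken from the course data; the course itself builds on stringr, while this sketch sticks to base R so it runs without extra packages.

```r
# Hypothetical sample lines (not from the course material).
lines <- c(
  "Movie A: 12345 tickets sold",
  "Movie B: 678 tickets sold",
  "No figures available"
)

# Does a line contain one or more digits followed by " tickets"?
has_figures <- grepl("[0-9]+ tickets", lines)

# Extract only the digits that are directly followed by " tickets".
# The lookahead (?= ...) needs perl = TRUE.
pattern <- "[0-9]+(?= tickets)"
matches <- regmatches(lines, regexpr(pattern, lines, perl = TRUE))

# regmatches() drops lines without a match, so two values remain.
ticket_counts <- as.integer(matches)
print(ticket_counts)
```

The character class `[0-9]`, the quantifier `+`, and the lookahead `(?= ...)` are examples of the special characters the chapter description refers to.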
2

Creating Strings with Data

3

Extracting Structured Data From Text

4

Similarities Between Strings

In the last chapter, we will shift gears away from regular expressions to understanding string distances. By calculating the differences between strings, we can match those that are similar. This will help us find duplicates even when they contain small errors like typos. This is an important part of record linkage, where we combine datasets from multiple sources.
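The idea of matching similar strings by distance can be sketched as follows. This is an illustration under assumptions: the names are hypothetical, and it uses base R's `adist()` (generalized Levenshtein distance), which may differ from the functions or packages the course actually teaches.

```r
# Hypothetical data: a clean reference list and entries containing typos.
reference <- c("Zurich", "Geneva", "Basel")
scanned   <- c("Zurch", "Genva", "Basle")

# adist() returns a matrix of edit distances:
# rows correspond to `scanned`, columns to `reference`.
distances <- adist(scanned, reference)

# For each scanned entry, pick the reference entry with the smallest distance.
best_match <- reference[apply(distances, 1, which.min)]
print(data.frame(scanned, best_match))
```

In a real record linkage task, it would usually make sense to also set a maximum acceptable distance so that entries without a plausible counterpart are not forced onto the nearest name.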

Earn Statement of Accomplishment

Add this credential to your LinkedIn profile, resume, or CV
Share it on social media and in your performance review

Don’t just take our word for it

4.9 from 31 reviews
5 stars: 90%
4 stars: 10%
3 stars: 0%
2 stars: 0%
1 star: 0%


FAQs

What makes this course different from a beginner regex course?

This intermediate course goes beyond basic pattern matching to cover extracting structured data from plain text, building strings programmatically, and matching similar strings using string distances.

What real-world data is used in this course?

You analyze real documents about box office figures in Swiss cinemas, learning to extract and structure information from messy text sources.

Does the course cover record linkage and fuzzy matching?

Yes. Chapter 4 teaches string distance calculations to match similar strings, even those containing typos or scanning errors, which is essential for combining datasets from multiple sources.

What R packages are used in this course?

You use stringr for string manipulation along with R's built-in regex capabilities. The course builds on skills from the String Manipulation with stringr in R prerequisite course.

What should I already know about regular expressions before enrolling?

You should be comfortable with basic regex patterns and stringr functions. The listed prerequisites are Introduction to the Tidyverse and String Manipulation with stringr in R.
