Free Course

Web Scraping in Python

Skill Level: Intermediate

4.7+

Updated 03/2026

Learn to retrieve and parse information from the internet using the Python library scrapy.

Included for Free

Python, Data Preparation

4 hr

17 videos

56 Exercises

4,500 XP

92,876

Statement of Accomplishment

Course Description

The ability to build tools that retrieve and parse information stored across the internet has long been valuable in many areas of data science. In this course, you will learn to navigate and parse HTML code and build tools to crawl websites automatically. Although our scraping will be conducted using the versatile Python library scrapy, many of the techniques you learn in this course can be applied to other popular Python libraries as well, including BeautifulSoup and Selenium. Upon completing this course, you will have a strong mental model of HTML structure, will be able to build tools to parse HTML code and access desired information, and will be able to create simple scrapy spiders to crawl the web at scale.

Prerequisites

Intermediate Python

1

Introduction to HTML

Learn the structure of HTML. We begin by explaining why web scraping can be a valuable addition to your data science toolbox and then delve into some basics of HTML. We end the chapter with a brief introduction to XPath notation, which is used to navigate the elements within HTML code.

Web Scraping Overview

Web-scraping is not nonsense!

HyperText Markup Language

HTML tree wordy navigation

From Tree to HTML

Keep it Classy

Finding href

Crash Course in XPath

Where am I?

It's Time to P

A classy span
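As a rough illustration of the XPath notation this chapter introduces, the sketch below walks a small HTML tree using only Python's standard library. The HTML snippet is made up for this example, and `xml.etree.ElementTree` is a stand-in, not the course's tool: it supports only a subset of XPath and needs well-formed markup, whereas the course itself works with scrapy.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed HTML snippet (invented for illustration).
html = """
<html>
  <body>
    <div class="intro">
      <p>Hello world!</p>
      <p>Visit <a href="https://example.com">this link</a>.</p>
    </div>
    <div class="outro">
      <p id="last">Goodbye!</p>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(html)  # root is the <html> element

# Step down the tree one generation at a time: html -> body -> div -> p.
paragraphs = root.findall("./body/div/p")

# ".//" looks at all descendants, however deep they are nested.
hrefs = [a.get("href") for a in root.findall(".//a")]

# Square brackets narrow a step by attribute value...
outro_text = root.find(".//div[@class='outro']/p").text

# ...or by 1-based position among siblings with the same tag.
second_p = root.find("./body/div[1]/p[2]")

print(len(paragraphs))
print(hrefs)
print(outro_text)
print(second_p.text)
```

In full XPath, as used with scrapy, the same ideas are usually written from the document root, for example `/html/body/div/p` or `//div[@class="outro"]/p`.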

2

XPaths and Selectors

Leverage XPath syntax to explore scrapy selectors. Both of these concepts will move you towards being able to scrape an HTML document.

Counting Elements in the Wild

Body Appendages

Choose DataCamp!

Off the Beaten XPath

Where it's @

Check your Class

Hyper(link) Active

Secret Links

Selector Objects

XPath Chaining

Divvy Up This Exercise

The Source of the Source

Course Class by Inspection

Requesting a Selector

3

CSS Locators, Chaining, and Responses

Learn CSS Locator syntax and begin playing with the idea of chaining together CSS Locators with XPath. We also introduce Response objects, which behave like Selectors but give us extra tools to mobilize our scraping efforts across multiple websites.

From XPath to CSS

The (X)Path to CSS Locators

Get an "a" in this Course

The CSS Wildcard

CSS Attributes and Text Selection

You've been `href`ed

Top Level Text

All Level Text

Respond Please!

Reveal By Response

Responding with Selectors

Selecting from a Selection

Scraping with Children

4

Spiders

Learn to create web crawlers with scrapy. These scrapy spiders will crawl the web through multiple pages, following links to scrape each of those pages automatically according to the procedures we've learned in the previous chapters.

Your First Spider

Inheriting the Spider

Hurl the URLs

Start Requests

Self Referencing is Classy

Starting with Start Requests

Parse and Crawl

Crawler Time

Time to Run

DataCamp Descriptions

Capstone Crawler

Earn Statement of Accomplishment

Add this credential to your LinkedIn profile, resume, or CV.
Share it on social media and in your performance review.

Don’t just take our word for it

4.7 from 1,004 reviews

Rating distribution: 82%, 16%, 2%, 0%, 0%

FAQs

Is this course suitable for beginners?

Yes, this course is great for beginners! It covers the basics of HTML structure and XPath notation and then progresses to more advanced topics such as chaining selectors and crawling multiple pages with Scrapy.

Could I work on my own projects in this course?

Yes, this course is designed to help you develop the skills to scrape any website you choose with Python. You’ll be able to build your own tools to retrieve and parse information across the internet.

Will I receive a certificate at the end of the course?

Yes, upon completion of the course, you will receive a DataCamp certificate.

Who will benefit from this course?

This course is beneficial for anyone looking to develop web scraping skills using Python. This skill is particularly useful for data analysts, data scientists, web developers, machine learning engineers, and more.

What topics will I learn?

You will learn the structure of HTML, XPath syntax and selectors, CSS Locators and chaining, Response objects, and how to create web crawlers with scrapy.

Will I learn how to create a scrapy spider?

Yes, you will learn how to create scrapy spiders that can crawl websites through multiple pages, following links to scrape each page automatically.

Join over 19 million learners and start Web Scraping in Python today!
