The ability to build tools that retrieve and parse information from across the internet has been, and continues to be, valuable in many areas of data science. In this course, you will learn to navigate and parse HTML code and build tools to crawl websites automatically. Although our scraping will be conducted using the versatile Python library scrapy, many of the techniques you learn in this course can be applied to other popular Python libraries as well, including BeautifulSoup and Selenium. Upon completing this course, you will have a strong mental model of HTML structure, be able to build tools that parse HTML code and access desired information, and create simple scrapy spiders to crawl the web at scale.
Introduction to HTML
Learn the structure of HTML. We begin by explaining why web scraping can be a valuable addition to your data science toolbox, then delve into some basics of HTML. We end the chapter with a brief introduction to XPath notation, which is used to navigate the elements within HTML code.
XPaths and Selectors
Leverage XPath syntax to explore scrapy selectors. Both of these concepts will move you towards being able to scrape an HTML document.
CSS Locators, Chaining, and Responses
Learn CSS Locator syntax and begin playing with the idea of chaining together CSS Locators with XPath. We also introduce Response objects, which behave like Selectors but give us extra tools to mobilize our scraping efforts across multiple websites.
Spiders
Learn to create web crawlers with scrapy. These scrapy spiders will crawl the web through multiple pages, following links to scrape each of those pages automatically according to the procedures we've learned in the previous chapters.
In the following tracks: Python Developer
Datasets: DataCamp webpage HTML
Thomas Laetsch
Data Scientist at New York University
Since January 2016, Thomas Laetsch has been a Moore-Sloan Post-Doctoral Associate in the Center for Data Science at NYU. In 2012, he received his PhD in mathematics from the University of California, San Diego, specializing in probability, differential geometry, and functional analysis. From 2012 through 2015, he was a Visiting Assistant Professor at the University of Connecticut, working on central tendency theorems for random walks in degenerate spaces.