Introduction to Natural Language Processing in Python

Rating: 4.0 (37 reviews)

Intermediate

Learn fundamental natural language processing techniques using Python and how to apply them to extract insights from real-world text data.

4 Hours, 15 Videos, 51 Exercises

114,597 Learners, Statement of Accomplishment

Course Description

In this course, you'll learn natural language processing (NLP) basics, such as how to identify and separate words, how to extract topics in a text, and how to build your own fake news classifier. You'll also learn how to use basic libraries such as NLTK, alongside libraries which utilize deep learning to solve common NLP problems. This course will give you the foundation to process and parse text as you move forward in your Python learning.

In the following Tracks

Machine Learning Scientist with Python

Natural Language Processing in Python

1
Regular expressions & word tokenization
Free
This chapter will introduce some basic NLP concepts, such as word tokenization and regular expressions to help parse text. You'll also learn how to handle non-English text and the more difficult tokenization cases you might encounter.
Introduction to regular expressions
50 xp
Which pattern?
50 xp
Practicing regular expressions: re.split() and re.findall()
100 xp
Introduction to tokenization
50 xp
Word tokenization with NLTK
100 xp
More regex with re.search()
100 xp
Advanced tokenization with NLTK and regex
50 xp
Choosing a tokenizer
50 xp
Regex with NLTK tokenization
100 xp
Non-ascii tokenization
100 xp
Charting word length with NLTK
50 xp
Charting practice
100 xp
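To give a feel for the regular-expression side of chapter 1, here is a minimal sketch using only Python's standard `re` module. The sample sentences and the patterns are assumptions made up for illustration; they are not taken from the course exercises, and NLTK's tokenizers handle many more edge cases than these patterns do.

```python
import re

# Hypothetical sample text, not from the course materials.
text = "Let's write RegEx! Won't that be fun? I sure think so."

# re.split(): break the text into sentences at whitespace that
# follows sentence-ending punctuation.
sentences = re.split(r"(?<=[.?!])\s+", text)

# re.findall(): collect word tokens, keeping an internal apostrophe
# so that contractions such as "Let's" stay in one piece.
words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)

# re.search(): find the first occurrence of a pattern and its position.
match = re.search(r"fun", text)
found = text[match.start():match.end()]

# Non-ASCII text: in Python 3, \w matches Unicode letters by default,
# so accented characters stay inside their tokens.
german_text = "Wann gehen wir Pizza essen? Über den Fluss."
german_words = re.findall(r"\w+", german_text)

print(sentences)
print(words)
print(found)
print(german_words)
```

A hand-written pattern like the one above is quick to try out, but a library tokenizer is usually the safer choice once the text contains abbreviations, URLs, or emoji.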
2
Simple topic identification
This chapter will introduce you to topic identification, which you can apply to any text you encounter in the wild. Using basic NLP models, you will identify topics from texts based on term frequencies. You'll experiment with and compare two simple methods, bag-of-words and tf-idf, using NLTK and a new library, Gensim.
Word counts with bag-of-words
50 xp
Bag-of-words picker
50 xp
Building a Counter with bag-of-words
100 xp
Simple text preprocessing
50 xp
Text preprocessing steps
50 xp
Text preprocessing practice
100 xp
Introduction to gensim
50 xp
What are word vectors?
50 xp
Creating and querying a corpus with gensim
100 xp
Gensim bag-of-words
100 xp
Tf-idf with gensim
50 xp
What is tf-idf?
50 xp
Tf-idf with Wikipedia
100 xp
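The two methods compared in chapter 2 can be sketched with the standard library alone. The toy corpus, the `tokenize` helper, and the `tf_idf` function below are hypothetical names introduced for illustration. The weighting shown is one common tf-idf formulation (relative term frequency times the natural log of inverse document frequency); Gensim applies its own weighting and normalization, so its numbers may differ.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; the course uses its own datasets.
docs = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "Dogs and cats make good pets.",
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


# Bag-of-words: one Counter of token frequencies per document.
bags = [Counter(tokenize(doc)) for doc in docs]

# Document frequency: the number of documents that contain each token.
doc_freq = Counter(token for bag in bags for token in bag)


def tf_idf(bag, n_docs):
    """Weight each token by (count / doc length) * log(n_docs / doc_freq)."""
    total = sum(bag.values())
    return {
        token: (count / total) * math.log(n_docs / doc_freq[token])
        for token, count in bag.items()
    }


weights = tf_idf(bags[0], len(docs))

print(bags[0].most_common(2))
print(sorted(weights.items(), key=lambda item: item[1], reverse=True))
```

In the first document, "the" has the highest raw count, but tf-idf ranks "sat" above it because "the" also appears in another document. That shift is the reason tf-idf is often preferred over raw counts for topic identification.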
3
Named-entity recognition
This chapter will introduce a slightly more advanced topic: named-entity recognition. You'll learn how to identify the who, what, and where of your texts using pre-trained models on English and non-English text. You'll also learn how to use some new libraries, polyglot and spaCy, to add to your NLP toolbox.
Named Entity Recognition
50 xp
NER with NLTK
100 xp
Charting practice
100 xp
Stanford library with NLTK
50 xp
Introduction to spaCy
50 xp
Comparing NLTK with spaCy NER
100 xp
spaCy NER Categories
50 xp
Multilingual NER with polyglot
50 xp
French NER with polyglot I
100 xp
French NER with polyglot II
100 xp
Spanish NER with polyglot
100 xp
4
Building a "fake news" classifier
You'll apply the basics of what you've learned along with some supervised machine learning to build a "fake news" detector. You'll begin by learning the basics of supervised machine learning, and then move forward by choosing a few important features and testing ideas to identify and classify fake news articles.
Classifying fake news using supervised learning with NLP
50 xp
Which possible features?
50 xp
Training and testing
50 xp
Building word count vectors with scikit-learn
50 xp
CountVectorizer for text classification
100 xp
TfidfVectorizer for text classification
100 xp
Inspecting the vectors
100 xp
Training and testing a classification model with scikit-learn
50 xp
Text classification models
50 xp
Training and testing the "fake news" model with CountVectorizer
100 xp
Training and testing the "fake news" model with TfidfVectorizer
100 xp
Simple NLP, complex problems
50 xp
Improving the model
50 xp
Improving your model
100 xp
Inspecting your model
100 xp
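As a rough illustration of the supervised approach in chapter 4, the sketch below trains a tiny multinomial naive Bayes classifier on word counts using only the standard library. This is a stand-in technique chosen for illustration: the course works with scikit-learn's CountVectorizer and TfidfVectorizer, and this page does not say which model it uses. The headlines, labels, and the `fit` and `predict` helpers are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labeled headlines, made up for illustration only.
train = [
    ("scientists publish peer reviewed study on vaccines", "REAL"),
    ("senate passes budget bill after long debate", "REAL"),
    ("shocking secret cure doctors hate revealed", "FAKE"),
    ("you will not believe this shocking miracle trick", "FAKE"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def fit(examples, alpha=1.0):
    """Collect per-class word counts and class counts for naive Bayes."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        word_counts[label].update(tokenize(text))
        class_counts[label] += 1
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab, alpha


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    word_counts, class_counts, vocab, alpha = model
    n_examples = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / n_examples)
        for token in tokenize(text):
            if token in vocab:  # skip words never seen in training
                count = word_counts[label][token]
                # Laplace (add-alpha) smoothing avoids log(0).
                score += math.log((count + alpha) / (total + alpha * len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = fit(train)
prediction = predict(model, "shocking miracle cure revealed")
print(prediction)
```

Four training examples are far too few to say anything about real accuracy. A realistic workflow holds out a test set and evaluates on it, which is the training-and-testing step this chapter covers.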

Datasets

English stopwords

Monty Python and the Holy Grail

News articles

Wikipedia articles

Collaborators

Hugo Bowne-Anderson

Yashas Roy

Prerequisites

Python Data Science Toolbox (Part 2)

Katharine Jarmul

Founder, kjamistan

Katharine Jarmul runs a data analysis company called kjamistan that specializes in helping companies analyze data and training others on data analysis best practices, particularly with Python. She has been using Python for 8 years for a variety of data work, including telling stories at major national newspapers, building large-scale aggregation software, making decisions based on customer analytics and marketing spend, and advising new ventures on the competitive landscape.

Don’t just take our word for it

4.0 from 37 reviews

VIGNESH A.

7 months

Really good exercises to learn in depth about NLP

Mrinal B.

about 1 year

Excellent course for the beginner

Alexandr C.

about 1 year

I liked this course and topics was covered in it. Tutor has nice speed and voice, and topics was excellent explained and followed with good examples. I also found few unknown and useful libraries for making NLP preprocessing simpler and faster.

Ezequiel A.

about 1 year

The course requires no prior experience in NLP. Great introduction into this field, touching main methods and Python libraries.

Dierk P.

about 1 year

Very clear speaker, enjoyed presentations.
