
Introduction to Natural Language Processing in Python

4.0+
38 reviews
Intermediate

Learn fundamental natural language processing techniques using Python and how to apply them to extract insights from real-world text data.

4 hours | 15 videos | 51 exercises | 125,347 learners | Statement of Accomplishment

Course Description

In this course, you'll learn natural language processing (NLP) basics, such as how to identify and separate words, how to extract topics from a text, and how to build your own fake news classifier. You'll also learn how to use basic libraries such as NLTK, alongside libraries that use deep learning to solve common NLP problems. This course will give you the foundation to process and parse text as you move forward in your Python learning.
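
As a rough illustration of what "identifying and separating words" can look like in practice, here is a minimal sketch that uses only Python's built-in `re` module. The function names and the sample text are hypothetical and not taken from the course, which uses NLTK for tokenization; a real tokenizer handles many more edge cases (abbreviations, numbers, non-English text).

```python
import re


def split_sentences(text):
    """Split text after '.', '?' or '!' when followed by whitespace."""
    return re.split(r"(?<=[.?!])\s+", text.strip())


def tokenize_words(text):
    """Return runs of ASCII letters, keeping one internal apostrophe (isn't)."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


# Hypothetical sample text, chosen only for illustration.
sample = "Let's tokenize this. Is it hard? Not at all!"

for sentence in split_sentences(sample):
    print(sentence, "->", tokenize_words(sentence))
```

The pattern in `tokenize_words` is deliberately simple: it drops punctuation and digits and only recognizes ASCII letters, so it is a starting point for experimenting rather than a substitute for a library tokenizer.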

In the following Tracks

Machine Learning Scientist in Python

Natural Language Processing in Python

  1. Regular expressions & word tokenization

    Free

    This chapter will introduce some basic NLP concepts, such as word tokenization and regular expressions, to help parse text. You'll also learn how to handle non-English text and the more difficult tokenization cases you might encounter.

    Introduction to regular expressions
    50 xp
    Which pattern?
    50 xp
    Practicing regular expressions: re.split() and re.findall()
    100 xp
    Introduction to tokenization
    50 xp
    Word tokenization with NLTK
    100 xp
    More regex with re.search()
    100 xp
    Advanced tokenization with NLTK and regex
    50 xp
    Choosing a tokenizer
    50 xp
    Regex with NLTK tokenization
    100 xp
    Non-ASCII tokenization
    100 xp
    Charting word length with NLTK
    50 xp
    Charting practice
    100 xp
  2. Simple topic identification

    This chapter will introduce you to topic identification, which you can apply to any text you encounter in the wild. Using basic NLP models, you will identify topics from texts based on term frequencies. You'll experiment with and compare two simple methods, bag-of-words and tf-idf, using NLTK and a new library, Gensim.

  3. Named-entity recognition

    This chapter will introduce a slightly more advanced topic: named-entity recognition. You'll learn how to identify the who, what, and where of your texts using pre-trained models on English and non-English text. You'll also learn how to use some new libraries, polyglot and spaCy, to add to your NLP toolbox.

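To make the bag-of-words and tf-idf methods named in chapter 2 more concrete, here is a minimal standard-library sketch. It is not the course's code: the course uses NLTK and Gensim, while the function names, the three toy documents, and the specific weighting (term frequency as count divided by document length, inverse document frequency as log of N over document frequency) are assumptions chosen for illustration. Library implementations differ in their exact formulas.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length,
    idf = log(number of documents / documents containing the term).
    """
    bags = [bag_of_words(doc) for doc in docs]
    n_docs = len(bags)
    # Iterating a Counter yields each distinct term once per document.
    doc_freq = Counter(term for bag in bags for term in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bag.items()
        })
    return weights


# Hypothetical toy corpus, not one of the course datasets.
docs = [
    "the knight guards the bridge",
    "the rabbit guards the cave",
    "the knight fears the rabbit",
]

scores = tf_idf(docs)
top_terms = [max(s, key=s.get) for s in scores]
print(top_terms)  # ['bridge', 'cave', 'fears']
```

A plain bag-of-words count would rank "the" highest in every document. The tf-idf weighting gives "the" a weight of zero because it appears in all three documents, so the term unique to each document comes out on top.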

datasets

English stopwords
Monty Python and the Holy Grail
News articles
Wikipedia articles

collaborators

Hugo Bowne-Anderson
Yashas Roy

prerequisites

Python Toolbox
Katharine Jarmul

Founder, kjamistan

Katharine Jarmul runs a data analysis company called kjamistan that specializes in helping companies analyze data and in training others on data analysis best practices, particularly with Python. She has been using Python for 8 years for a variety of data work, including telling stories at major national newspapers, building large-scale aggregation software, making decisions based on customer analytics and marketing spend, and advising new ventures on the competitive landscape.

Don’t just take our word for it

4.0 from 38 reviews
Rating distribution: 42%, 29%, 24%, 3%, 3%
  • Li D.
    29 days

    Great

  • Shahedha S.
    9 months

    I completed the RNN course first by chance and expected something similar. That might have been why I had a snippy attitude while completing the course, but it was worth it to push myself. The course builds you up with a collection of skills and knowledge, and then it all comes together very nicely in the last chapter. Be a patient student and you will be rewarded.

  • Mallick M.
    10 months

    It is an excellent course. I loved how the AI helped track the indentation and syntax errors until I started being really careful about it.

  • Anne M.
    11 months

    Best DataCamp course I have done in a while. Really engaging instructor, very clear explanations and instructions. The course got me even more interested in the topic and I felt like I learned a lot. Most importantly, at all times during the course, I knew why I was doing what I was doing during the exercises.

  • Larissa S.
    about 1 year

    The instructor did a great job explaining the concepts, and the tasks were absolutely manageable. Towards the end, however, I would have liked more explanation of why the model was chosen this way.

Join over 15 million learners and start Introduction to Natural Language Processing in Python today!
