Interactive Course

Natural Language Processing Fundamentals in Python

Learn fundamental natural language processing techniques using Python and how to apply them to extract insights from real-world text data.

  • 4 hours
  • 15 Videos
  • 51 Exercises
  • 40,334 Participants
  • 3,750 XP

Loved by learners at thousands of top companies:

whole-foods-grey.svg
siemens-grey.svg
intel-grey.svg
uber-grey.svg
paypal-grey.svg
3m-grey.svg

Course Description

In this course, you'll learn natural language processing (NLP) basics, such as how to identify and separate words, how to extract topics in a text, and how to build your own fake news classifier. You'll also learn how to use basic libraries such as NLTK, alongside libraries which utilize deep learning to solve common NLP problems. This course will give you the foundation to process and parse text as you move forward in your Python learning.

  1. Simple topic identification

    This chapter will introduce you to topic identification, which you can apply to any text you encounter in the wild. Using basic NLP models, you will identify topics from texts based on term frequencies. You'll experiment and compare two simple methods: bag-of-words and Tf-idf using NLTK, and a new library Gensim.

  2. Building a "fake news" classifier

    You'll apply the basics of what you've learned along with some supervised machine learning to build a "fake news" detector. You'll begin by learning the basics of supervised machine learning, and then move forward by choosing a few important features and testing ideas to identify and classify fake news articles.

  1. 1

    Regular expressions & word tokenization

    Free

    This chapter will introduce some basic NLP concepts, such as word tokenization and regular expressions to help parse text. You'll also learn how to handle non-English text and more difficult tokenization you might find.

  2. Simple topic identification

    This chapter will introduce you to topic identification, which you can apply to any text you encounter in the wild. Using basic NLP models, you will identify topics from texts based on term frequencies. You'll experiment and compare two simple methods: bag-of-words and Tf-idf using NLTK, and a new library Gensim.

  3. Named-entity recognition

    This chapter will introduce a slightly more advanced topic: named-entity recognition. You'll learn how to identify the who, what, and where of your texts using pre-trained models on English and non-English text. You'll also learn how to use some new libraries, polyglot and spaCy, to add to your NLP toolbox.

  4. Building a "fake news" classifier

    You'll apply the basics of what you've learned along with some supervised machine learning to build a "fake news" detector. You'll begin by learning the basics of supervised machine learning, and then move forward by choosing a few important features and testing ideas to identify and classify fake news articles.

What do other learners have to say?

Devon

“I've used other sites, but DataCamp's been the one that I've stuck with.”

Devon Edwards Joseph

Lloyd's Banking Group

Louis

“DataCamp is the top resource I recommend for learning data science.”

Louis Maiden

Harvard Business School

Ronbowers

“DataCamp is by far my favorite website to learn from.”

Ronald Bowers

Decision Science Analytics @ USAA

Katharine Jarmul
Katharine Jarmul

Founder, kjamistan

Katharine Jarmul runs a data analysis company called kjamistan that specializes in helping companies analyze data and training others on data analysis best practices, particularly with Python. She has been using Python for 8 years for a variety of data work -- including telling stories at major national newspapers, building large scale aggregation software, making decisions based on customer analytics, and marketing spend and advising new ventures on the competitive landscape.

See More
Icon Icon Icon professional info