
Spoken Language Processing in Python

Learn to load, transform, and transcribe human speech from raw audio files in Python.

4 Hours | 14 Videos | 53 Exercises | 4,261 Learners | 4400 XP | Natural Language Processing Track


Course Description

We learn to speak long before we learn to read. Even in the digital age, our main method of communication is speech. Spoken Language Processing in Python will help you load, transform, and transcribe audio files. You'll start by seeing what raw audio looks like in Python, and finish by working through an example business use case: transcribing and classifying phone call data.

  1. Introduction to Spoken Language Processing with Python


    Audio files are different from most other types of data. Before you can start working with them, they require some preprocessing. In this chapter, you'll learn the first steps to working with speech files by converting two different audio files into sound waves and comparing them visually.

    Introduction to audio data in Python
    50 xp
    The right frequency
    50 xp
    Importing an audio file with Python
    100 xp
    Converting sound wave bytes to integers
    50 xp
    The right data type
    50 xp
    Bytes to integers
    100 xp
    Finding the time stamps
    100 xp
    Visualizing sound waves
    50 xp
    Staying consistent
    50 xp
    Processing audio data with Python
    100 xp
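The first steps of this chapter (importing an audio file, converting its bytes to integers, and finding the time stamps) can be sketched roughly as follows. This is a minimal illustration rather than the course's own code: it uses only the standard library, the helper names `write_test_tone` and `read_wave_as_integers` are hypothetical, and it writes a synthetic tone to a temporary file because no real recording is assumed to be available. It also assumes mono, 16-bit PCM audio.

```python
import array
import math
import os
import sys
import tempfile
import wave


def write_test_tone(path, frame_rate=8000, seconds=1.0, freq_hz=440.0):
    """Hypothetical helper: write a mono 16-bit sine tone to read back later."""
    n_frames = int(frame_rate * seconds)
    samples = array.array(
        "h",
        (int(10000 * math.sin(2 * math.pi * freq_hz * i / frame_rate))
         for i in range(n_frames)),
    )
    if sys.byteorder == "big":
        samples.byteswap()  # WAV PCM data is stored little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(frame_rate)
        wav_file.writeframes(samples.tobytes())


def read_wave_as_integers(path):
    """Hypothetical helper: return (frame_rate, samples, timestamps)."""
    with wave.open(path, "rb") as wav_file:
        if wav_file.getnchannels() != 1 or wav_file.getsampwidth() != 2:
            raise ValueError("this sketch assumes mono, 16-bit PCM audio")
        frame_rate = wav_file.getframerate()
        raw_bytes = wav_file.readframes(wav_file.getnframes())

    # Interpret each pair of bytes as one signed 16-bit integer.
    samples = array.array("h")
    samples.frombytes(raw_bytes)
    if sys.byteorder == "big":
        samples.byteswap()

    # Time stamp of sample i is i / frame_rate seconds.
    timestamps = [i / frame_rate for i in range(len(samples))]
    return frame_rate, samples, timestamps


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "tone.wav")
    write_test_tone(demo_path)
    frame_rate, samples, timestamps = read_wave_as_integers(demo_path)

print(frame_rate, len(samples), timestamps[-1])
```

With NumPy installed, `np.frombuffer(raw_bytes, dtype="int16")` is a common alternative for the bytes-to-integers step. Plotting `samples` against `timestamps` would then give the visual comparison the chapter describes.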
  2. Using the Python SpeechRecognition library

    Speech recognition is still far from perfect, but the SpeechRecognition library provides an easy way to interact with many speech-to-text APIs. In this chapter, you'll learn how to use the SpeechRecognition library to start converting the spoken language in your audio files to text.
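As a rough idea of what transcribing a file with SpeechRecognition can look like, here is a minimal sketch. It assumes the `SpeechRecognition` package is installed, that the input is a WAV file, and that the free Google Web Speech endpoint is reachable over the network. The function name `transcribe_wav` and the default language code are illustrative choices, not taken from the course.

```python
def transcribe_wav(path, language="en-US"):
    """Hypothetical helper: return the transcribed text, or None if unintelligible."""
    # Imported lazily so this function can be defined without the package installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()

    # AudioFile is a context manager that exposes the file as an audio source.
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into AudioData

    try:
        # Sends the audio to Google's Web Speech API (requires network access).
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The API could not make out any speech in the audio.
        return None
    except sr.RequestError as error:
        # The API was unreachable or rejected the request.
        raise RuntimeError("speech-to-text request failed") from error
```

A call such as `transcribe_wav("call.wav")` (with a hypothetical file name) would return a string on success. Results depend on audio quality and on the remote service, so no expected output is shown here.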

  3. Manipulating Audio Files with PyDub

    Not all audio files come in the same shape, size, or format. Luckily, the PyDub library by James Robert provides tools you can use to programmatically change audio file attributes such as frame rate, number of channels, and file format. In this chapter, you'll learn how to use this helpful library to ensure all of your audio files are in the right shape for transcription.
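One way to bring a file into a consistent shape with PyDub is sketched below. It assumes the `pydub` package is installed (and ffmpeg, for formats other than WAV). The function name `normalize_for_transcription` and the 16 kHz mono target are assumptions chosen for illustration, not values stated by the course.

```python
def normalize_for_transcription(in_path, out_path, frame_rate=16000, channels=1):
    """Hypothetical helper: convert any supported audio file to a uniform WAV file."""
    # Imported lazily so this function can be defined without the package installed.
    from pydub import AudioSegment

    # from_file infers the format from the file; non-WAV input needs ffmpeg.
    audio = AudioSegment.from_file(in_path)

    # Each setter returns a new AudioSegment rather than modifying in place.
    audio = audio.set_frame_rate(frame_rate).set_channels(channels)

    # Write the result out in WAV format.
    audio.export(out_path, format="wav")
    return audio.frame_rate, audio.channels
```

Because the setters return new objects, the reassignment to `audio` matters: calling `audio.set_channels(1)` without keeping the result would leave the original segment unchanged.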


In the following tracks

Natural Language Processing


Maggie Matsui, Adrián Soto, Hillary Green-Lerman

Daniel Bourke

Machine Learning Engineer and YouTube creator

Machine Learning Engineer who creates YouTube videos and writes about the intersection of health, technology and art.

What do other learners have to say?

I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with.

Devon Edwards Joseph
Lloyds Banking Group

DataCamp is the top resource I recommend for learning data science.

Louis Maiden
Harvard Business School

DataCamp is by far my favorite website to learn from.

Ronald Bowers
Decision Science Analytics, USAA