Importing Data in Python

5 Hours | 21 Videos | 82 Exercises | 7,352 Learners
6500 XP

Course Description

As a data scientist, you will routinely need to clean, wrangle and munge data, visualize it, build predictive models and interpret those models. Before you can do any of this, however, you need to know how to get data into Python. In this course, you'll learn the many ways to import data into Python: (i) from flat files such as .txt and .csv files; (ii) from files native to other software, such as Excel spreadsheets and Stata, SAS and MATLAB files; (iii) from relational databases such as SQLite and PostgreSQL; (iv) from the web; and (v) from Application Programming Interfaces (APIs), a special and essential case of web data, such as the Twitter streaming API, which lets you stream tweets in real time.
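
As a rough illustration of case (i), the sketch below reads a small comma-separated flat file. It is not taken from the course: the course itself works with NumPy and pandas, whereas this example uses only the standard-library csv module so that it runs without extra installs. The sample data, the column names and the helper load_rows are all hypothetical.

```python
import csv
import io

# Hypothetical sample data; a real import would open a file on disk instead.
SAMPLE_CSV = "name,score\nada,91.5\ngrace,88.0\n"


def load_rows(text_stream):
    """Read a comma-separated flat file into a list of dicts keyed by header."""
    reader = csv.DictReader(text_stream)
    rows = []
    for row in reader:
        # The csv module yields every field as a string, so convert explicitly.
        row["score"] = float(row["score"])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(rows)
```

Swapping io.StringIO(SAMPLE_CSV) for an open file handle would apply the same function to a file on disk.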

  1. Introduction and flat files

    Free
    In this chapter, you'll learn how to import data into Python from all types of flat files, a simple and prevalent form of data storage. You've previously learned how to use NumPy and pandas; here you will use these packages to import flat files and to customize your imports.
  2. Importing data from other file types

    You've learned how to import flat files, but there are many other file types you may have to work with as a data scientist. In this chapter, you'll learn how to import data into Python from a wide array of important file types: pickled files, Excel spreadsheets, SAS and Stata files, HDF5 files (a format for storing large quantities of numerical data) and MATLAB files.
  3. Working with relational databases in Python

    In this chapter, you'll learn how to extract meaningful data from relational databases, an essential element of any data scientist's toolkit. You will learn about the relational model, writing SQL queries, filtering and ordering your SQL records, and advanced querying by JOINing database tables.
  4. Importing data from the Internet

    The web is a rich source of data from which you can extract many kinds of insights. In this chapter, you will learn how to get data from the web, whether it is stored in files or in HTML. You'll also learn the basics of scraping and parsing web data.
  5. Interacting with APIs to import data from the web

    In this chapter, you will build on your knowledge of importing data from the web. You will learn the basics of extracting data from APIs, gain insight into why APIs matter, and practice getting data from them with dives into the OMDB, Wikipedia and Twitter APIs.
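
The relational-database chapter in the outline above mentions filtering, ordering and JOINing tables. A minimal sketch of those three ideas follows, using Python's built-in sqlite3 module against an in-memory SQLite database. The tables, columns and rows are invented for illustration and are not the course's datasets.

```python
import sqlite3

# In-memory database with two hypothetical tables, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE albums (
        id INTEGER PRIMARY KEY,
        artist_id INTEGER,
        title TEXT,
        year INTEGER
    );
    INSERT INTO artists VALUES (1, 'Artist A'), (2, 'Artist B');
    INSERT INTO albums VALUES
        (1, 1, 'First', 1999),
        (2, 1, 'Second', 2005),
        (3, 2, 'Debut', 2010);
""")

# Filter with WHERE, sort with ORDER BY, and combine tables with JOIN.
query = """
    SELECT artists.name, albums.title, albums.year
    FROM albums
    JOIN artists ON albums.artist_id = artists.id
    WHERE albums.year >= ?
    ORDER BY albums.year DESC
"""
# The ? placeholder keeps the filter value out of the SQL string itself.
results = conn.execute(query, (2000,)).fetchall()
conn.close()

print(results)
```

Each row comes back as a plain tuple; with the sample rows above, only the two albums from 2000 onward are returned, newest first.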
Hugo Bowne-Anderson

Data Scientist at DataCamp
Hugo is a data scientist, educator, writer and podcaster at DataCamp. His main interests are promoting data and AI literacy, helping to spread data skills through organizations and society, and doing amateur stand-up comedy in NYC. To find out what he likes to talk about, check out DataFramed, the DataCamp podcast, which he hosts and produces: https://www.datacamp.com/community/podcast
Hugo hearts all things Pythonic and is charged with building out DataCamp’s Python curriculum. He can be found at hackathons, meetups & code sprints, primarily in NYC. Before joining the ranks of DataCamp, he worked in applied mathematics (biology) research at Yale University.

What do other learners have to say?

I've used other sites—Coursera, Udacity, things like that—but DataCamp's been the one that I've stuck with.

Devon Edwards Joseph
Lloyds Banking Group

DataCamp is the top resource I recommend for learning data science.

Louis Maiden
Harvard Business School

DataCamp is by far my favorite website to learn from.

Ronald Bowers
Decision Science Analytics, USAA