Introduction to Importing Data in Python

Run the code cell below to import the packages and datasets used in this course.

# Import the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.io
import h5py
from sas7bdat import SAS7BDAT
from sqlalchemy import create_engine
import pickle

# Import the course datasets
titanic = pd.read_csv("datasets/titanic_sub.csv")
battledeath_2002 = pd.ExcelFile("datasets/battledeath.xlsx").parse("2002")
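from sqlalchemy import text  # SQLAlchemy 2.x rejects raw SQL strings; text() also works on 1.4
engine = create_engine('sqlite:///datasets/Chinook.sqlite')
con = engine.connect()
rs = con.execute(text('SELECT * FROM Album'))
# Pass the column names so the DataFrame is not labeled 0, 1, 2, ...
chinook = pd.DataFrame(rs.fetchall(), columns=list(rs.keys()))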
seaslug = np.loadtxt("datasets/seaslug.txt", delimiter="\t", dtype=str)

Explore Datasets

Try importing the remaining files to explore the data and practice your skills!

  • datasets/disarea.dta
  • datasets/ja_data2.mat
  • datasets/L-L1_LOSC_4_V1-1126259446-32.hdf5
  • datasets/mnist_kaggle_some_rows.csv
  • datasets/sales.sas7bdat
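
One way to approach the list above is to choose a reader based on the file extension. The sketch below is only an illustration, not part of the course material: `pick_loader`, `LOADERS`, and the `load_*` helpers are hypothetical names, and each helper imports its library only when it is called, so it assumes that `pandas`, `scipy`, `h5py`, and `sas7bdat` are installed in your environment.

```python
from pathlib import Path


def load_stata(path):
    import pandas as pd
    return pd.read_stata(path)  # returns a DataFrame


def load_mat(path):
    import scipy.io
    return scipy.io.loadmat(path)  # returns a dict: variable name -> array


def load_hdf5(path):
    import h5py
    return h5py.File(path, "r")  # open file handle; the caller should close it


def load_csv(path):
    import pandas as pd
    return pd.read_csv(path)  # returns a DataFrame


def load_sas(path):
    from sas7bdat import SAS7BDAT
    with SAS7BDAT(path) as sas_file:
        return sas_file.to_data_frame()


# Hypothetical lookup table: file extension -> reader function
LOADERS = {
    ".dta": load_stata,
    ".mat": load_mat,
    ".hdf5": load_hdf5,
    ".csv": load_csv,
    ".sas7bdat": load_sas,
}


def pick_loader(path):
    """Return the reader function that matches the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"No loader registered for {suffix!r}")
    return LOADERS[suffix]


print(pick_loader("datasets/disarea.dta").__name__)
```

To load a file, call the returned function with the same path, for example `pick_loader("datasets/sales.sas7bdat")("datasets/sales.sas7bdat")`.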

Take Notes

Add notes about the concepts you've learned and code cells with code you want to keep.

Reading a text file

filename = 'huck_finn.txt'
file = open(filename, mode='r')  # 'r' opens the file for reading only
text = file.read()
file.close()

Print a text file

print(text)

Using a context manager (with)

# The with-block closes the file automatically when it exits
with open('huck_finn.txt', 'r') as file:
    print(file.read())

# Read and print the first 3 lines
with open('moby_dick.txt') as file:
    print(file.readline())
    print(file.readline())
    print(file.readline())
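
As a small variation on the cell above, the sketch below reads the first three lines with `itertools.islice` instead of repeated `readline()` calls, then checks that the file was closed when the `with` block ended. The temporary file and its contents are made up here so that the example runs without `moby_dick.txt`.

```python
import os
import tempfile
from itertools import islice

# Write a small stand-in file (hypothetical content, not a course dataset)
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as out:
    out.write("line one\nline two\nline three\nline four\n")

# A file object yields one line at a time; islice stops after three of them
with open(path) as file:
    first_three = [line.rstrip("\n") for line in islice(file, 3)]

print(first_three)
print(file.closed)  # the with-block has already closed the file
```

Iterating over the file object avoids loading the whole file into memory, which helps with large text files.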

Importing flat files using NumPy (only numerical data)

import numpy as np
filename = 'MNIST.txt'
data = np.loadtxt(filename, delimiter=',')
data
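
To illustrate the "only numerical data" caveat, the sketch below calls `np.loadtxt` on in-memory text instead of `MNIST.txt`. The sample values and column names are invented for the example. By default `np.loadtxt` converts every field to a float, so a text field raises a `ValueError` unless you ask for strings with `dtype=str`.

```python
import io
import numpy as np

# Hypothetical stand-in for a purely numeric, comma-delimited file
numeric = io.StringIO("5,0,0,12\n3,7,0,0\n")
data = np.loadtxt(numeric, delimiter=",")
print(data.dtype, data.shape)

# A header row of text cannot be converted to float
mixed_text = "Time,Percent\n99,0.067\n"
try:
    np.loadtxt(io.StringIO(mixed_text), delimiter=",")
    conversion_failed = False
except ValueError:
    conversion_failed = True
print(conversion_failed)

# Reading every field as a string avoids the conversion entirely
as_text = np.loadtxt(io.StringIO(mixed_text), delimiter=",", dtype=str)
print(as_text)
```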

Customizing your NumPy import

import numpy as np
filename = 'MNIST_header.txt'

# skiprows=1 skips the header row
data = np.loadtxt(filename, delimiter=',', skiprows=1)
print(data)

# usecols=[0, 2] keeps only the first and third columns
data = np.loadtxt(filename, delimiter=',', skiprows=1, usecols=[0, 2])
print(data)
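
The effect of `skiprows` and `usecols` is easier to see on a tiny input. The sketch below uses made-up data in a `StringIO` buffer in place of `MNIST_header.txt`. The column names and values are assumptions for illustration only.

```python
import io
import numpy as np

# Hypothetical file contents: one header row, then two rows of numbers
raw = "label,pixel0,pixel1\n1,0,255\n7,128,64\n"

# Skip the header row and keep all three columns
no_header = np.loadtxt(io.StringIO(raw), delimiter=",", skiprows=1)
print(no_header.shape)

# Skip the header row and keep only columns 0 and 2
subset = np.loadtxt(io.StringIO(raw), delimiter=",", skiprows=1, usecols=[0, 2])
print(subset)
```

Column indices in `usecols` are zero-based, so `[0, 2]` selects `label` and `pixel1` here.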