Introduction to Importing Data in Python

    Run the hidden code cell below to import the data used in this course.

    # Import the course packages
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.io
    import h5py
    from sas7bdat import SAS7BDAT
    from sqlalchemy import create_engine, text
    import pickle
    
    # Import the course datasets
    titanic = pd.read_csv("datasets/titanic_sub.csv")
    battledeath_2002 = pd.ExcelFile("datasets/battledeath.xlsx").parse("2002")
    engine = create_engine('sqlite:///datasets/Chinook.sqlite')
    con = engine.connect()
    # Wrap the raw SQL in text(); SQLAlchemy 2.x no longer accepts a plain string here
    rs = con.execute(text('SELECT * FROM Album'))
    chinook = pd.DataFrame(rs.fetchall())
    seaslug = np.loadtxt("datasets/seaslug.txt", delimiter="\t", dtype=str)

    Explore Datasets

    Try importing the remaining files to explore the data and practice your skills!

    • datasets/disarea.dta
    • datasets/ja_data2.mat
    • datasets/L-L1_LOSC_4_V1-1126259446-32.hdf5
    • datasets/mnist_kaggle_some_rows.csv
    • datasets/sales.sas7bdat

    Take Notes

    Add notes about the concepts you've learned and code cells with code you want to keep.

    Reading a text file

    filename = 'huck_finn.txt'
    file = open(filename, mode='r')  # 'r' opens the file in read mode
    text = file.read()               # read the whole file into one string
    file.close()                     # close the file once you are done with it
    

    Print a text file

    print(text)

    Using a context manager (with)

    with open('huck_finn.txt', 'r') as file:
        print(file.read())
        
    # Read & print the first 3 lines
    with open('moby_dick.txt') as file:
        print(file.readline())
        print(file.readline())
        print(file.readline())
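
Calling readline() several times is fine for a handful of lines. For an arbitrary number of lines, iterating over the file object avoids the repetition. The following is a minimal sketch, assuming a small temporary file as a stand-in for moby_dick.txt; the helper name read_first_lines is hypothetical and not part of the course.

```python
import os
import tempfile
from itertools import islice


def read_first_lines(path, n):
    """Return the first n lines of a text file, without trailing newlines."""
    with open(path, mode="r") as file:
        # A file object iterates over its lines; islice stops after n of them.
        return [line.rstrip("\n") for line in islice(file, n)]


# Write a small sample file so the sketch runs without the course data.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line one\nline two\nline three\nline four\n")
    sample_path = tmp.name

first_three = read_first_lines(sample_path, 3)
print(first_three)

# Clean up the temporary file.
os.remove(sample_path)
```

Because the file is opened inside a with block, it is closed automatically even if reading raises an error.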

    Importing flat files using NumPy (only numerical data)

    import numpy as np
    filename = 'MNIST.txt'
    data = np.loadtxt(filename, delimiter=',')
    data
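
To try np.loadtxt without the course files, the same call can be pointed at an in-memory text buffer. This is a small sketch, assuming made-up comma-separated values rather than the real contents of MNIST.txt.

```python
import io

import numpy as np

# Stand-in for a comma-separated numeric file such as MNIST.txt (values are made up).
sample = io.StringIO("1,2,3\n4,5,6\n")

# loadtxt accepts a file name or a file-like object; values are parsed as floats by default.
data = np.loadtxt(sample, delimiter=",")

print(data.shape)  # (2, 3)
print(data.dtype)  # float64
```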

    Customizing your NumPy import

    import numpy as np
    filename = 'MNIST_header.txt'

    # Skip the first row (the header)
    data = np.loadtxt(filename, delimiter=',', skiprows=1)
    print(data)

    # Skip the first row and keep only the first and third columns
    data = np.loadtxt(filename, delimiter=',', skiprows=1, usecols=[0, 2])
    print(data)
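
The effect of skiprows, usecols, and dtype is easier to see on a tiny example. The sketch below assumes a made-up three-column table with a header row; the column names and values are illustrative only and do not come from MNIST_header.txt.

```python
import io

import numpy as np

# Made-up sample: one header row followed by two numeric rows.
raw = "label,pixel1,pixel2\n5,0,255\n3,128,64\n"

# skiprows=1 drops the header row, which cannot be parsed as a number.
numbers = np.loadtxt(io.StringIO(raw), delimiter=",", skiprows=1)

# usecols=[0, 2] keeps only the first and third columns.
subset = np.loadtxt(io.StringIO(raw), delimiter=",", skiprows=1, usecols=[0, 2])

# dtype=str reads every field as text, so the header row can stay in.
as_text = np.loadtxt(io.StringIO(raw), delimiter=",", dtype=str)

print(numbers.shape)  # (2, 3)
print(subset)
print(as_text[0])
```

Reading everything as strings, as the course setup cell does for seaslug.txt, is a simple way to load a file that mixes text and numbers, at the cost of converting the numeric columns yourself afterwards.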