Streamlined Data Ingestion with pandas

Run the hidden code cell below to import the packages used in this course.



Take Notes

PART 1: IMPORTING DATA FROM FLAT FILES

A DataFrame is one way of representing data. It is a pandas structure designed for two-dimensional data, organized in rows and columns. While it is possible to create a DataFrame from scratch, the more common task is importing data of different types from several sources. Add notes about the concepts you've learned and code cells with code you want to keep.
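As a minimal sketch of building a DataFrame from scratch, the cell below creates one from a dictionary of columns. The column names and values are made up purely for illustration and do not come from the course datasets.

```python
import pandas as pd

# Hypothetical toy data: each dictionary key becomes a column,
# and each list holds that column's values.
records = {
    "name": ["a", "b", "c"],
    "value": [1, 2, 3],
}

df = pd.DataFrame(records)

# shape is (number of rows, number of columns)
print(df.shape)
print(df.dtypes)
```

In practice this pattern is mostly useful for small examples and tests; real data usually arrives through one of the pandas read functions.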

FLAT FILES

Flat files store data as plain text, with values separated by a delimiter such as a comma or a tab. They can be loaded into Python with read_csv() from the pandas package.
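The sketch below shows read_csv() reading the same small table twice, once comma-separated and once tab-separated. The text is an in-memory stand-in for a file on disk, and the column names id and score are assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical flat-file contents; a real call would pass a file path instead.
csv_text = "id,score\n1,120\n2,85\n"
tsv_text = "id\tscore\n1\t120\n2\t85\n"

# The default delimiter is a comma.
comma_df = pd.read_csv(io.StringIO(csv_text))

# For tab-separated data, pass the delimiter through the sep argument.
tab_df = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(comma_df)
print(comma_df.equals(tab_df))
```

Both calls should produce the same DataFrame, since only the delimiter differs between the two inputs.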

Add your notes here

# Add your code snippets here

Explore Datasets

Try using the prompt below to explore the data and practice your skills!

There are three data files in the datasets/ directory of varying kinds: data.db (NYC weather and 311 housing complaints), fcc-new-coder-survey.xlsx (FreeCodeCamp New Developer Survey response subset), and vt_tax_data_2016.csv (Vermont tax return data by ZIP code).

Import each of these files into a format useful for data analysis in Python, such as specifying data types, handling bad or missing data, and parsing dates. You can also practice importing a specific portion of data from the data files (e.g., certain sheet(s) of an Excel worksheet or using SQL to filter a database on conditions).
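As a starting point, here is a hedged sketch of two of the techniques the prompt mentions: setting data types, missing-value markers, and date parsing in read_csv(), and filtering a database with SQL. It runs on small in-memory stand-ins rather than the real files, so every column name, table name, and value below is an assumption and will likely differ from what the datasets actually contain.

```python
import io
import sqlite3

import pandas as pd

# --- Flat file: data types, missing values, and dates ---
# Hypothetical CSV text standing in for a file on disk.
raw = (
    "zipcode,filed_on,amount\n"
    "05001,2016-01-15,100\n"
    "05002,2016-02-20,UNKNOWN\n"
    "05003,2016-03-05,250\n"
)

tax = pd.read_csv(
    io.StringIO(raw),
    dtype={"zipcode": str},              # keep leading zeros in ZIP codes
    na_values={"amount": ["UNKNOWN"]},   # treat this marker as missing
    parse_dates=["filed_on"],            # convert to datetime
)
print(tax.dtypes)

# --- Database: filter rows with SQL before loading ---
# Hypothetical in-memory table; the real data.db schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE complaints (id INTEGER, borough TEXT)")
conn.executemany(
    "INSERT INTO complaints VALUES (?, ?)",
    [(1, "BRONX"), (2, "QUEENS"), (3, "BRONX")],
)
conn.commit()

# Only rows matching the WHERE clause are loaded into the DataFrame.
bronx = pd.read_sql_query(
    "SELECT * FROM complaints WHERE borough = ?",
    conn,
    params=("BRONX",),
)
conn.close()
print(bronx)
```

To try this on the real files, replace the in-memory text with the path to the CSV file and connect to the database file instead of ":memory:". For the Excel file, pandas provides read_excel(), whose sheet_name argument selects specific sheets.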