Python Data Science Toolbox (Part 1)
Run the hidden code cell below to import the data used in this course.
# Import the course packages
import pandas as pd
from functools import reduce
# Import the dataset
tweets = pd.read_csv('datasets/tweets.csv')

Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
# Define count_entries()
def count_entries(df, col_name='lang'):
    """Return a dictionary with counts of
    occurrences as value for each key."""
    # Initialize an empty dictionary: cols_count
    cols_count = {}
    # Extract column from DataFrame: col
    col = df[col_name]
    # Iterate over the column in DataFrame
    for entry in col:
        # If entry is in cols_count, add 1
        if entry in cols_count:
            cols_count[entry] += 1
        # Else add the entry to cols_count, set the value to 1
        else:
            cols_count[entry] = 1
    # Return the cols_count dictionary
    return cols_count
# Call count_entries(): result1
# (the DataFrame loaded in the first cell is named tweets)
result1 = count_entries(tweets)
# Call count_entries(): result2
result2 = count_entries(tweets, 'source')
# Print result1 and result2
print(result1)
print(result2)

Explore Datasets
Use the DataFrame imported in the first cell to explore the data and practice your skills!
- Write a function that takes a timestamp (see column timestamp_ms) and returns the text of any tweet published at that timestamp. Additionally, make it so that users can pass column names as flexible arguments (*args) so that the function can print out any other columns users want to see.
- In a filter() call, write a lambda function to return tweets created on a Tuesday. Tip: look at the first three characters of the created_at column.
- Make sure to add error handling on the functions you've created!
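
One possible starting point for the exercises above is sketched below. It is not a reference solution. The `sample` DataFrame is a made-up stand-in so the cell runs on its own, the function name `tweets_at` is hypothetical, and the sketch assumes the tweet body lives in a column called `text` and that `created_at` strings start with a three-letter weekday such as `Tue`.

```python
import pandas as pd

# Made-up stand-in for the course data; swap in the tweets DataFrame
# loaded from datasets/tweets.csv. The 'text' column name is an assumption.
sample = pd.DataFrame({
    'timestamp_ms': [1459294817758, 1459294817806, 1459381217000],
    'created_at': ['Tue Mar 29 23:40:17 +0000 2016',
                   'Tue Mar 29 23:40:17 +0000 2016',
                   'Wed Mar 30 23:40:17 +0000 2016'],
    'text': ['first tweet', 'second tweet', 'third tweet'],
    'lang': ['en', 'en', 'et'],
})


def tweets_at(df, timestamp, *args):
    """Return the text (plus any columns named in *args) of tweets
    published at the given timestamp_ms value."""
    # Always include 'text'; append the extra columns the caller asked for
    columns = ['text'] + [name for name in args if name != 'text']
    # Error handling: fail early with a clear message on unknown columns
    missing = [name for name in ['timestamp_ms'] + columns
               if name not in df.columns]
    if missing:
        raise ValueError(f"DataFrame is missing column(s): {missing}")
    matches = df.loc[df['timestamp_ms'] == timestamp, columns]
    # Error handling: an unknown timestamp is reported, not silently ignored
    if matches.empty:
        raise ValueError(f"No tweet found at timestamp {timestamp}")
    return matches


# Keep (created_at, text) pairs whose created_at starts with 'Tue'
pairs = zip(sample['created_at'], sample['text'])
tuesday_pairs = filter(lambda pair: str(pair[0])[:3] == 'Tue', pairs)
tuesday_texts = [text for _, text in tuesday_pairs]

print(tweets_at(sample, 1459294817758, 'lang'))
print(tuesday_texts)
```

Raising `ValueError` is only one way to handle bad input; returning an empty DataFrame or printing a message would also satisfy the exercise, depending on how the function is meant to be used.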