Data Manipulation with pandas
Run the hidden code cell below to import the data used in this course.
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
Add your notes here
# Add your code snippets here
import pandas as pd
walmart = pd.read_csv('datasets/walmart.csv', index_col=0)
print(walmart.head())
print(walmart.describe())
print(walmart.shape)
walmart.info()  # info() prints its report itself and returns None
print(walmart.index)
print(walmart.columns)
Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- Print the highest weekly sales for each department in the walmart DataFrame. Limit your results to the top five departments, in descending order. If you're stuck, try reviewing this video.
- What was the total nb_sold of organic avocados in 2017 in the avocado DataFrame? If you're stuck, try reviewing this video.
- Create a bar plot of the total number of homeless people by region in the homelessness DataFrame. Order the bars in descending order. Bonus: create a horizontal bar chart. If you're stuck, try reviewing this video.
- Create a line plot with two lines representing the temperatures in Toronto and Rome. Make sure to properly label your plot. Bonus: add a legend for the two lines. If you're stuck, try reviewing this video.
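The first two exercises come down to a grouped aggregation and a filtered sum. Here is a minimal sketch on small made-up tables rather than the course data. The column names `weekly_sales`, `type`, and `year` are assumptions about the course CSVs, and all values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the walmart DataFrame (values are made up).
walmart_demo = pd.DataFrame({
    "department": [1, 1, 2, 2, 3, 3, 4, 5, 6],
    "weekly_sales": [100.0, 250.0, 80.0, 90.0, 400.0, 50.0, 10.0, 300.0, 20.0],
})

# Highest weekly sales per department, top five, in descending order.
top_departments = (
    walmart_demo.groupby("department")["weekly_sales"]
    .max()
    .sort_values(ascending=False)
    .head(5)
)
print(top_departments)

# Hypothetical stand-in for the avocado DataFrame (values are made up).
avocado_demo = pd.DataFrame({
    "type": ["organic", "organic", "conventional", "organic"],
    "year": [2017, 2017, 2017, 2016],
    "nb_sold": [10.0, 5.0, 100.0, 7.0],
})

# Combine two conditions with &, then sum nb_sold over the matching rows.
is_organic_2017 = (avocado_demo["type"] == "organic") & (avocado_demo["year"] == 2017)
organic_2017_total = avocado_demo.loc[is_organic_2017, "nb_sold"].sum()
print(organic_2017_total)
```

With the real data, the same two expressions should apply once the demo frames are swapped for `walmart` and `avocado`, provided the column names match.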
import pandas as pd
homelessness = pd.read_csv('datasets/homelessness.csv', index_col=0)
# Data exploration
print("Data exploration")
print(homelessness.head())
homelessness.info()  # info() prints its report itself and returns None
print(homelessness.columns)
print(homelessness.index)
# Assignments
print("""
-----------------------------
Sort by a single variable
-----------------------------
""")
homelessness_ind = homelessness.sort_values('individuals')
print(homelessness_ind)
print("""
--------------------------------------------
Sort by two variables with different sort orders
--------------------------------------------
""")
homelessness_reg_fam = homelessness.sort_values(['region', 'family_members'], ascending=[True, False])
print(homelessness_reg_fam)
print("""
--------------------------------------------
Subset one column
--------------------------------------------
""")
individuals = homelessness['individuals']
print(individuals)
print("""
----------------------
""")
state = homelessness['state']
print(state)
print("""
--------------------------------------------
Subset two columns
--------------------------------------------
""")
ind_state = homelessness[['individuals', 'state']]
print(ind_state)
print("""
--------------------------------------------
Filter by values
--------------------------------------------
""")
# Keep only the rows of the two-column subset with more than 10,000 individuals
ind_state_gt_10k = ind_state[ind_state['individuals'] > 10000]
print(ind_state_gt_10k)
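The sorting and subsetting steps in the cell above can be tried on a tiny table without the course CSV. This is a sketch only: the `demo` DataFrame and its numbers are invented for illustration, and only the column names mirror the ones used above. It also shows that single brackets return a Series while double brackets return a DataFrame.

```python
import pandas as pd

# Hypothetical miniature of the homelessness table (values are made up).
demo = pd.DataFrame({
    "region": ["Pacific", "Mountain", "Pacific", "Mountain"],
    "state": ["Oregon", "Utah", "California", "Nevada"],
    "individuals": [300, 100, 900, 200],
    "family_members": [30, 10, 90, 20],
})

# Sort by region ascending, then by family_members descending within a region.
by_region_family = demo.sort_values(
    ["region", "family_members"], ascending=[True, False]
)
print(by_region_family)

# Single brackets select one column as a Series.
one_column = demo["individuals"]
print(type(one_column).__name__)

# Double brackets select a list of columns as a DataFrame.
two_columns = demo[["individuals", "state"]]
print(type(two_columns).__name__)
```

Passing a list to `ascending` sets the direction per sort key, in the same order as the column list.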
import pandas as pd
homelessness = pd.read_csv('datasets/homelessness.csv')
# Assignments
print("""
-----------------------------
Filter by quantity
-----------------------------
""")
# Use the boolean condition to select rows, not just to build the mask
ind_gt_10k = homelessness[homelessness['individuals'] > 10000]
print(ind_gt_10k)
print("""
-----------------------------
Filter by state name
-----------------------------
""")
alabama_reg = homelessness[homelessness["state"] == "Alabama"]
print(alabama_reg)
homelessness.info()  # info() prints its report itself and returns None
print("""
-----------------------------
Filter by name and quantity
-----------------------------
""")
# Fewer than 1,000 family members (matching the variable name) in the Pacific region
fam_lt_1k_pac = homelessness[(homelessness['family_members'] < 1000) & (homelessness['region'] == 'Pacific')]
print(fam_lt_1k_pac)
print("""
-----------------------------
Filter for two values of the same column with isin
-----------------------------
""")
# Call isin on the region column and use the resulting mask to select rows
south_mid_atlantic_isin = homelessness[homelessness['region'].isin(['South Atlantic', 'Mid-Atlantic'])]
print(south_mid_atlantic_isin)
print("""
-----------------------------
Filter for two values of the same column with the | operator
-----------------------------
""")
south_mid_atlantic = homelessness[(homelessness['region'] == 'South Atlantic') | (homelessness['region'] == 'Mid-Atlantic')]
print(south_mid_atlantic)
print("""
-----------------------------
Filter by list
-----------------------------
""")
canu = ["California", "Arizona", "Nevada", "Utah"]
mojave_homelessness_isin2 = homelessness[homelessness['state'].isin(canu)]
print(mojave_homelessness_isin2)
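As a quick check on the filters above, the following sketch uses a small invented table (the `demo` DataFrame and its values are hypothetical, not the course data) to show that a comparison yields a boolean mask, that the mask selects rows when passed back into the DataFrame, and that `|` and `.isin()` select the same rows.

```python
import pandas as pd

# Hypothetical miniature of the homelessness table (values are made up).
demo = pd.DataFrame({
    "region": ["South Atlantic", "Mid-Atlantic", "Pacific", "Mountain"],
    "state": ["Georgia", "New York", "California", "Utah"],
    "family_members": [120, 40, 300, 10],
})

# A comparison on a column returns a boolean Series (the mask), not rows.
mask = demo["family_members"] > 50
print(mask)

# Indexing the DataFrame with the mask keeps only the rows where it is True.
large = demo[mask]
print(large)

# Two conditions on the same column, written with | and with isin.
with_or = demo[(demo["region"] == "South Atlantic") | (demo["region"] == "Mid-Atlantic")]
with_isin = demo[demo["region"].isin(["South Atlantic", "Mid-Atlantic"])]
print(with_or.equals(with_isin))

# Conditions on different columns are combined with &, each in parentheses.
small_pacific = demo[(demo["family_members"] < 1000) & (demo["region"] == "Pacific")]
print(small_pacific)
```

The parentheses around each condition matter because `&` and `|` bind more tightly than comparison operators in Python.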