strutting python files
What info is in the file? pandas dictionary:
- .shape = how big the data is, as (rows, columns); it is an attribute, so no parentheses
- .info() = which variables and data types are present
- .tail() = see the last rows of the data
- .head() = see the columns and the first rows
- .describe() = summary stats on numerical values
- df[categorical].describe() = summary stats on categorical values, where categorical = df.dtypes[df.dtypes == 'object'].index
- list(df.columns) = list the column names
- df.dtypes = data type of each column (integer, float, object, etc.)
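A minimal sketch of the calls listed above, run on a small made-up DataFrame. The column names and values are illustrative assumptions, not taken from any real file. The categorical columns are picked with select_dtypes, a swapped-in alternative to the dtypes == 'object' filter that does not depend on how pandas stores strings.

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate the inspection calls.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "genre": ["Sports", "Racing", "Sports", "Puzzle"],
    "year": [2006, 2008, 2009, 2010],
    "sales": [41.5, 35.8, 33.0, 30.3],
})

n_rows, n_cols = df.shape            # attribute, not a method call
df.info()                            # prints column names, non-null counts, dtypes
first_rows = df.head(2)              # first two rows
last_rows = df.tail(1)               # last row
numeric_stats = df.describe()        # count, mean, std, min, quartiles, max

# Non-numeric columns, then their summary (count, unique, top, freq)
categorical = df.select_dtypes(exclude="number").columns
categorical_stats = df[categorical].describe()

column_names = list(df.columns)
print(df.dtypes)
```

head() and tail() default to five rows when called without an argument.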
import pandas as pd
df = pd.read_excel("videogamesales.xlsx")
print(df.columns)
df.info()  # prints its summary directly and returns None, so there is nothing to store or print
print(df.describe()['Rank'])  # summary stats for one column of interest
grouped_df = df.groupby(['Publisher']).max()  # column-wise maximum within each publisher
print(grouped_df['Genre'])  # interested in specific values
print(df[df['Year'] > 2008])  # rows with Year after 2008
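The groupby and filter steps above can be sketched on a few made-up rows. The publisher names, titles, and sales figures here are hypothetical stand-ins for the spreadsheet, chosen only so the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical rows standing in for the video game sales spreadsheet.
games = pd.DataFrame({
    "Name": ["Alpha", "Bravo", "Charlie", "Delta"],
    "Publisher": ["Acme", "Acme", "Zenith", "Zenith"],
    "Year": [2007, 2010, 2008, 2012],
    "Global_Sales": [5.0, 8.2, 3.1, 4.4],
})

# Largest Global_Sales value within each publisher
max_sales = games.groupby("Publisher")["Global_Sales"].max()
print(max_sales)

# Boolean mask: keep only rows released after 2008
recent = games[games["Year"] > 2008]
print(recent)
```

Note that groupby(...).max() on a whole DataFrame takes the maximum of each column separately, so one row of the result can mix values from different original rows. For text columns the maximum is the alphabetically last value.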
# calculating median, mean, mode
import pandas as pd
import numpy as np
df2 = pd.read_excel('videogamesales.xlsx')
df2_median = np.nanmedian(df2['EU_Sales'])  # median, ignoring NaN
print(df2_median)
df3 = pd.read_excel('videogamesales.xlsx')
df3_mean = np.nanmean(df3['EU_Sales'])  # mean, ignoring NaN
print(df3_mean)
df3_mode = df3['EU_Sales'].mode()  # most frequent value(s); NaN is skipped by default
print(df3_mode)

Reshape pandas DataFrame using pivot and melt
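As a self-contained sketch of both reshapes, here is a round trip on made-up data: long to wide with pivot, then wide back to long with melt. The dates, names, and sales values are assumptions for illustration and are not the contents of the spreadsheets used in these notes.

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, name) pair.
long_df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"],
    "name": ["ann", "bob", "ann", "bob"],
    "sales": [10, 20, 30, 40],
})

# Long to wide: one row per date, one column per name
wide_df = long_df.pivot(index="date", columns="name", values="sales")
print(wide_df)

# Wide back to long: the name columns are stacked into name/sales pairs
back_to_long = wide_df.reset_index().melt(
    id_vars="date", var_name="name", value_name="sales"
)
print(back_to_long)
```

pivot needs every (index, columns) pair to be unique and raises a ValueError otherwise; pivot_table is the variant that aggregates duplicates.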
import pandas as pd
df = pd.read_excel("pivot_data_1.xlsx")
print(df)
df1 = df.pivot(index="date", columns="name", values="sales")
print(df1)
import pandas as pd
df2 = pd.read_excel("melt_data_1.xlsx")
print(df2)
df3 = df2.melt(id_vars='name')  # wide to long: every column except 'name' becomes variable/value rows
print(df3)
df4 = df3.drop('value', axis=1)  # drop the 'value' column
print(df4)

SQL Tutorial - JOINS
-- result stored as DataFrame variable df5
SELECT * FROM all_weeks_countries;
-- result stored as DataFrame variable df
SELECT * FROM all_weeks_global;
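The two queries above only read each table on its own. A minimal sketch of an actual join between them follows, run through Python's built-in sqlite3 module rather than whatever SQL environment these notes were written in. The table names come from the queries above, but every column name (week, show_title, country_name, weekly_rank, weekly_hours_viewed) and every row is a hypothetical assumption.

```python
import sqlite3

# In-memory database with hypothetical columns and rows for both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE all_weeks_global (
    week TEXT, show_title TEXT, weekly_hours_viewed INTEGER
);
CREATE TABLE all_weeks_countries (
    week TEXT, show_title TEXT, country_name TEXT, weekly_rank INTEGER
);
INSERT INTO all_weeks_global VALUES
    ('2022-01-02', 'Show A', 100),
    ('2022-01-02', 'Show B', 80);
INSERT INTO all_weeks_countries VALUES
    ('2022-01-02', 'Show A', 'Canada', 1),
    ('2022-01-02', 'Show B', 'Canada', 2),
    ('2022-01-02', 'Show C', 'Canada', 3),
    ('2022-01-02', 'Show A', 'Mexico', 1);
""")

# INNER JOIN keeps only rows whose week and title appear in both tables,
# so 'Show C' (missing from the global table) is dropped.
rows = conn.execute("""
SELECT c.country_name, c.show_title, c.weekly_rank, g.weekly_hours_viewed
FROM all_weeks_countries AS c
INNER JOIN all_weeks_global AS g
    ON c.week = g.week AND c.show_title = g.show_title
ORDER BY c.country_name, c.weekly_rank
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

Swapping INNER JOIN for LEFT JOIN would keep the unmatched country row and fill its global column with NULL.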