Skip to content
import pandas as pd


df= pd.read_csv('pokemon_data.csv')

#print (df.tail(3))

SQL (Importing a Dataframe and CSV file)

Spinner
DataFrameas
df
variable
select *
from df
limit 100

Importing pandas and files (CSV, xlsx and txt file)

###https://www.youtube.com/watch?v=vmEHCJofslg&list=LL

#another function of pandas is that you can insert xlsx files


df_xlsx = pd.read_excel('pokemon_data.xlsx')
df_xlsx
# lastly, you can add text files but you need to add a delimiter function '\t' to get columns as seen as in the CSV.

df_text=pd.read_csv('pokemon_data.txt', delimiter='\t')
df_text

Reading Data in Pandas

## Read headers
#df.columns

## Read a specific column
#df[['Name','Type 1', 'Speed']]

## Reach each row
#df.head()
#df.iloc[1]
#df.iloc[0:4]

##Read a specific location, integer based (R,C): Venusaur 
#df.iloc[2,1]

##Read a specific location, non-integer based (R,C)
#df.loc[df['Type 1']]

##to specify for ALL outputs of columns/row use function:
#for index, row in df.iterrows():
#print(index, row[['Name','Type 1']])
   
for index, row in df.iterrows():
    print(index, row['Name'])

##Read a specific location, non-integer based (R,C)
df.head()

#conditional statements, for this location function, I am only interested in values where there is Fire in Type 1
df.loc[df['Type 1'] =='Grass']

Sorting and describing the data

##quick stats (count, mean, std, min, 25%, 50%, 75%, max)
df.describe()

##I am only interested in a specific column ('Name') and I would like them descending. In order to get them descending, I need to attribute a False statement to ascending

##descending values for 'Name' column
#df.sort_values('Name', ascending=False)

##ascending values for 'Name' column
df.sort_values('Name', ascending=True)

Making changes to our data (adding columns/ rearranging and dropping columns)

## helps visualize our headers/columns
#df.head(5)

##add a column
#df['Total']= df['HP'] + df['Attack']+ df['Sp. Atk']
#df.head(5)

##summing multiple columns to create a new column
#axis=1: add horizontally, axis=0: add vertically
df['Total']= df.iloc[:,4:10].sum(axis=1)
#df.head()


##double checking our values are good - always check! 
#45+49+49+65+65+45 = 318, but columns 4:9 = 273! 
#you need to include the next column for column 9's value to count 

##re-arranging columns
cols=list(df.columns.values)
df_new=df[cols[0:4]+ [cols[-1]]+ cols[4:12]]
df_new


##dropping columns
#df_drop=df.drop(columns='Attack')
#df_drop.head()

Saving our Data (CSV, Excel, TXT)

##saving as a CSV file
##index=False gets rid of the index column
df_new.to_csv('modified.csv', index=False)

##saving to excel file
df_new.to_excel('modified.xlsx', index=False)

df_new.to_csv('modified.txt', index=False)