Introduction to Data Science in Python

    Run the hidden code cell below to import the data used in this course.

    # Importing pandas and numpy
    import numpy as np
    import pandas as pd
    
    # Importing the course datasets
    frequencies = pd.read_csv("datasets/all_frequencies.csv")
    records = pd.read_csv("datasets/cell_phone_records.csv")
    credit = pd.read_csv("datasets/credit_records.csv")
    ransom = pd.read_csv("datasets/ransom.csv")
    gravel = pd.read_csv("datasets/shoe_gravel_sample.csv")

    Take Notes

    Add notes about the concepts you've learned and code cells with code you want to keep.

    # Add your code snippets here
    
    import numpy as np  # np is the conventional alias for numpy
    # The pandas module is used to create DataFrames and to read comma-separated (CSV) files.
    # To take a first look at a DataFrame, use the following methods and attributes:
    
    frequencies.head()  # head() returns the first rows of the DataFrame (5 by default).
    frequencies.tail()  # tail() returns the last rows of the DataFrame (5 by default).
    frequencies.info()  # info() prints the column names, data types, non-null counts, and index range.
    frequencies.shape   # shape is an attribute (no parentheses) holding (rows, columns). For example, (22, 33) means 22 rows and 33 columns.
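
The inspection calls above can be tried without the course files. The following is a minimal sketch that assumes a small made-up DataFrame (`toy`, with hypothetical `letter` and `frequency` columns) in place of the real CSV data:

```python
import pandas as pd

# Hypothetical toy data standing in for the course CSV files.
toy = pd.DataFrame({
    "letter": ["A", "B", "C", "D", "E", "F", "G"],
    "frequency": [7.0, 1.0, 3.0, 4.0, 12.0, 2.0, 2.0],
})

first_rows = toy.head()      # first 5 rows by default
last_two = toy.tail(2)       # pass a number to change how many rows are returned
n_rows, n_cols = toy.shape   # shape is an attribute, so no parentheses

print(first_rows)
print(last_two)
print(n_rows, n_cols)
toy.info()                   # prints column names, dtypes, and non-null counts
```

Note that `head()` and `tail()` return new DataFrames, while `info()` prints its summary directly.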
    
    
    
    
    
    # Selecting specific columns from a pandas DataFrame.
    
    # Load a CSV file to use in the following examples.
    
    import pandas as pd
    
    gravel = pd.read_csv("datasets/shoe_gravel_sample.csv")
    print(gravel)
    
    # Gathering some information about the DataFrame:
    gravel.head()
    gravel.tail()
    gravel.info()
    gravel.shape
    
    # Dot notation works when the column name is a valid Python identifier (letters, digits, and underscores, not starting with a digit) and does not clash with an existing DataFrame attribute or method.
    gravel.radius
    
    # Bracket notation is the other way to select a column.
    # It is required when the column name contains spaces or special characters.
    gravel_radius = gravel['radius']  # The column name must be a quoted string; without the quotes, Python treats radius as a variable name.
    print(gravel_radius)
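
To see when bracket notation is actually needed, here is a small sketch. The DataFrame `samples` and its column names (`sample id`, `shape`) are made up for illustration and are not taken from the course data:

```python
import pandas as pd

# Hypothetical data; the column names are invented for this example.
samples = pd.DataFrame({
    "radius": [1.2, 0.8, 1.5],
    "sample id": [101, 102, 103],          # name contains a space
    "shape": ["round", "flat", "round"],   # name clashes with DataFrame.shape
})

radius_dot = samples.radius           # works: valid identifier, no clash
radius_brackets = samples["radius"]   # returns the same Series
sample_ids = samples["sample id"]     # brackets are required because of the space
shape_column = samples["shape"]       # samples.shape would give the (rows, columns) tuple

print(radius_dot.equals(radius_brackets))
print(samples.shape)
print(shape_column.tolist())
```

Bracket notation works for every column name, so it is the safer default when the names are not known in advance.
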
    # Logical statements in Python.
    # A Boolean value is either True or False.
    # Python has several comparison operators:
    # Greater than or equal to: >=
    # Less than or equal to: <=
    # Greater than: >
    # Less than: <
    # Equal to: ==
    # Not equal to: !=
    
    distance_A = 200 # km
    distance_B = 160 # km
    
    scenario_1 = distance_A == distance_B  # check whether they are equal.
    print(scenario_1)  # False
    scenario_2 = distance_A <= distance_B  # check whether distance_A is less than or equal to distance_B.
    print(scenario_2)  # False
    scenario_3 = distance_A >= distance_B  # check whether distance_A is greater than or equal to distance_B.
    print(scenario_3)  # True
    
    # For strings:
    
    name = 'Ahmet'
    name_2 = 'Mustafa'
    
    print(name != name_2)  # True: the two strings differ.
    print(name == name_2)  # False
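
As a compact recap of the comparison operators, the sketch below evaluates all six on the same two distances and adds one string comparison. The dictionary `comparisons` is only a convenience for printing and is not part of the course code:

```python
distance_A = 200  # km
distance_B = 160  # km

# Each comparison evaluates to a Boolean (True or False).
comparisons = {
    "A == B": distance_A == distance_B,
    "A != B": distance_A != distance_B,
    "A > B": distance_A > distance_B,
    "A < B": distance_A < distance_B,
    "A >= B": distance_A >= distance_B,
    "A <= B": distance_A <= distance_B,
}

for label, result in comparisons.items():
    print(label, result)

# String comparison is case-sensitive.
same_case = "Ahmet" == "Ahmet"
different_case = "Ahmet" == "ahmet"
print(same_case, different_case)
```
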
    # Creating plots with matplotlib in Python.
    
    # First, import matplotlib's pyplot module.
    import pandas as pd
    from matplotlib import pyplot as plt
    
    # Load the data into a DataFrame:
    ransom = pd.read_csv("datasets/ransom.csv")
    
    # Gathering information about the DataFrame.
    ransom.head()
    ransom.tail()
    ransom.info()
    ransom.shape
    ransom.describe()
    
    # Creating a plot and adding labels:
    
    plt.plot(ransom.letter_index, ransom.letter)
    plt.xlabel("Letter Index")
    plt.ylabel("Letter")
    plt.title("Ransom")
    # Show the plot.
    plt.show()
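
The same plotting steps can be sketched without the course CSV. The example below assumes a made-up DataFrame (`letters`) with hypothetical values, selects the non-interactive `Agg` backend so it also runs without a display, and saves the figure to a hypothetical file name instead of showing it:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
from matplotlib import pyplot as plt
import pandas as pd

# Hypothetical toy data; the values are invented for illustration.
letters = pd.DataFrame({
    "letter": ["A", "B", "C", "D"],
    "frequency": [7, 1, 3, 4],
})

# Plot one column against another, then label the axes and the figure.
plt.plot(letters["letter"], letters["frequency"], marker="o")
plt.xlabel("Letter")
plt.ylabel("Frequency")
plt.title("Letter frequencies (toy data)")

# Write the figure to disk; the file name is an arbitrary choice.
plt.savefig("toy_letter_frequencies.png")
```

In a notebook, the `matplotlib.use("Agg")` line can be dropped and `plt.savefig(...)` replaced with `plt.show()` to display the plot inline.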

    Tomorrow, create a notebook with "Use a Dataset" and add all the details yourself.