Python Data Science Toolbox (Part 1)

Run the hidden code cell below to import the data used in this course.

# Import the course packages
import pandas as pd
from functools import reduce

# Import the dataset
tweets = pd.read_csv('datasets/tweets.csv')

Take Notes

Add notes about the concepts you've learned and code cells with code you want to keep.

USER-DEFINED FUNCTIONS

def square(): ---- this is the function header

the indented lines below the header are the function body

call the function like this square()

parameter - added between the parentheses, e.g. def square(value)

return a value from a function using return -- the returned value can then be assigned to a variable

docstrings -- documentation that describes what your function does; placed on the line right after the function header, between triple quotation marks

def square(x):
    """Input an integer value to be squared."""
    new_value = x ** 2
    return new_value

num = square(7)

print(num)

using * with a string and an integer concatenates that many copies of the string, e.g. 'ab' * 2 gives 'abab'

Function bodies need to be indented by a consistent number of spaces and the choice of 4 is common.

Also note that shout(word), the part of the header that specifies the function name and parameter(s), is known as the signature of the function. You may encounter this term in the wild!

a function that only calls print() and has no return statement returns None, so a variable assigned its result has type NoneType; if the function uses return, the variable holds the returned value instead
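A small sketch of the print-versus-return difference noted above. The names shout_print and shout_return are made up for illustration and are not from the course:

```python
def shout_print(word):
    """Print the word with exclamation marks; there is no return statement."""
    print(word + "!!!")


def shout_return(word):
    """Return the word with exclamation marks."""
    return word + "!!!"


printed = shout_print("hi")    # prints the text, but the call evaluates to None
returned = shout_return("hi")  # nothing is printed; the string is handed back

print(type(printed))   # the NoneType class
print(type(returned))  # the str class
```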

assign values to tuple: vals = (2,4,6)

unpack a tuple into several variables: a, b, c = vals

access tuple elements in a similar way to lists
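A quick sketch of the three tuple notes above, reusing vals. The try/except at the end is an extra illustration that tuples, unlike lists, cannot be modified in place:

```python
# Assign values to a tuple
vals = (2, 4, 6)

# Unpack the tuple into several variables
a, b, c = vals

# Access elements by index, just like a list
first = vals[0]
last = vals[-1]

# Unlike a list, a tuple does not allow item assignment
try:
    vals[0] = 10
except TypeError as err:
    error_message = str(err)

print(a, b, c)
print(first, last)
print(error_message)
```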

# Function with 2 parameters that returns two values using a tuple
def double(word1, word2):
    """Function which duplicates and concatenates the words."""
    dupe1 = word1*2
    dupe2 = word2*2
    return (dupe1, dupe2)

ret1, ret2 = double('stuff','otherstuff')

print(ret1)
print(ret2)

# Here dupe1 and dupe2 are local variables.

scope - part of the program where an object or name may be accessible

  • global - defined in the main body of a script or Python program
  • local - defined within a function; once the function has finished executing, these names no longer exist, so they cannot be accessed from outside the function definition
  • built-in - names in Python's pre-defined built-in module, such as print and sum

Python looks for a name in the local scope first; only if it is not found there does it look in the global scope, and if it is not in the global scope either, the built-in scope is searched. Search order: local > global > built-in
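A minimal sketch of that lookup order. The variable name and the three functions are hypothetical, chosen only to show each scope being used:

```python
name = "global value"


def read_global():
    # No local 'name' exists here, so Python falls back to the global scope.
    return name


def shadow_global():
    # Assignment creates a new local 'name'; the global one is untouched.
    name = "local value"
    return name


def use_builtin():
    # 'len' is neither local nor global, so it is found in the built-in scope.
    return len(name)


print(read_global())
print(shadow_global())
print(name)           # still the global value
print(use_builtin())
```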

to alter the value of a global variable inside a function, use the keyword global followed by the name of the variable you want to alter

Nested functions

defining an inner function within an outer function - Python searches the scope of the inner function first and only then the scope of the outer function

useful if you need to perform a function many times within another function

def raise_val(n):
    """Return the inner function."""
    
    def inner(x):
        """Raise x to the power of n."""
        raised = x ** n
        return raised
    
    return inner

square = raise_val(2)
cube = raise_val(3)
print(square(2), cube(4))

Even after raise_val has finished executing, square still remembers n = 2. This is known as a closure.

In nested functions you can use the keyword nonlocal to change names that already exist in an enclosing scope (it cannot create new ones there). A name changed in the inner function is also changed in the enclosing (outer) function.

def outer():
    """Print n before and after inner() changes it."""
    n = 1
    
    def inner():
        nonlocal n
        n = 2
        print(n)
        
    print(n)    
    inner()
    print(n)
    
outer()

Order of scope

local > enclosing functions (if present) > global > builtin (LEGB rule)

Default: assigning names only creates/changes local names. -- Use keywords global and nonlocal to change this behaviour.

Closure - the nested (inner) function remembers the state of its enclosing scope when called; anything defined in the enclosing scope is available to the inner function even after the outer function has finished executing

def echo(n):
    """Return the inner_echo function."""
    
    def inner_echo(word):
        """Concatenate n copies of word."""
        echo_word = word * n
        return echo_word
    
    return inner_echo

print(echo)
print(echo(2))
print(echo(2)("Hiya!"))

twice = echo(2)
print(twice("Hiya!"))

Default and flexible arguments

default argument - in the function header, follow the parameter name with an = sign and the default value, e.g. def power(number, pow=1):
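A sketch of how that header behaves when called. The body is an assumption about what power is meant to do. Note that the parameter name pow hides the built-in pow() inside the function, which is harmless here:

```python
def power(number, pow=1):
    """Raise number to the power pow (pow defaults to 1)."""
    return number ** pow


print(power(9))           # pow not supplied, so the default of 1 is used
print(power(9, 2))        # default overridden positionally
print(power(9, pow=0.5))  # default overridden by keyword
```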

if you don't know how many arguments there will be, use *args, e.g. def add_all(*args) -- this collects all positional arguments passed in the function call into a tuple called args in the function body
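A minimal sketch of add_all, assuming it is meant to sum its arguments. The helper collect is hypothetical and only shows that args arrives as a tuple:

```python
def add_all(*args):
    """Sum all positional arguments passed in."""
    total = 0
    for num in args:   # args is a tuple of everything passed in
        total += num
    return total


def collect(*args):
    """Return args unchanged to show how the arguments are packed."""
    return args


print(add_all(1))
print(add_all(1, 2, 3))
print(collect(1, 2, 3))
print(type(collect()))
```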