Start Learning for Free

Join over 1,000,000 other Data Science learners and start one of our interactive tutorials today!

Topic python small

Python Functions Tutorial

August 8th, 2017 in Python

Functions are an essential part of the Python programming language: you might have already encountered and used some of the many fantastic functions that are built-in in the Python language or that come with its library ecosystem. However, as a Data Scientist, you’ll constantly need to write your own functions to solve problems that your data poses to you.

That’s why this post will introduce you to functions in Python. You’ll cover the following topics:

(To practice further, try DataCamp’s Python Data Science Toolbox (Part 1) Course!)

Functions in Python

You use functions in programming to bundle a set of instructions that you want to use repeatedly or that, because of their complexity, are better self-contained in a sub-program and called when needed. That means that a function is a piece of code written to carry out a specified task. To carry out that specific task, the function might or might not need multiple inputs. When the task is carried out, the function can or can not return one or more values.

There are three types of functions in Python:

  • Built-in functions, such as help() to ask for help, min() to get the minimum value, print() to print an object to the terminal,… You can find an overview with more of these functions here.
  • User-Defined Functions (UDFs), which are functions that users create to help them out; And
  • Anonymous functions, which are also called lambda functions because they are not declared with the standard def keyword.

Functions vs Methods

A method refers to a function which is part of a class. You access it with an instance or object of the class. A function doesn’t have this restriction: it just refers to a standalone function. This means that all methods are functions but not all functions are methods.

Consider this example, where you first define a function plus() and then a Summation class with a sum() method:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGEgZnVuY3Rpb24gYHBsdXMoKWBcbmRlZiBwbHVzKGEsYik6XG4gIHJldHVybiBhICsgYlxuICBcbiMgQ3JlYXRlIGEgYFN1bW1hdGlvbmAgY2xhc3NcbmNsYXNzIFN1bW1hdGlvbihvYmplY3QpOlxuICBkZWYgc3VtKHNlbGYsIGEsIGIpOlxuICAgIHNlbGYuY29udGVudHMgPSBhICsgYlxuICAgIHJldHVybiBzZWxmLmNvbnRlbnRzICJ9

If you now want to call the sum() method that is part of the Summation class, you first need to define an instance or object of that class. So, let’s define such an object:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInByZV9leGVyY2lzZV9jb2RlIjoiZGVmIHBsdXMoYSxiKTpcbiAgcmV0dXJuIGEgKyBiXG4gIFxuY2xhc3MgU3VtbWF0aW9uKG9iamVjdCk6XG4gIGRlZiBzdW0oc2VsZiwgYSwgYik6XG4gICAgc2VsZi5jb250ZW50cyA9IGEgKyBiXG4gICAgcmV0dXJuIHNlbGYuY29udGVudHMgIiwic2FtcGxlIjoiIyBJbnN0YW50aWF0ZSBgU3VtbWF0aW9uYCBjbGFzcyB0byBjYWxsIGBzdW0oKWBcbnN1bUluc3RhbmNlID0gU3VtbWF0aW9uKClcbnN1bUluc3RhbmNlLnN1bSgxLDIpIn0=

Remember that this instantiation not necessary for when you want to call the function plus()! You would be able to execute plus(1,2) in the DataCamp Light code chunk without any problems!

Parameters vs Arguments

Parameters are the names used when defining a function or a method, and into which arguments will be mapped. In other words, arguments are the things which are supplied to any function or method call, while the function or method code refers to the arguments by their parameter names.

Consider the following example and look back to the above DataCamp Light chunk: you pass two arguments to the sum() method of the Summation class, even though you previously defined three parameters, namely, self, a and b.

What happened to self?

The first argument of every class method is always a reference to the current instance of the class, which in this case is Summation. By convention, this argument is called self.

This all means that you don’t pass the reference to self in this case because self is the parameter name for an implicitly passed argument that refers to the instance through which a method is being invoked. It gets inserted implicitly into the argument list.

How To Define A Function: User-Defined Functions (UDFs)

The four steps to defining a function in Python are the following:

  1. Use the keyword def to declare the function and follow this up with the function name.
  2. Add arguments to the function: they should be within the parentheses of the function. End your line with a colon.
  3. Add statements that the functions should execute.
  4. End your function with a return statement if the function should output something. Without the return statement, your function will return an object None.
eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6ImRlZiBoZWxsbygpOlxuICBwcmludChcIkhlbGxvIFdvcmxkXCIpIFxuICByZXR1cm4gIn0=

Of course, your functions will get more complex as you go along: you can add for loops, flow control, … and more to it to make it more finegrained:

def hello():
  name = str(input("Enter your name: "))
  if name:
    print ("Hello " + str(name))
  else:
    print("Hello World") 
  return 
  
hello()

In the above function, you ask the user to give a name. If no name is given, the function will print out “Hello World”. Otherwise, the user will get a personalized “Hello” response.

Remember also that you can define one or more function parameters for your UDF. You’ll learn more about this when you tackle the Function Arguments section. Additionally, you can or can not return one or multiple values as the result of your function.

The return Statement

Note that as you’re printing something in your UDF hello(), you don’t really need to return it. There won’t be any difference between the function above and this one:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6ImRlZiBoZWxsb19ub3JldHVybigpOlxuICBwcmludChcIkhlbGxvIFdvcmxkXCIpICJ9

However, if you want to continue to work with the result of your function and try out some operations on it, you will need to use the return statement to actually return a value, such as a String, an integer, …. Consider the following scenario, where hello() returns a String "hello", while the function hello_noreturn() returns None:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6ImRlZiBoZWxsbygpOlxuICBwcmludChcIkhlbGxvIFdvcmxkXCIpIFxuICByZXR1cm4oXCJoZWxsb1wiKVxuXG5kZWYgaGVsbG9fbm9yZXR1cm4oKTpcbiAgcHJpbnQoXCJIZWxsbyBXb3JsZFwiKVxuICBcbiMgTXVsdGlwbHkgdGhlIG91dHB1dCBvZiBgaGVsbG8oKWAgd2l0aCAyIFxuaGVsbG8oKSAqIDJcblxuIyAoVHJ5IHRvKSBtdWx0aXBseSB0aGUgb3V0cHV0IG9mIGBoZWxsb19ub3JldHVybigpYCB3aXRoIDIgXG5oZWxsb19ub3JldHVybigpICogMiJ9

The second function gives you an error because you can’t perform any operations with a None. You’ll get a TypeError that says that you can’t do the multiplication operation for NoneType (the None that is the result of hello_noreturn()) and int (2).

**Tip:** functions immediately exit when they come across a return statement, even if it means that they won’t return any value:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6ImRlZiBydW4oKTpcbiAgZm9yIHggaW4gcmFuZ2UoMTApOlxuICAgICBpZiB4ID09IDI6XG4gICAgICAgcmV0dXJuXG4gIHByaW50KFwiUnVuIVwiKVxuICBcbnJ1bigpIn0=

Another thing that is worth mentioning when you’re working with the return statement is the fact that you can use it to return multiple values. To do this, you make use of tuples.

Remember that this data structure is very similar to that of a list: it can contain multiple values. However, tuples are immutable, which means that you can’t modify any values that are stored in it! You construct it with the help of double parentheses (). You can unpack tuples into multiple variables with the help of the comma and the assignment operator.

Check out the following example to understand how your function can return multiple values:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGBwbHVzKClgXG5kZWYgcGx1cyhhLGIpOlxuICBzdW0gPSBhICsgYlxuICByZXR1cm4gKHN1bSwgYSlcblxuIyBDYWxsIGBwbHVzKClgIGFuZCB1bnBhY2sgdmFyaWFibGVzIFxuc3VtLCBhID0gcGx1cygzLDQpXG5cbiMgUHJpbnQgYHN1bSgpYFxucHJpbnQoc3VtKSJ9

Note that the return statement return sum, a would have the same result as return (sum, a): the former actually packs sum and a into a tuple under the hood!

How To Call A Function

In the previous sections, you have seen a lot of examples already of how you can call a function. Calling a function means that you execute the function that you have defined - either directly from the Python prompt or through another function (as you will see in the section “Nested Functions”).

Call your newly defined function hello() by simply executing hello(), just like in the DataCamp Light chunk below:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInByZV9leGVyY2lzZV9jb2RlIjoiZGVmIGhlbGxvKCk6XG4gIHByaW50KFwiSGVsbG8gV29ybGRcIikgXG4gIHJldHVybiAiLCJzYW1wbGUiOiJoZWxsbygpIn0=

How To Add Docstrings To A Python Function

Another essential aspect of writing functions in Python: docstrings. Docstrings describe what your function does, such as the computations it performs or its return values. These descriptions serve as documentation for your function so that anyone who reads your function’s docstring understands what your function does, without having to trace through all the code in the function definition.

Function docstrings are placed in the immediate line after the function header and are placed in between triple quotation marks. An appropriate Docstring for your hello() function is ‘Prints “Hello World”’.

def hello():
"""Prints "Hello World".

Returns:
    None
"""
  print("Hello World") 
  return 

Note that docstrings can be quite longer than the one that is given here as an example. If you’d like to study docstrings in more detail, you best check out some Github repositories of Python libraries such as scikit-learn or pandas, where you’ll find plenty of examples!

Function Arguments in Python

Earlier, you learned about the difference between parameters and arguments. In short, arguments are the things which are given to any function or method call, while the function or method code refers to the arguments by their parameter names. There are four types of arguments that Python UDFs can take:

  • Default arguments
  • Required arguments
  • Keyword arguments
  • Variable number of arguments

Default Arguments

Default arguments are those that take a default value if no argument value is passed during the function call. You can assign this default value by with the assignment operator =, just like in the following example:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGBwbHVzKClgIGZ1bmN0aW9uXG5kZWYgcGx1cyhhLGIgPSAyKTpcbiAgcmV0dXJuIGEgKyBiXG4gIFxuIyBDYWxsIGBwbHVzKClgIHdpdGggb25seSBgYWAgcGFyYW1ldGVyXG5wbHVzKGE9MSlcblxuIyBDYWxsIGBwbHVzKClgIHdpdGggYGFgIGFuZCBgYmAgcGFyYW1ldGVyc1xucGx1cyhhPTEsIGI9MykifQ==

Required Arguments

As the name kind of gives away, the required arguments of a UDF are those that have to be in there. These arguments need to be passed during the function call and in exactly the right order, just like in the following example:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGBwbHVzKClgIHdpdGggcmVxdWlyZWQgYXJndW1lbnRzXG5kZWYgcGx1cyhhLGIpOlxuICByZXR1cm4gYSArIGIifQ==

You need arguments that map to the a as well as the b parameters to call the function without getting any errors. If you switch around a and b, the result won’t be different, but it might be if you change plus() to the following:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGBwbHVzKClgIHdpdGggcmVxdWlyZWQgYXJndW1lbnRzXG5kZWYgcGx1cyhhLGIpOlxuICByZXR1cm4gYS9iIn0=

Keyword Arguments

If you want to make sure that you call all the parameters in the right order, you can use the keyword arguments in your function call. You use these to identify the arguments by their parameter name. Let’s take the example from above to make this a bit more clear:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGBwbHVzKClgIGZ1bmN0aW9uXG5kZWYgcGx1cyhhLGIpOlxuICByZXR1cm4gYSArIGJcbiAgXG4jIENhbGwgYHBsdXMoKWAgZnVuY3Rpb24gd2l0aCBwYXJhbWV0ZXJzIFxucGx1cygyLDMpXG5cbiMgQ2FsbCBgcGx1cygpYCBmdW5jdGlvbiB3aXRoIGtleXdvcmQgYXJndW1lbnRzXG5wbHVzKGE9MSwgYj0yKSJ9

Note that by using the keyword arguments, you can also switch around the order of the parameters and still get the same result when you execute your function:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGBwbHVzKClgIGZ1bmN0aW9uXG5kZWYgcGx1cyhhLGIpOlxuICByZXR1cm4gYSArIGJcbiAgXG4jIENhbGwgYHBsdXMoKWAgZnVuY3Rpb24gd2l0aCBrZXl3b3JkIGFyZ3VtZW50c1xucGx1cyhiPTIsIGE9MSkifQ==

Variable Number of Arguments

In cases where you don’t know the exact number of arguments that you want to pass to a function, you can use the following syntax with *args:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGBwbHVzKClgIGZ1bmN0aW9uIHRvIGFjY2VwdCBhIHZhcmlhYmxlIG51bWJlciBvZiBhcmd1bWVudHNcbmRlZiBwbHVzKCphcmdzKTpcbiAgcmV0dXJuIHN1bShhcmdzKVxuXG4jIENhbGN1bGF0ZSB0aGUgc3VtXG5wbHVzKDEsNCw1KSJ9

The asterisk (*) is placed before the variable name that holds the values of all nonkeyword variable arguments. Note here that you might as well have passed *varint, *var_int_args or any other name to the plus() function.

Tip: try replacing *args with another name that includes the asterisk. You’ll see that the above code keeps working!

You see that the above function makes use of the built-in Python sum() function to sum all the arguments that get passed to plus(). If you would like to avoid this and build the function completely yourself, you can use this alternative:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgRGVmaW5lIGBwbHVzKClgIGZ1bmN0aW9uIHRvIGFjY2VwdCBhIHZhcmlhYmxlIG51bWJlciBvZiBhcmd1bWVudHNcbmRlZiBwbHVzKCphcmdzKTpcbiAgdG90YWwgPSAwXG4gIGZvciBpIGluIGFyZ3M6XG4gICAgdG90YWwgKz0gaVxuICByZXR1cm4gdG90YWxcblxuIyBDYWxjdWxhdGUgdGhlIHN1bSAgXG5wbHVzKDIwLDMwLDQwLDUwKSJ9

Global vs Local Variables

In general, variables that are defined inside a function body have a local scope, and those defined outside have a global scope. That means that local variables are defined within a function block and can only be accessed inside that function, while global variables can be accessed by all functions that might be in your script:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgR2xvYmFsIHZhcmlhYmxlIGBpbml0YFxuaW5pdCA9IDFcblxuIyBEZWZpbmUgYHBsdXMoKWAgZnVuY3Rpb24gdG8gYWNjZXB0IGEgdmFyaWFibGUgbnVtYmVyIG9mIGFyZ3VtZW50c1xuZGVmIHBsdXMoKmFyZ3MpOlxuICAjIExvY2FsIHZhcmlhYmxlIGBzdW0oKWBcbiAgdG90YWwgPSAwXG4gIGZvciBpIGluIGFyZ3M6XG4gICAgdG90YWwgKz0gaVxuICByZXR1cm4gdG90YWxcbiAgXG4jIEFjY2VzcyB0aGUgZ2xvYmFsIHZhcmlhYmxlXG5wcmludChcInRoaXMgaXMgdGhlIGluaXRpYWxpemVkIHZhbHVlIFwiICsgc3RyKGluaXQpKVxuXG4jIChUcnkgdG8pIGFjY2VzcyB0aGUgbG9jYWwgdmFyaWFibGVcbnByaW50KFwidGhpcyBpcyB0aGUgc3VtIFwiICsgc3RyKHRvdGFsKSkifQ==

You’ll see that you’ll get a NameError that says that the name 'total' is not defined when you try to print out the local variable total that was defined inside the function body. The init variable, on the other hand, can be printed out without any problems.

Anonymous Functions in Python

Anonymous functions are also called lambda functions in Python because instead of declaring them with the standard def keyword, you use the lambda keyword.

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6ImRvdWJsZSA9IGxhbWJkYSB4OiB4KjJcblxuZG91YmxlKDUpIn0=

In the DataCamp Light chunk above, lambda x: x*2 is the anonymous or lambda function. x is the argument, and x*2 is the expression or instruction that gets evaluated and returned. What’s special about this functions is that it has no name, like the examples that you have seen in the first part of this functions tutorial. If you would have to write the above function in a UDF, the result would be the following:

def double(x):
  return x*2

Let’s consider another example of a lambda function where you work with two arguments:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6IiMgYHN1bSgpYCBsYW1iZGEgZnVuY3Rpb25cbnN1bSA9IGxhbWJkYSB4LCB5OiB4ICsgeTtcblxuIyBDYWxsIHRoZSBgc3VtKClgIGFub255bW91cyBmdW5jdGlvblxuc3VtKDQsNSlcblxuIyBcIlRyYW5zbGF0ZVwiIHRvIGEgVURGXG5kZWYgc3VtKHgsIHkpOlxuICByZXR1cm4geCt5In0=

You use anonymous functions when you require a nameless function for a short period of time and that is created at runtime. Specific contexts in which this would be relevant is when you’re working with filter(), map() and reduce():

eyJsYW5ndWFnZSI6InB5dGhvbiIsInNhbXBsZSI6ImZyb20gZnVuY3Rvb2xzIGltcG9ydCByZWR1Y2VcblxubXlfbGlzdCA9IFsxLDIsMyw0LDUsNiw3LDgsOSwxMF1cblxuIyBVc2UgbGFtYmRhIGZ1bmN0aW9uIHdpdGggYGZpbHRlcigpYFxuZmlsdGVyZWRfbGlzdCA9IGxpc3QoZmlsdGVyKGxhbWJkYSB4OiAoeCoyID4gMTApLCBteV9saXN0KSlcblxuIyBVc2UgbGFtYmRhIGZ1bmN0aW9uIHdpdGggYG1hcCgpYFxubWFwcGVkX2xpc3QgPSBsaXN0KG1hcChsYW1iZGEgeDogeCoyLCBteV9saXN0KSlcblxuIyBVc2UgbGFtYmRhIGZ1bmN0aW9uIHdpdGggYHJlZHVjZSgpYFxucmVkdWNlZF9saXN0ID0gcmVkdWNlKGxhbWJkYSB4LCB5OiB4K3ksIG15X2xpc3QpXG5cbnByaW50KGZpbHRlcmVkX2xpc3QpXG5wcmludChtYXBwZWRfbGlzdClcbnByaW50KHJlZHVjZWRfbGlzdCkifQ==

The filter() function filters, as the name suggests, the original input list my_list on the basis of a criterion >10. With map(), on the other hand, you apply a function to all items of the list my_list. In this case, you multiply all elements with 2.

Note that the reduce() function is part of the functools library. You use this function cumulatively to the items of the my_list list, from left to right and reduce the sequence to a single value, 55, in this case.

Using main() as a Function

If you have any experience with other programming languages such as Java, you’ll know that a main function is required in order to execute functions. As you have seen in the examples above, this is not necessarily needed for Python. However, including a main() function in your Python program can be handy to structure your code in a logical way - all of the most important components are contained within this main() function.

You can easily define a main() function and call it just like you have done with all of the other functions above:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInByZV9leGVyY2lzZV9jb2RlIjoiZGVmIGhlbGxvKCk6XG4gIHByaW50KFwiSGVsbG8gV29ybGRcIikgXG4gIHJldHVybiAiLCJzYW1wbGUiOiIjIERlZmluZSBgbWFpbigpYCBmdW5jdGlvblxuZGVmIG1haW4oKTpcbiAgaGVsbG8oKVxuICBwcmludChcIlRoaXMgaXMgYSBtYWluIGZ1bmN0aW9uXCIpXG5cbm1haW4oKSJ9

However, as it stands now, the code of your main() function will be called when you import it as a module. To make sure that this doesn’t happen, you call the main() function when __name__ == '__main__'.

That means that the code of the above code chunk becomes:

eyJsYW5ndWFnZSI6InB5dGhvbiIsInByZV9leGVyY2lzZV9jb2RlIjoiZGVmIGhlbGxvKCk6XG4gIHByaW50KFwiSGVsbG8gV29ybGRcIikgXG4gIHJldHVybiAiLCJzYW1wbGUiOiIjIERlZmluZSBgbWFpbigpYCBmdW5jdGlvblxuZGVmIG1haW4oKTpcbiAgaGVsbG8oKVxuICBwcmludChcIlRoaXMgaXMgYSBtYWluIGZ1bmN0aW9uXCIpXG4gIFxuIyBFeGVjdXRlIGBtYWluKClgIGZ1bmN0aW9uIFxuaWYgX19uYW1lX18gPT0gJ19fbWFpbl9fJzpcbiAgICBtYWluKCkifQ==

Note that besides the __main__ function, you also have an __init__ function that initializes an instance of a class or an object. Simply stated, it acts as a constructor or initializer and is automatically called when you create a new instance of a class. With that function, the newly created object is assigned ot the parameter self, which you saw earlier in this tutorial. Take a look at the following example:

class Dog:
    """    
    Requires:
    legs - Legs so that the dog can walk.
    color - A color of the fur.
    """

    def __init__(self, legs, color):
        self.legs = legs
        self.color = color
        
    def bark(self):
        bark = "bark" * 2
        return bark

if __name__ == "__main__":
    dog = Dog(4, "brown")
    bark = dog.bark()
    print(bark)

Want To Practice Further?

Congrats! You have made it through this short tutorial on functions in Python. If you would like to revise other basic Python programming material, don’t miss out on Data Types for Data Science, a course where you’ll consolidate and practice your knowledge of lists, dictionaries, tuples, sets, and date times.

Comments

No comments yet. Be the first to respond!