Functions are an essential part of the Python programming language: you might have already encountered and used some of the many fantastic functions that are built-in in the Python language or that come with its library ecosystem. However, as a Data Scientist, you’ll constantly need to write your own functions to solve problems that your data poses to you.
(To practice further, try DataCamp’s Python Data Science Toolbox (Part 1) Course!)
Functions in Python
You use functions in programming to bundle a set of instructions that you want to use repeatedly or that, because of their complexity, are better self-contained in a sub-program and called when needed. In other words, a function is a piece of code written to carry out a specific task. To carry out that task, the function may or may not need inputs. When the task is done, the function may or may not return one or more values.
There are three types of functions in Python:
- Built-in functions, such as help() to ask for help, min() to get the minimum value, and print() to print an object to the terminal. The Python documentation gives an overview of the rest of these functions.
- User-Defined Functions (UDFs), which are functions that users create to help them out; and
- Anonymous functions, which are also called lambda functions because they are not declared with the standard def keyword.
A method refers to a function which is part of a class. You access it with an instance or object of the class. A function doesn’t have this restriction: it just refers to a standalone function. This means that all methods are functions, but not all functions are methods.
Consider this example, where you first define a function plus() and then a Summation class with a sum() method:
If you now want to call the sum() method that is part of the Summation class, you first need to define an instance or object of that class. So, let’s define such an object:
Remember that this instantiation is not necessary when you want to call the function plus()! You can execute plus(1, 2) directly, without any problems!
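The difference between calling a function and calling a method can be sketched as follows. The bodies of plus() and sum(), the parameter names a and b, and the instance name my_summation are assumptions made for illustration; only the names plus(), Summation and sum() come from the text.

```python
# A standalone function: you call it directly by name.
def plus(a, b):
    return a + b


# A class with a method: sum() is reached through an instance.
class Summation:
    def sum(self, a, b):
        # self is the instance the method was called on
        self.contents = a + b
        return self.contents


# The function needs no object.
print(plus(1, 2))  # 3

# The method needs an instance of the class first.
my_summation = Summation()
print(my_summation.sum(1, 2))  # 3
```

Both calls compute the same result; the only difference is that the method call goes through an object.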
Parameters vs Arguments
Parameters are the names used when defining a function or a method, and into which arguments will be mapped. In other words, arguments are the things which are supplied to any function or method call, while the function or method code refers to the arguments by their parameter names.
Look back at the example with the Summation class: you pass two arguments to its sum() method, even though the method was defined with three parameters, the first of which is self. What happened to self?
The first argument of every instance method is always a reference to the current instance of the class, which in this case is an instance of Summation. By convention, this argument is called self.
This all means that you don’t pass the reference to self in this case because self is the parameter name for an implicitly passed argument that refers to the instance through which a method is being invoked. It gets inserted implicitly into the argument list.
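One way to see the implicit argument is to call the method through the class instead of through the instance. This is a small sketch in which the method body and the parameter names a and b are assumed for illustration:

```python
class Summation:
    def sum(self, a, b):
        return a + b


instance = Summation()

# Normal call: Python passes `instance` as `self` for you.
implicit = instance.sum(1, 2)

# Call through the class: here you have to pass the instance yourself.
explicit = Summation.sum(instance, 1, 2)

print(implicit, explicit)  # 3 3
```

The two calls are equivalent, which is why a method defined with three parameters is normally called with only two arguments.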
How to Define a Function: User-Defined Functions (UDFs)
The four steps to defining a function in Python are the following:
- Use the keyword def to declare the function and follow this up with the function name.
- Add parameters to the function: they should be within the parentheses of the function. End your line with a colon.
- Add statements that the function should execute.
- End your function with a return statement if the function should output something. Without the return statement, your function will return the object None.
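The four steps can be sketched with a deliberately small function. The variable name result is only used here to show what comes back from the call:

```python
def hello():                # steps 1 and 2: def, name, empty parameter list, colon
    print("Hello World")    # step 3: the statement to execute
    return                  # step 4: a bare return hands back None


result = hello()
print(result)  # None
```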
Of course, your functions will get more complex as you go along: you can add for loops, flow control, and more to make them more fine-grained:
def hello():
    name = str(input("Enter your name: "))
    if name:
        print("Hello " + str(name))
    else:
        print("Hello World")
    return

hello()
In the above function, you ask the user to give a name. If no name is given, the function will print out “Hello World”. Otherwise, the user will get a personalized “Hello” response.
Remember also that you can define one or more function parameters for your UDF. You’ll learn more about this when you tackle the Function Arguments section. Additionally, your function may or may not return one or multiple values as a result.
Note that as you’re printing something in your UDF hello(), you don’t really need to return it. There won’t be any difference between the function above and this one:
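A simplified, non-interactive sketch of the comparison: both versions print the same text and both return None. The name hello_noreturn is a hypothetical label for the variant without a return statement.

```python
def hello():
    print("Hello World")
    return


def hello_noreturn():
    print("Hello World")
    # no return statement: Python returns None implicitly


print(hello() is None)           # True
print(hello_noreturn() is None)  # True
```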
However, if you want to continue to work with the result of your function and try out some operations on it, you will need to use the return statement to actually return a value, such as a string or an integer. Consider the following scenario, where hello() returns the string "hello", while a second function returns None:
The second function gives you an error because you can’t perform any operations with a None. You’ll get a TypeError that says that you can’t do the multiplication operation on the None that the second function returns.
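The scenario can be sketched like this. The name hello_noreturn for the second function is an assumption, and the TypeError is caught here only so that the example runs to the end:

```python
def hello():
    print("Hello World")
    return "hello"


def hello_noreturn():
    print("Hello World")


# Multiplying the returned string works: it repeats the string.
repeated = hello() * 2
print(repeated)  # hellohello

# Multiplying the returned None does not work.
try:
    hello_noreturn() * 2
except TypeError as error:
    print("TypeError:", error)
```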
Tip: functions immediately exit when they come across a return statement, even if it means that they won’t return any value:
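A small sketch of this early exit, using a hypothetical function run() that stops in the middle of a loop:

```python
def run():
    for x in range(10):
        if x == 2:
            return  # leaves the function immediately, returning None
        print("Run!")


result = run()  # prints "Run!" twice, for x == 0 and x == 1
print(result)   # None
```

The remaining iterations of the loop never execute, because return ends the whole function and not just the loop.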
Another thing that is worth mentioning when you’re working with the return statement is the fact that you can use it to return multiple values. To do this, you make use of tuples.
Remember that this data structure is very similar to a list: it can contain multiple values. However, tuples are immutable, which means that you can’t modify the values that are stored in them! You construct a tuple with a pair of parentheses (). You can unpack tuples into multiple variables with the help of the comma and the assignment operator.
Check out the following example to understand how your function can return multiple values:
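Here is a minimal sketch, assuming a plus() function that returns both the sum and its first argument. The local variable is named total here (the text calls it sum) so that it does not shadow the built-in sum() function.

```python
def plus(a, b):
    total = a + b
    return (total, a)  # the same as: return total, a


# Unpack the returned tuple into two variables.
total, a = plus(3, 4)
print(total)  # 7
print(a)      # 3
```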
Note that return sum, a would have the same result as return (sum, a): the former actually packs sum and a into a tuple under the hood!
How to Call a Function
In the previous sections, you have seen a lot of examples already of how you can call a function. Calling a function means that you execute the function that you have defined - either directly from the Python prompt or through another function (as you will see in the section “Nested Functions”).
Call your newly defined function hello() by simply executing hello().
How to Add Docstrings to a Python Function
Another essential aspect of writing functions in Python: docstrings. Docstrings describe what your function does, such as the computations it performs or its return values. These descriptions serve as documentation for your function so that anyone who reads your function’s docstring understands what your function does, without having to trace through all the code in the function definition.
Function docstrings are placed on the line immediately after the function header, in between triple quotation marks. An appropriate docstring for your hello() function is ‘Prints “Hello World”’.
def hello():
    """Prints "Hello World".

    Returns:
        None
    """
    print("Hello World")
    return
Note that docstrings can be longer than the one that is given here as an example. If you’d like to study docstrings in more detail, it’s best to check out the GitHub repositories of Python libraries such as scikit-learn or pandas, where you’ll find plenty of examples!
Function Arguments in Python
Earlier, you learned about the difference between parameters and arguments. In short, arguments are the things which are given to any function or method call, while the function or method code refers to the arguments by their parameter names. There are four types of arguments that Python UDFs can take:
- Default arguments
- Required arguments
- Keyword arguments
- Variable number of arguments
Default arguments are those that take a default value if no argument value is passed during the function call. You can assign this default value with the assignment operator =, just like in the following example:
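A minimal sketch of a default argument, assuming a plus() function whose second parameter b defaults to 2 (the default value is chosen only for illustration):

```python
def plus(a, b=2):
    return a + b


# b is omitted, so the default value 2 is used.
print(plus(a=1))       # 3

# b is passed explicitly, so the default is ignored.
print(plus(a=1, b=3))  # 4
```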
As the name kind of gives away, the required arguments of a UDF are those that have to be in there. These arguments need to be passed during the function call and in precisely the right order, just like in the following example:
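A sketch of required arguments, assuming a plus() function with two parameters a and b and no defaults. The TypeError is caught here only so that the example runs to the end:

```python
def plus(a, b):
    return a + b


# Both required arguments are supplied.
print(plus(1, 2))  # 3

# Leaving one out raises a TypeError.
try:
    plus(1)
except TypeError as error:
    print("TypeError:", error)
```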
You need arguments that map to the a as well as the b parameters to call the function without getting any errors. If you switch around a and b, the result won’t be different, but it might be if you change plus() to the following:
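As an illustration of a body for which the order matters, assume that plus() divides its first argument by its second. The division is a hypothetical choice; any non-commutative operation would show the same effect.

```python
def plus(a, b):
    return a / b  # hypothetical body: division is order-sensitive


print(plus(6, 3))  # 2.0
print(plus(3, 6))  # 0.5
```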
If you want to make sure that you call all the parameters in the right order, you can use the keyword arguments in your function call. You use these to identify the arguments by their parameter name. Let’s take the example from above to make this a bit more clear:
Note that by using the keyword arguments, you can also switch around the order of the parameters and still get the same result when you execute your function:
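A short sketch of keyword arguments, again assuming an order-sensitive plus() whose body (a division) is chosen only for illustration:

```python
def plus(a, b):
    return a / b  # hypothetical, order-sensitive body


# Keyword arguments identify each value by its parameter name...
print(plus(a=6, b=3))  # 2.0

# ...so switching the order in the call does not change the result.
print(plus(b=3, a=6))  # 2.0
```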
Variable Number of Arguments
In cases where you don’t know the exact number of arguments that you want to pass to a function, you can use the *args syntax.
The asterisk (*) is placed before the variable name that holds the values of all nonkeyword variable arguments. Note here that you might as well have passed *var_int_args or any other name to the plus() function.
Tip: try replacing *args with another name that includes the asterisk. You’ll see that the function keeps working!
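A minimal sketch of the *args syntax, assuming a plus() function that adds up whatever it receives. The second function, plus_renamed(), is a hypothetical copy that only changes the parameter name:

```python
def plus(*args):
    # args is a tuple holding all positional arguments
    return sum(args)


def plus_renamed(*var_int_args):
    # the name after the asterisk is free to choose
    return sum(var_int_args)


print(plus(1, 4, 5))          # 10
print(plus())                 # 0
print(plus_renamed(1, 4, 5))  # 10
```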
You see that the above function makes use of the built-in Python sum() function to sum all the arguments that get passed to plus(). If you would like to avoid this and build the function entirely yourself, you can use this alternative:
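One possible version without the built-in sum() accumulates the arguments in a loop. The variable names total and number are assumptions:

```python
def plus(*args):
    total = 0
    for number in args:
        total += number  # add each argument to the running total
    return total


print(plus(20, 30, 40, 50))  # 140
```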
Global vs Local Variables
In general, variables that are defined inside a function body have a local scope, and those defined outside have a global scope. That means that local variables are defined within a function block and can only be accessed inside that function, while global variables can be accessed by all functions that might be in your script:
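A sketch of the two scopes, assuming a global variable init and a local variable total inside plus(). The NameError is caught and stored in a variable called message only so that the example runs to the end:

```python
init = 1  # global variable


def plus(*args):
    total = 0  # local variable: exists only while plus() runs
    for number in args:
        total += number
    return total


plus(1, 2, 3)
print(init)  # 1

message = None
try:
    print(total)  # total is not visible outside plus()
except NameError as error:
    message = str(error)
    print("NameError:", message)
```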
You’ll see that you’ll get a NameError that says that the name 'total' is not defined when you try to print out the local variable total that was defined inside the function body. The init variable, on the other hand, can be printed out without any problems.
Anonymous Functions in Python
Anonymous functions are also called lambda functions in Python because instead of declaring them with the standard def keyword, you use the lambda keyword.
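A minimal sketch of a lambda function that doubles its input. Binding it to the name double is done here only so that it can be called more than once; a lambda can also be called in place.

```python
# A lambda expression bound to a name for convenience.
double = lambda x: x * 2
print(double(5))  # 10

# The same lambda, created and called in place without any name.
print((lambda x: x * 2)(5))  # 10
```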
Take lambda x: x*2 as an example of an anonymous or lambda function. x is the argument, and x*2 is the expression that gets evaluated and returned. What’s special about this function is that it has no name, unlike the examples that you have seen in the first part of this functions tutorial. If you had to write the above function as a UDF, the result would be the following:
def double(x):
    return x*2
Let’s consider another example of a lambda function where you work with two arguments:
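A sketch of a lambda function with two arguments. The name add and the addition itself are assumptions chosen for illustration:

```python
# Two parameters, x and y, separated by a comma before the colon.
add = lambda x, y: x + y

print(add(1, 2))  # 3
```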
You use anonymous functions when you require a nameless function for a short period of time, one that is created at runtime. Specific contexts in which this is relevant are when you’re working with filter(), map() and reduce().
The filter() function filters, as the name suggests, the original input list my_list on the basis of a criterion. With map(), on the other hand, you apply a function to all items of the list my_list. In this case, you multiply all elements by the same number.
Note that the reduce() function is part of the functools library. You apply this function cumulatively to the items of the my_list list, from left to right, and reduce the sequence to a single value, 55 in this case.
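The three functions can be sketched together as follows. The contents of my_list (the numbers 1 to 10, which add up to 55), the filter criterion and the multiplier 2 are assumptions chosen to match the description above:

```python
from functools import reduce

my_list = list(range(1, 11))  # [1, 2, ..., 10]

# Keep only the items whose double is larger than 10.
filtered_list = list(filter(lambda x: x * 2 > 10, my_list))

# Apply the same function to every item.
mapped_list = list(map(lambda x: x * 2, my_list))

# Combine the items pairwise, from left to right, into one value.
reduced_value = reduce(lambda x, y: x + y, my_list)

print(filtered_list)  # [6, 7, 8, 9, 10]
print(mapped_list)    # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
print(reduced_value)  # 55
```

In Python 3, filter() and map() return lazy iterators, which is why both results are wrapped in list() before printing.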
main() as a Function
If you have any experience with other programming languages such as Java, you’ll know that a main function is required there as the entry point of a program. As you have seen in the examples above, this is not needed in Python. However, including a main() function in your Python program can be handy to structure your code logically: all of the most important components are contained within this main() function.
You can easily define a main() function and call it just like you have done with all of the other functions above:
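A minimal sketch of such a main() function; the printed message is an assumption:

```python
def main():
    print("This is the main function")


# Called unconditionally, like any other function.
main()
```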
However, as it stands now, the code of your main() function will also run when you import the script as a module. To make sure that this doesn’t happen, you call the main() function only when __name__ == '__main__'.
That means that the code of the above code chunk becomes:
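With the guard in place, the sketch looks like this. The body of main() is an assumption; the check itself is the standard Python idiom.

```python
def main():
    print("This is the main function")


# __name__ is "__main__" only when the file is run as a script,
# not when it is imported as a module.
if __name__ == "__main__":
    main()
```

When another file imports this module, main() is defined but not executed, so the importing code can decide when to call it.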
Note that besides the __main__ check, you also have the __init__ method, which initializes an instance of a class. Simply stated, it acts as a constructor or initializer and is automatically called when you create a new instance of a class. Within that method, the newly created object is assigned to the parameter self, which you saw earlier in this tutorial. Take a look at the following example:
class Dog:
    """
    Requires:
    legs - Legs so that the dog can walk.
    color - A color of the fur.
    """

    def __init__(self, legs, color):
        self.legs = legs
        self.color = color

    def bark(self):
        bark = "bark" * 2
        return bark

if __name__ == "__main__":
    dog = Dog(4, "brown")
    bark = dog.bark()
    print(bark)
Want to Practice further?
Congrats! You have made it through this short tutorial on functions in Python. If you would like to revise other basic Python programming material, don’t miss out on Data Types for Data Science, a course where you’ll consolidate and practice your knowledge of lists, dictionaries, tuples, sets, and date times.