Docstrings in Python

Learn about the different types of docstrings and various docstring formats like Sphinx, Numpy and Pydoc.

If you are just getting started in Python and would like to learn more, take DataCamp's Introduction to Data Science in Python course.

A Python documentation string, commonly known as a docstring, is a string literal that appears in a module, class, function, or method definition. A docstring is available through the object's __doc__ attribute and through the built-in help() function. You define it by placing a string literal as the first statement in the object's definition.
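
To make the "first statement" rule concrete, here is a minimal sketch. The names Greeter, greet, and no_docstring are made up for illustration and do not come from any library.

```python
class Greeter:
    """Greet people by name."""

    def greet(self, name):
        """Return a greeting for name."""
        return f"Hello, {name}!"


def no_docstring():
    x = 1
    # The string below is not the first statement, so Python treats it
    # as an ordinary expression and does not store it as a docstring.
    "This string is ignored."
    return x


print(Greeter.__doc__)        # Greet people by name.
print(Greeter.greet.__doc__)  # Return a greeting for name.
print(no_docstring.__doc__)   # None
```

Classes and methods each carry their own docstring, and a string that is not the first statement is simply discarded.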

Docstrings are well suited to describing the larger pieces of code, i.e., the general purpose of a module, class, or function, whereas comments explain individual statements and expressions, which tend to be small. Both are descriptive text that programmers write for themselves and for other developers who want to contribute to the project. Documenting your code is an essential part of writing clean, well-structured programs.

Docstrings help you understand the capabilities of a module or a function. For example, say you installed the scikit-learn library and want to know more about the sklearn package, such as its description and the modules it contains. You can simply use the help function to get all of that information.

Let's quickly import the package.

import sklearn
help(sklearn)

You would see output describing the package: its name, its description, and the list of package modules.

Docstrings vs. Commenting

Docstrings are similar in spirit to comments, but they are a more structured and more useful form of documentation. Docstrings act as documentation for modules, classes, functions, and packages.

Comments, on the other hand, are mainly used to explain non-obvious portions of the code, and they are also handy for notes about bug fixes and tasks that still need to be done.

Docstrings are enclosed in matching opening and closing quotes, while comments start with a # character.

Note that comments cannot be accessed with the built-in __doc__ attribute or the help function. Let's see what happens if you try doing so:

def string_reverse(str1):
    #Returns the reversed String.

    #Parameters:
    #    str1 (str):The string which is to be reversed.

    #Returns:
    #    reverse(str1):The string which gets reversed.   

    reverse_str1 = ''
    i = len(str1)
    while i > 0:
        reverse_str1 += str1[i - 1]
        i = i- 1
    return reverse_str1
print(string_reverse.__doc__)
None
help(string_reverse)
Help on function string_reverse in module __main__:

string_reverse(str1)

There are two ways to write a docstring: as a one-line docstring or as a multi-line docstring. Let's look at them one by one.

One-Line Docstring

A one-line docstring fits entirely on one line. You can use either triple single quotes or triple double quotes, as long as the opening and closing quotes match. In a one-line docstring, the closing quotes sit on the same line as the opening quotes. The standard convention is to use triple double quotes; the example below uses triple single quotes, which Python accepts equally.

def square(a):
    '''Returned argument a is squared.'''
    return a**2
print(square.__doc__)
Returned argument a is squared.
help(square)
Help on function square in module __main__:

square(a)
    Returned argument a is squared.

From the above Docstring output, you can observe that:

  • In this case, the line begins with a capital letter, i.e., R and ends with a period (.).
  • The closing quotes are on the same line as the opening quotes. This looks better for one-liners.
  • A good practice to follow is having no blank line either before or after the Docstring, as shown in the above example.
  • The output of the __doc__ attribute is less verbose as compared to the help() function.
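
The conventions listed above can be sketched as a small check. The helper looks_like_one_liner and the function cube are hypothetical names invented for this illustration; this version also uses triple double quotes, which is the standard convention.

```python
def cube(a):
    """Return a raised to the third power."""
    return a ** 3


def looks_like_one_liner(func):
    """Return True if func's docstring follows the one-line conventions."""
    doc = func.__doc__
    if not doc:
        return False
    return (
        "\n" not in doc          # everything fits on a single line
        and doc[0].isupper()     # starts with a capital letter
        and doc.endswith(".")    # ends with a period
    )


print(cube(3))                     # 27
print(looks_like_one_liner(cube))  # True
```

This is only a rough check of the points in the list; it is not a replacement for a real docstring linter.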

Multi-Line Docstring

A multi-line docstring begins with the same kind of summary line as a one-line docstring, but the summary is followed by a single blank line and then a more detailed description.

The general format for writing a Multi-line Docstring is as follows:

def some_function(argument1):
    """Summary or Description of the Function

    Parameters:
    argument1 (int): Description of arg1

    Returns:
    int:Returning value

   """

    return argument1

print(some_function.__doc__)
Summary or Description of the Function

    Parameters:
    argument1 (int): Description of arg1

    Returns:
    int:Returning value
help(some_function)
Help on function some_function in module __main__:

some_function(argument1)
    Summary or Description of the Function

    Parameters:
    argument1 (int): Description of arg1

    Returns:
    int:Returning value

Let's look at a more detailed example of how a multi-line docstring can be used:

def string_reverse(str1):
    '''
    Returns the reversed String.

    Parameters:
        str1 (str):The string which is to be reversed.

    Returns:
        reverse(str1):The string which gets reversed.   
    '''

    reverse_str1 = ''
    i = len(str1)
    while i > 0:
        reverse_str1 += str1[i - 1]
        i = i- 1
    return reverse_str1
print(string_reverse('DeepLearningDataCamp'))
pmaCataDgninraeLpeeD
help(string_reverse)
Help on function string_reverse in module __main__:

string_reverse(str1)
    Returns the reversed String.

    Parameters:
        str1 (str):The string which is to be reversed.

    Returns:
        reverse(str1):The string which gets reversed.

You can see above that the summary fits on one line and is separated from the rest of the content by a single blank line. Following this convention matters because automatic indexing tools rely on it to pick out the summary.
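
As a rough illustration of why the summary line matters to tools, the sketch below pulls the summary out of a docstring with the standard inspect module. The helper summary_line is a hypothetical name, and string_reverse is rewritten here with a slice so that the block stays short.

```python
import inspect


def string_reverse(str1):
    """
    Return the reversed string.

    Parameters:
        str1 (str): The string to reverse.

    Returns:
        str: The reversed string.
    """
    return str1[::-1]


def summary_line(obj):
    """Return the first line of obj's cleaned-up docstring, or ''."""
    # inspect.getdoc strips leading blank lines and common indentation.
    doc = inspect.getdoc(obj)
    if not doc:
        return ""
    return doc.splitlines()[0]


print(summary_line(string_reverse))  # Return the reversed string.
```

Because the summary sits alone on the first line, a tool can index it without having to parse the rest of the docstring.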

Python Built-in Docstring

Let's view the built-in Python Docstrings.

All built-in functions, classes, and methods have human-readable descriptions attached to them. You can access these in one of two ways.

  • The __doc__ attribute
  • The help function

You would notice that the output of the help function is more verbose than the __doc__ attribute.

For example:

import math
print(math.__doc__)
This module provides access to the mathematical functions
defined by the C standard.

Similarly, you can use the help function:

help(math)
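
To compare the two without scrolling through the full help output, the sketch below captures the help text as a string. It relies on pydoc.render_doc, the function that help() uses to build its text, with the plain-text renderer so that the result contains no terminal formatting characters.

```python
import math
import pydoc

doc_text = math.__doc__
# render_doc returns the text that help(math) would display.
help_text = pydoc.render_doc(math, renderer=pydoc.plaintext)

print(len(doc_text.splitlines()), "lines in math.__doc__")
print(len(help_text.splitlines()), "lines in the help() text")
```

The exact line counts depend on your Python version, but the help text is much longer because it also lists every function in the module.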

Let's now look at some popular Docstring Formats and understand them in detail.

Docstring Formats

Many docstring formats are available, but it is always better to use one that is easily recognized by docstring parsers and by fellow data scientists and programmers. There are no hard rules for selecting a docstring format, but you should use the same format consistently throughout a project. It is also preferable to use a format that Sphinx supports well.

The most common formats used are listed below.

Formatting Type | Description
NumPy/SciPy docstrings | Combination of reStructuredText and Google docstrings; supported by Sphinx
PyDoc | Standard documentation module for Python; supported by Sphinx
EpyDoc | Renders Epytext as a series of HTML documents; a tool for generating API documentation for Python modules based on their docstrings
Google docstrings | Google's style

You may come across other docstring formats as well. You do not need to study each one from scratch: they all follow similar patterns and differ only in the details. The following sections walk through an example of each popular format and its use.

At first, you will be seeing the Sphinx Style in detail, and then you can easily follow along with other formats as well.

Sphinx Style

Sphinx style is the straightforward, traditional style. It is verbose and was originally created for the Python documentation itself. Sphinx uses reStructuredText, a markup language similar in usage to Markdown.

class Vehicle(object):
    '''
    The Vehicle object contains lots of vehicles
    :param arg: The arg is used for ...
    :type arg: str
    :param `*args`: The variable arguments are used for ...
    :param `**kwargs`: The keyword arguments are used for ...
    :ivar arg: This is where we store arg
    :vartype arg: str
    '''


    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel a certain distance in a vehicle without fuel, so here's the fuel

        :param distance: The amount of distance traveled
        :type distance: int
        :param bool destination: Should the fuel be refilled to cover the required distance?
        :raises RuntimeError: Out of fuel

        :returns: A Car mileage
        :rtype: Cars
        '''
        pass

Sphinx docstrings are built from a fixed set of field names, much as a programming language has its reserved keywords. In the above code, the param field describes a parameter and the type field gives that parameter's data type. The type field is optional, and the type can also be written inline, as the second param field of the cars method does with bool. The returns field documents the returned object and is separate from the param fields. The returns field does not depend on rtype, and vice versa; rtype gives the type of the object returned from the function.
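
To show how regular these fields are, here is a minimal sketch that lists them with a regular expression. The function refuel, its fuel-consumption figure, and the helper sphinx_fields are all invented for this example; Sphinx itself parses docstrings with a full reStructuredText parser, not with this pattern.

```python
import re


def refuel(distance, destination_reached=False):
    """Report how much fuel a trip needs.

    :param distance: Distance to travel, in kilometres.
    :type distance: int
    :param bool destination_reached: Whether the trip is already over.
    :returns: Litres of fuel required.
    :rtype: float
    :raises ValueError: If distance is negative.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    # 0.08 litres per kilometre is an arbitrary illustrative figure.
    return 0.0 if destination_reached else distance * 0.08


# Matches ":name:" or ":name argument:" at the start of a line.
FIELD = re.compile(r"^\s*:(\w+)(?:\s+([^:]+))?:\s*(.*)$")


def sphinx_fields(func):
    """Return (field, argument, text) tuples found in func's docstring."""
    fields = []
    for line in (func.__doc__ or "").splitlines():
        match = FIELD.match(line)
        if match:
            fields.append(match.groups())
    return fields


for field in sphinx_fields(refuel):
    print(field)
```

Each printed tuple holds the field name, its optional argument (None for returns and rtype), and the description text.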

Google Style

Google style is easier and more intuitive to use, and it works well for shorter documentation. To have Sphinx process it, you need to configure the project first: add either sphinx.ext.napoleon or sphinxcontrib.napoleon to the extensions list in conf.py.
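
A minimal conf.py fragment for this setup might look as follows. Including sphinx.ext.autodoc is an assumption on my part (it is the extension that pulls docstrings into the generated pages); the two napoleon options are written out only for clarity, since both are enabled by default.

```python
# conf.py (Sphinx project configuration), a sketch with assumed settings
extensions = [
    "sphinx.ext.autodoc",   # assumed: imports modules and reads their docstrings
    "sphinx.ext.napoleon",  # converts Google and NumPy style sections
]

# Both options default to True; they are spelled out here for clarity.
napoleon_google_docstring = True
napoleon_numpy_docstring = True
```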

class Vehicles(object):
    '''
    The Vehicle object contains a lot of vehicles

    Args:
        arg (str): The arg is used for...
        *args: The variable arguments are used for...
        **kwargs: The keyword arguments are used for...

    Attributes:
        arg (str): This is where we store arg
    '''
    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel any distance in a vehicle without fuel, so here is the fuel

        Args:
            distance (int): The amount of distance traveled
            destination (bool): Should the fuel be refilled to cover the distance?

        Raises:
            RuntimeError: Out of fuel

        Returns:
            cars: A car mileage
        '''
        pass

Google style is often considered easier to read than Sphinx style. It also has a drawback: in the above code, a multi-line description of distance would look messy. That is why NumPy style is often used for longer documentation.
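
The difference is easier to judge side by side. The sketch below documents the same hypothetical parameter twice, once in each style; the function names and the fuel-consumption figure are invented for illustration.

```python
def plan_trip_google(distance):
    """Estimate the fuel needed for a trip.

    Args:
        distance (int): The distance to travel, in kilometres. Continuation
            lines are indented one level deeper than the parameter name,
            so a long explanation shares space with the name and type.

    Returns:
        float: Litres of fuel required.
    """
    return distance * 0.08  # arbitrary illustrative consumption figure


def plan_trip_numpy(distance):
    """Estimate the fuel needed for a trip.

    Parameters
    ----------
    distance : int
        The distance to travel, in kilometres. The description starts on
        its own line, so it can run to several sentences without crowding
        the name and type.

    Returns
    -------
    float
        Litres of fuel required.
    """
    return distance * 0.08  # same arbitrary figure as above
```

Both docstrings carry the same information; which layout reads better for long descriptions is a matter of taste and project convention.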

Numpy Style

Numpy style has a lot of details in the documentation. It is more verbose than other documentation, but it is an excellent choice if you want to do detailed documentation, i.e., extensive documentation of all the functions and parameters.

class Vehicles(object):
    '''
    The Vehicles object contains lots of vehicles

    Parameters
    ----------
    arg : str
        The arg is used for ...
    *args
        The variable arguments are used for ...
    **kwargs
        The keyword arguments are used for ...

    Attributes
    ----------
    arg : str
        This is where we store arg
    '''
    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel any distance in a vehicle without fuel, so here is the fuel

        Parameters
        ----------
        distance : int
            The amount of distance traveled
        destination : bool
            Should the fuel be refilled to cover the distance?

        Raises
        ------
        RuntimeError
            Out of fuel

        Returns
        -------
        cars
            A car mileage
        '''
        pass

The above example is more verbose than the equivalent Sphinx and Google versions. Because of its length, NumPy style is best suited to long, detailed documentation.

PyDoc

As you learned, docstrings are accessible through the built-in __doc__ attribute and the help() function. You can also use the built-in pydoc module, which offers quite different features and functionality from the __doc__ attribute and the help function.

Pydoc is a tool that comes in handy when you want to share code with colleagues or make it open source, in which case you are targeting a much wider audience. It can generate web pages from your Python documentation and can also launch a web server.

Let's see how it works.

The easiest and most convenient way to run the pydoc module is as a script. To run it inside a JupyterLab cell, prefix the command with an exclamation mark (!).

  • Pydoc as a module
!python -m pydoc
pydoc - the Python documentation tool

pydoc <name> ...
    Show text documentation on something.  <name> may be the name of a
    Python keyword, topic, function, module, or package, or a dotted
    reference to a class or function within a module or module in a
    package.  If <name> contains a '\', it is used as the path to a
    Python source file to document. If name is 'keywords', 'topics',
    or 'modules', a listing of these things is displayed.

pydoc -k <keyword>
    Search for a keyword in the synopsis lines of all available modules.

pydoc -n <hostname>
    Start an HTTP server with the given hostname (default: localhost).

pydoc -p <port>
    Start an HTTP server on the given port on the local machine.  Port
    number 0 can be used to get an arbitrary unused port.

pydoc -b
    Start an HTTP server on an arbitrary unused port and open a Web browser
    to interactively browse documentation.  This option can be used in
    combination with -n and/or -p.

pydoc -w <name> ...
    Write out the HTML documentation for a module to a file in the current
    directory.  If <name> contains a '\', it is treated as a filename; if
    it names a directory, documentation is written for all the contents.

Looking at the above output, the very first use of Pydoc is to show text documentation on a function, module, class, etc. Let's see how that compares with the help function.

!python -m pydoc glob
Help on module glob:

NAME
    glob - Filename globbing utility.

MODULE REFERENCE
    https://docs.python.org/3.7/library/glob

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

FUNCTIONS
    escape(pathname)
        Escape all special characters.

    glob(pathname, *, recursive=False)
        Return a list of paths matching a pathname pattern.

        The pattern may contain simple shell-style wildcards a la
        fnmatch. However, unlike fnmatch, filenames starting with a
        dot are special cases that are not matched by '*' and '?'
        patterns.

        If recursive is true, the pattern '**' will match any files and
        zero or more directories and subdirectories.

    iglob(pathname, *, recursive=False)
        Return an iterator which yields the paths matching a pathname pattern.

        The pattern may contain simple shell-style wildcards a la
        fnmatch. However, unlike fnmatch, filenames starting with a
        dot are special cases that are not matched by '*' and '?'
        patterns.

        If recursive is true, the pattern '**' will match any files and
        zero or more directories and subdirectories.

DATA
    __all__ = ['glob', 'iglob', 'escape']

FILE
    c:\users\hda3kor\.conda\envs\test\lib\glob.py

Now, let's extract the glob documentation using the help function.

help(glob)
---------------------------------------------------------------------------

NameError                                 Traceback (most recent call last)

<ipython-input-6-6f504109e3a2> in <module>
----> 1 help(glob)


NameError: name 'glob' is not defined

As you can see, it raises a NameError because glob is not defined. To pass a module object to the help function, you first need to import that module, which is not necessary with Pydoc. (Passing the name as a string, help('glob'), also works without an import.)
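
The same lookup by name is available from Python code. The sketch below uses pydoc.render_doc with the module name as a string; pydoc locates and imports the module internally, so glob never has to be imported in your own namespace.

```python
import pydoc

# Passing a string asks pydoc to locate the object by name.
text = pydoc.render_doc("glob", renderer=pydoc.plaintext)

# Show only the first few lines of the generated documentation.
for line in text.splitlines()[:5]:
    print(line)
```

Passing pydoc.plaintext as the renderer keeps terminal formatting characters out of the returned string.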

  • Pydoc as a web-service

Let's explore the most interesting feature of the Pydoc module, i.e., running Pydoc as a web service.

To do this, you would simply run the Pydoc as a script but with a -b argument which will start an HTTP server on an arbitrary unused port and open a Web browser to interactively browse the documentation. This is helpful, especially when you have various other services running on your system, and you do not remember which port would be in an idle state.

!python -m pydoc -b
^C

The moment you run the above cell, a new browser window opens on an arbitrary port number and shows an index of the modules available on your system.

From there you can open the documentation of any installed module, for example h5py, a Python interface to the HDF5 file format, which is often used to store the weights of neural network architectures.
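
If you want the HTML pages without keeping a server running, the -w option from the usage text shown earlier writes them to disk. The sketch below runs that command from Python in a temporary directory; using subprocess and tempfile here is my own choice for a self-contained example, not something the pydoc workflow requires.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Equivalent to running "python -m pydoc -w glob" inside tmp.
    subprocess.run(
        [sys.executable, "-m", "pydoc", "-w", "glob"],
        cwd=tmp,
        check=True,
        capture_output=True,
    )
    html_file = Path(tmp) / "glob.html"
    created = html_file.exists()
    size = html_file.stat().st_size if created else 0

print("glob.html created:", created)
```

In a real project you would write the file to a directory you keep, then open it in a browser or share it with colleagues.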

Conclusion

Congratulations on finishing the tutorial.

This tutorial focused on getting you started with docstrings by covering the essential topics. However, docstrings are a broad topic, and some concepts might have been left unexplored. If you would like to learn more, check out PEP 257, Python's docstring conventions.
