Docstrings in Python
If you are just getting started in Python and would like to learn more, take DataCamp's Introduction to Data Science in Python course.
A Python documentation string, commonly known as a docstring, is a string literal used in the definition of a class, module, function, or method. Docstrings are accessible through the __doc__ attribute of any Python object and through the built-in help() function. An object's docstring is defined by including a string constant as the first statement in the object's definition.
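As a minimal sketch of this definition, the hypothetical function greet below (not part of the original tutorial) has a string constant as its first statement, which Python then exposes through __doc__ and help():

```python
def greet(name):
    """Return a short greeting for name."""
    return f"Hello, {name}!"

# The first statement of the function body became the docstring.
print(greet.__doc__)

# help() renders the same text together with the function signature.
help(greet)
```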
Docstrings are great for understanding the functionality of larger parts of the code, i.e., the general purpose of a class, module, or function, whereas comments are used for individual lines, statements, and expressions, which tend to be small. Comments are descriptive text written by a programmer mainly for themselves, to record what a line of code or an expression does, and for any developer who wishes to contribute to the project. Documenting your code is an essential part of writing clean, well-structured programs.
Docstrings help you understand the capabilities of a module or a function. For example, let's say you installed the scikit-learn library and would like to know all about the sklearn package, such as its description and the modules it contains. You could simply use the help function to get all of this information.
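Let's quickly import the package and pass it to help, i.e., run import sklearn followed by help(sklearn). You would see the package name, its description, and a listing of the package contents.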
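Since scikit-learn may not be installed everywhere, here is a small sketch of the same idea that uses the standard-library json package as a stand-in (an assumption made purely for illustration). It relies on pydoc.render_doc, which builds the text that help() displays, and prints only the section headings:

```python
import json
import pydoc

# pydoc.render_doc builds the same text that help() displays;
# the plaintext renderer avoids terminal bold sequences.
help_text = pydoc.render_doc(json, renderer=pydoc.plaintext)

# Section headings are the unindented, upper-case lines of the help text.
headings = [
    line
    for line in help_text.splitlines()
    if line and not line.startswith(" ") and line.isupper()
]
print(headings)
```

Replacing json with sklearn should work the same way once scikit-learn is installed.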
Docstrings vs. Commenting
Docstrings are similar in spirit to comments, but they are an enhanced, more structured, and more useful form of commenting. Docstrings act as documentation for classes, modules, and packages.
Comments, on the other hand, are mainly used to explain non-obvious portions of the code and can be useful for notes on bug fixes and tasks that still need to be done.
Docstrings are enclosed in opening and closing quotes, while comments start with a # character.
Note that comments cannot be accessed with the built-in __doc__ attribute or the help function. Let's see what happens if you try doing so:
def string_reverse(str1):
    # Returns the reversed String.
    #
    # Parameters:
    #     str1 (str): The string which is to be reversed.
    #
    # Returns:
    #     reverse(str1): The string which gets reversed.
    reverse_str1 = ''
    i = len(str1)
    while i > 0:
        reverse_str1 += str1[i - 1]
        i = i - 1
    return reverse_str1

help(string_reverse)
Help on function string_reverse in module __main__:

string_reverse(str1)
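The difference can also be checked directly through the __doc__ attribute. In this short sketch, with_comment and with_docstring are hypothetical names chosen for illustration:

```python
def with_comment(x):
    # Doubles x. This is a comment, so the interpreter discards it.
    return 2 * x


def with_docstring(x):
    """Return x doubled."""
    return 2 * x


print(with_comment.__doc__)    # None
print(with_docstring.__doc__)  # Return x doubled.
```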
There are a couple of ways of writing a docstring, i.e., the one-line docstring and the multi-line docstring. Let's learn them one by one.
One-line docstrings are docstrings that fit entirely on one line. You can use either triple single quotes or triple double quotes, as long as the opening and closing quotes match. In a one-line docstring, the closing quotes are on the same line as the opening quotes. The standard convention is to use triple double quotes.
def square(a):
    '''Returned argument a is squared.'''
    return a**2

print(square.__doc__)
Returned argument a is squared.
help(square)
Help on function square in module __main__:

square(a)
    Returned argument a is squared.
From the above Docstring output, you can observe that:
- In this case, the line begins with a capital letter, i.e., R, and ends with a period.
- The closing quotes are on the same line as the opening quotes. This looks better for one-liners.
- A good practice to follow is having no blank line either before or after the Docstring, as shown in the above example.
- The output of the __doc__ attribute is less verbose than the output of the help() function.
Multi-line docstrings begin with the same kind of summary line as a one-line docstring, but it is followed by a single blank line and then the more detailed descriptive text.
The general format for writing a Multi-line Docstring is as follows:
def some_function(argument1):
    """Summary or Description of the Function

    Parameters:
    argument1 (int): Description of arg1

    Returns:
    int: Returning value
    """
    return argument1

print(some_function.__doc__)
Summary or Description of the Function

    Parameters:
    argument1 (int): Description of arg1

    Returns:
    int: Returning value

help(some_function)
Help on function some_function in module __main__:

some_function(argument1)
    Summary or Description of the Function

    Parameters:
    argument1 (int): Description of arg1

    Returns:
    int: Returning value
Let's look at an example that shows in detail how a multi-line docstring can be used:
def string_reverse(str1):
    '''
    Returns the reversed String.

    Parameters:
        str1 (str): The string which is to be reversed.

    Returns:
        reverse(str1): The string which gets reversed.
    '''
    reverse_str1 = ''
    i = len(str1)
    while i > 0:
        reverse_str1 += str1[i - 1]
        i = i - 1
    return reverse_str1

help(string_reverse)
Help on function string_reverse in module __main__:

string_reverse(str1)
    Returns the reversed String.

    Parameters:
        str1 (str): The string which is to be reversed.

    Returns:
        reverse(str1): The string which gets reversed.
You can see above that the summary is on a single line and is separated from the rest of the content by a single blank line. Following this convention is important because automatic indexing tools rely on it.
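To illustrate why the summary-line convention matters, here is a rough sketch of what an indexing tool might do. The helper summary_line is hypothetical, and the body of string_reverse is shortened to a slice to keep the example small:

```python
import inspect


def string_reverse(str1):
    '''
    Returns the reversed String.

    Parameters:
        str1 (str): The string which is to be reversed.

    Returns:
        reverse(str1): The string which gets reversed.
    '''
    return str1[::-1]


def summary_line(obj):
    """Return the first line of obj's docstring, or an empty string."""
    # inspect.getdoc strips common indentation and surrounding blank lines.
    doc = inspect.getdoc(obj)
    return doc.splitlines()[0] if doc else ""


print(summary_line(string_reverse))
```

Because the summary sits alone on the first line, a tool can pick it up without parsing the rest of the docstring.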
Python Built-in Docstring
Let's view the built-in Python Docstrings.
All the built-in functions, classes, and methods have a human-readable description attached to them. You can access it in one of two ways.
- The __doc__ attribute
- The help function
You would notice that the output of the help function is more verbose than that of the __doc__ attribute.
import math
print(math.__doc__)
This module provides access to the mathematical functions defined by the C standard.
Similarly, you can call the help function on the module, i.e., help(math), to get the more detailed listing.
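The difference in verbosity can be made concrete with a short sketch. It assumes pydoc.render_doc as a way to obtain the help text as a string instead of printing it:

```python
import math
import pydoc

short_text = math.__doc__
# render_doc returns the text that help(math) would display.
long_text = pydoc.render_doc(math, renderer=pydoc.plaintext)

print(len(short_text.splitlines()), "line(s) from __doc__")
print(len(long_text.splitlines()), "line(s) from the help() text")
```

The help text is longer because it also includes the docstring of every function in the module.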
Let's now look at some popular Docstring Formats and understand them in detail.
There are many docstring formats available, but it is always better to use a format that is easily recognized by docstring parsers and by fellow data scientists and programmers. There are no hard rules for selecting a docstring format, but using the same format consistently throughout a project is necessary. It is also preferable to use a format that is supported by Sphinx.
The most common formats used are listed below.
|Formatting Type|Description|
|NumPy/SciPy docstrings|Combination of reStructuredText and Google docstrings, supported by Sphinx|
|PyDoc|Standard documentation module for Python, supported by Sphinx|
|EpyDoc|Renders Epytext as a series of HTML documents; a tool for generating API documentation for Python modules based on their docstrings|
|Google Docstrings|Google's style|
Several documentation string formats are in use, but you do not need to study each of them from scratch. The formats follow very similar patterns and differ only in the finer details. You'll be looking at examples of the popular formats along with their use.
First, you will see the Sphinx style in detail, after which you can easily follow along with the other formats.
The Sphinx style is the traditional one; it is verbose and was initially created specifically for the Python documentation. Sphinx uses reStructuredText, a lightweight markup language similar in usage to Markdown.
class Vehicle(object):
    '''
    The Vehicle object contains lots of vehicles

    :param arg: The arg is used for ...
    :type arg: str
    :param `*args`: The variable arguments are used for ...
    :param `**kwargs`: The keyword arguments are used for ...
    :ivar arg: This is where we store arg
    :vartype arg: str
    '''

    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel a certain distance in vehicles without fuels, so here's the fuels

        :param distance: The amount of distance traveled
        :type distance: int
        :param bool destination: Should the fuels be refilled to cover required distance?
        :raises RuntimeError: Out of fuel
        :returns: A Car mileage
        :rtype: Cars
        '''
        pass
Like most programming languages, Sphinx has reserved keywords; in Sphinx docstrings they are written between colons and are commonly referred to as roles (Sphinx's own documentation calls these particular ones info fields). In the above code, param is one such role, and type is another, which gives the data type of the corresponding param. The type role is optional, whereas param is the role that actually documents a parameter. The returns role documents the returned object and differs from the param role. It does not depend on rtype, and vice versa; rtype states the type of the object returned by the function.
The Google style is easier and more intuitive to use. It can be used for the shorter form of documentation. To have Sphinx parse this style, you need to enable the Napoleon extension by adding either sphinx.ext.napoleon or sphinxcontrib.napoleon to the extensions list in conf.py.
class Vehicles(object):
    '''
    The Vehicles object contains a lot of vehicles

    Args:
        arg (str): The arg is used for...
        *args: The variable arguments are used for...
        **kwargs: The keyword arguments are used for...

    Attributes:
        arg (str): This is where we store arg
    '''

    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel distance in vehicles without fuels, so here is the fuels

        Args:
            distance (int): The amount of distance traveled
            destination (bool): Should the fuels be refilled to cover the distance?

        Raises:
            RuntimeError: Out of fuel

        Returns:
            cars: A car mileage
        '''
        pass
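The conf.py change needed for Sphinx to read this style might look like the fragment below. This is only a sketch: sphinx.ext.autodoc is included on the assumption that the docstrings are being pulled from source code, and the rest of the configuration is omitted.

```python
# conf.py (fragment): the Sphinx configuration file is plain Python.
extensions = [
    "sphinx.ext.autodoc",   # assumed here: extracts docstrings from the code
    "sphinx.ext.napoleon",  # parses Google and NumPy style docstrings
]
```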
The Google style is easier to read than the Sphinx style for short docstrings. It also has an inconvenience: in the above code, a multi-line description of distance would look messy. That is why the NumPy style can be used for more extended documentation.
The NumPy style puts a lot of detail into the documentation. It is more verbose than the other formats, but it is an excellent choice if you want detailed documentation, i.e., extensive documentation of all the functions and parameters.
class Vehicles(object):
    '''
    The Vehicles object contains lots of vehicles

    Parameters
    ----------
    arg : str
        The arg is used for ...
    *args
        The variable arguments are used for ...
    **kwargs
        The keyword arguments are used for ...

    Attributes
    ----------
    arg : str
        This is where we store arg
    '''

    def __init__(self, arg, *args, **kwargs):
        self.arg = arg

    def cars(self, distance, destination):
        '''We can't travel distance in vehicles without fuels, so here is the fuels

        Parameters
        ----------
        distance : int
            The amount of distance traveled
        destination : bool
            Should the fuels be refilled to cover the distance?

        Raises
        ------
        RuntimeError
            Out of fuel

        Returns
        -------
        cars
            A car mileage
        '''
        pass
The above example is more verbose than the other formats. It is lengthier and is best suited to long, detailed documentation.
As you learned, docstrings are accessible through the built-in __doc__ attribute and the help() function. You could also make use of the built-in module known as Pydoc, which offers different features and functionality compared to the __doc__ attribute and the help function.
Pydoc is a tool that comes in handy when you want to share code with your colleagues or make it open-source, in which case you would be targeting a much wider audience. It can generate web pages from your Python documentation and can also launch a web server.
Let's see how it works.
The easiest and most convenient way to run the Pydoc module is to run it as a script. To run it inside a JupyterLab cell, you would prefix the command with the exclamation mark (!) character.
- Pydoc as a module
!python -m pydoc
pydoc - the Python documentation tool

pydoc <name> ...
    Show text documentation on something.  <name> may be the name of a
    Python keyword, topic, function, module, or package, or a dotted
    reference to a class or function within a module or module in a
    package.  If <name> contains a '\', it is used as the path to a
    Python source file to document. If name is 'keywords', 'topics',
    or 'modules', a listing of these things is displayed.

pydoc -k <keyword>
    Search for a keyword in the synopsis lines of all available modules.

pydoc -n <hostname>
    Start an HTTP server with the given hostname (default: localhost).

pydoc -p <port>
    Start an HTTP server on the given port on the local machine.  Port
    number 0 can be used to get an arbitrary unused port.

pydoc -b
    Start an HTTP server on an arbitrary unused port and open a Web browser
    to interactively browse documentation.  This option can be used in
    combination with -n and/or -p.

pydoc -w <name> ...
    Write out the HTML documentation for a module to a file in the current
    directory.  If <name> contains a '\', it is treated as a filename; if
    it names a directory, documentation is written for all the contents.
If you look at the above output, the very first use of Pydoc is to show text documentation on a function, module, class, etc. Let's see how that compares with the help function.
!python -m pydoc glob
Help on module glob:

NAME
    glob - Filename globbing utility.

MODULE REFERENCE
    https://docs.python.org/3.7/library/glob

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

FUNCTIONS
    escape(pathname)
        Escape all special characters.

    glob(pathname, *, recursive=False)
        Return a list of paths matching a pathname pattern.

        The pattern may contain simple shell-style wildcards a la
        fnmatch. However, unlike fnmatch, filenames starting with a
        dot are special cases that are not matched by '*' and '?'
        patterns.

        If recursive is true, the pattern '**' will match any files and
        zero or more directories and subdirectories.

    iglob(pathname, *, recursive=False)
        Return an iterator which yields the paths matching a pathname pattern.

        The pattern may contain simple shell-style wildcards a la
        fnmatch. However, unlike fnmatch, filenames starting with a
        dot are special cases that are not matched by '*' and '?'
        patterns.

        If recursive is true, the pattern '**' will match any files and
        zero or more directories and subdirectories.

DATA
    __all__ = ['glob', 'iglob', 'escape']

FILE
    c:\users\hda3kor\.conda\envs\test\lib\glob.py
Now, let's extract the glob documentation using the help function, i.e., by calling help(glob) without importing the module first.
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-6-6f504109e3a2> in <module>
----> 1 help(glob)

NameError: name 'glob' is not defined
Well, as you can see, it throws a NameError because glob is not defined. So to pass a module object to the help function, you first need to import that module, which is not required with Pydoc.
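The same lookup by name is available from Python code. As a small sketch, pydoc.render_doc accepts the module name as a string and locates the module itself, so no import glob statement is needed in the session; help() accepts a string in the same way:

```python
import pydoc

# Passing the name as a string lets pydoc locate and import the module itself,
# so this session never runs "import glob".
text = pydoc.render_doc("glob", renderer=pydoc.plaintext)
print(text.splitlines()[0])

# help() accepts a string in the same way:
# help("glob")
```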
- Pydoc as a web-service
Let's explore the most interesting feature of the Pydoc module, i.e., running Pydoc as a web service.
To do this, you would simply run Pydoc as a script but with the -b argument, which will start an HTTP server on an arbitrary unused port and open a web browser to interactively browse the documentation. This is especially helpful when you have various other services running on your system and you do not remember which ports are free.
!python -m pydoc -b
The moment you run the above cell, a new browser window will open on an arbitrary port number, showing an index of the modules available on your system.
Let's look at the documentation of the h5py module, a Python interface to the HDF5 file format, which is often used to store the weights of neural networks.
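Besides serving pages in the browser, the -w option from the usage text above writes the HTML to a file. A rough equivalent from Python is pydoc.writedoc, sketched here for the standard-library glob module; the temporary directory is only an assumption to avoid cluttering the working directory:

```python
import os
import pydoc
import tempfile

original_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    os.chdir(tmp)  # writedoc always writes into the current directory
    try:
        pydoc.writedoc("glob")  # creates glob.html and prints a short confirmation
        html_size = os.path.getsize(os.path.join(tmp, "glob.html"))
    finally:
        os.chdir(original_dir)  # leave the temp directory before it is deleted

print(f"glob.html was {html_size} bytes")
```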
Congratulations on finishing the tutorial.
This tutorial primarily focused on getting you started with docstrings by covering the essential topics. However, docstrings are a vast topic, and some concepts might have been left unexplored. If you would like to learn more, check out PEP 257, Docstring Conventions.