Coding is the art of writing the instructions – also known as algorithms – for a computer to do a specific task. To communicate with computers, developers use programming languages.
Like natural languages, such as English, Russian, or Quechua, programming languages comprise a specific set of syntactic and semantic rules that provide the basics for communication. Although natural languages are more complex, flexible, and dynamic, these attributes are also applicable, yet to a more limited extent, to programming languages.
You can write even the simplest algorithm in many different ways. While some flexibility is desirable when developing code, it can compromise efficiency and effective communication, especially when multiple people are working on it.
As such, readability is a critical aspect when writing code. To ensure that developers are on the same page, most programming languages have developed coding standards. These documents provide guidelines and best practices to produce readable, maintainable, and scalable code.
In this article, we will discover the coding best practices for Python, one of the most popular data science languages. The practices presented in the following sections are mostly on PEP 8, the standard guide of best practices for writing code in Python. To learn more about what Python is used for, check out our article or our Introduction to Python course.
What is PEP 8?
“Code is read much more often than it is written. Code should always be written in a way that promotes readability” – Guido van Rossum, the inventor of Python
PEP 8, is a style guide for Python code. Written in 2001 by Guido van Rossum, Barry Warsaw, and Nick Coghlan, PEP 8 provides a set of recommendations to write more readable and consistent code. It covers everything from how to name variables, to the maximum number of characters that a line should have.
PEP stands for Python Enhancement Proposal. A PEP is a document that describes new features proposed for Python and documents aspects of Python, like design and style.
While not mandatory, much of the Python community uses PEP 8. So it is highly advisable to follow the guidelines. By doing that, you will improve your credentials as a professional programmer.
Despite the wide acceptance of PEP 8, the guidelines may not fit all particular scenarios. In those cases, it’s a common practice for companies to define their own conventions.
6 Python Best Practices
Python best practices for code quality
Writing readable and organized code can make a difference and boost your career prospects. While coding may seem a mechanic and complex process for beginners, the truth is that coding is an art.
There are many tips you can follow to increase code quality in Python. Here is a list of some of the most relevant.
The indentation tyranny
Indentation refers to the spaces at the beginning of a code line. While indentation in other programming languages is only a resource for readability, indentation in Python is mandatory. Python uses indentation to open a block of code. More precisely, 4 consecutive spaces per indentation level, as shown in the following code:
for number in [1,2,3,4]: if number % 2 == 0: print(number) else: continue
Maximum line length
PEP 8 recommends that no line should be longer than 79 characters. This makes sense, as shorter lines are easier to read. Also, this length allows you to have multiple files open next to one another.
Surround top-level function and class definitions with two blank lines. Method definitions inside a class are surrounded by a single blank line. Extra blank lines may be used (sparingly) to separate groups of related functions. Finally, use blank lines in functions, (sparingly) to indicate logical sections.
Use linters and autoformaters
Code mastery takes time. Paying attention to all the details while coding may be challenging and time-consuming. Fortunately, machines can help us to ensure code quality, in particular, linters and formatters.
Linters perform static analysis of source codes and check for symantic discrepancies. Formatters are similar tools that try to restructure your code spacing, line length, argument positioning, and so on to ensure that your code looks consistent across different files or projects. Python offers you a plethora of linters and formatters to choose from.
Keep in mind principles
While some of the aforementioned rules are straightforward, more often than not, coding is about good taste and intuition. To become a coding artist, you should know about some of the principles that underpin Python. A great example of such principles is the Zen of Python, covered in a separate article.
Python logging best practices
Logging is a means of tracking events that happen when some software runs. Especially when applications grow in size and complexity, logging becomes a critical technique for developing, debugging, running, and tracking performance.
To deal with logging practices, the logging module has been a part of Python’s Standard Library since version 2.3. The module is the first go-to package for most Python developers when it comes to logging. It is extensively described in PEP 282.
Logging is similar in spirit to a Print function. However, a print function lacks a lot of information that might be useful to a developer. Logging, however, can log the timestamps and line number at which the error occurred. It can also log errors or any information to files, sockets, etc.
To import the module, you just have to run this code:
Not every log message is similar. Indeed, the module provides a category of logging levels according to the gravity of the message. It goes as follows:
- NOTSET (0): This means that all messages will be logged.
- Debug (10): Useful for diagnosing issues in the code.
- Info (20): It can act as an acknowledgment that there are no bugs in the code. One good use-case of Info level logging is the progress of training a machine learning model.
- Warning (30): Indicative of a problem that could occur in the future. For example, a warning of a module that might be discontinued in the future or a low-ram warning.
- Error (40): A serious bug in the code; it could be a syntax error, out-of-memory error, or exceptions.
- Critical (50): An error due to which the program might stop functioning or might exit abruptly.
import logging logging.info('This message is just informative') logging.debug('My missions is to provide information about debugging.') logging.critical("A Critical Logging Message")
Having said this, below you can find a list of six best practices when dealing with logging in Python.
Use logs instead of prints
We previously said that logs provide information in the same vein as the print functions. However, logs are way more powerful, as they can provide more granular information.
Going for the Print function may be tempting, especially if you are not familiar with logging routines, but logs will always be the safest option. They are better suited to scale and handle complex applications.
The web is full of guides and documentation. For example, this DataCamp logging tutorial may be what you are looking for to get started.
Use the logging module
This module is the go-to option for most Python developers. This means that the module is well maintained and backed by a huge community that will always have an answer to your doubts.
Choose the logging level wisely
The logging module comes with six different levels of messages. Each of them is designed for a specific purpose. The more you stick to them, the easier it will be for you and users to understand what it’s going on in your code.
Use timestamps when logging
This is a critical feature of logs that print functions don’t have. In addition to knowing where a problem appeared, it’s important to know when it happened. Timestamps are your biggest allies in these situations. Make sure to use the standard format to write timestamps, i.e., the ISO-8601 format.
Empty the logging bin
Use RotatingFileHandler classes, such as the TimedRotatingFileHandler, instead of FileHandler, to archive, compress, or delete old log files to prevent memory space issues. These classes will clean space once a size limit or a time condition is triggered.
Python commenting best practices
Comments are very important to make annotations for future readers of our code. Although it is difficult to define how the code should be commented on, there are certain guidelines that we can follow:
- Any comment that contradicts the code is worse than no comment. That is why it is very important that we update the code, and do not forget to update the comments to avoid creating inconsistencies.
- Comments must be complete sentences, with the first letter capitalized.
- Try to write comments in English. Although everyone is free to write their comments in the language they consider appropriate, it is recommended to do so in English.
- Ensure that your comments are clear and easily understandable to other speakers of the language you are writing in.
There are two types of comments: block comments and inline comments.
A block comment explains the code that follows it. Typically, you indent a block comment at the same level as the code block. Each line of a block comment starts with a # and a single space, as follows:
# This is a block comment print('Welcome to DataCamp!")
On the other hand, inline comments appear at the same level as code. Inline comments should be separated by at least two spaces from the statement. Similar to a block comment, an inline comment begins with a single hash sign (#) and is followed by a space and a text string.
print ("Have you ever tried Python?") # This is an inline comment
Python docstring best practices
A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Typically, you use a documentation string to automatically generate the code documentation. As such, a docstring can be accessed at run-time using the obj.__doc__ special attribute of that object.
For consistency, docstrings are always enclosed in triple double quotes (""").
There are two forms of docstrings: one-liners and multi-line docstrings. One-liners are for really obvious cases. They should really fit on one line. The following example illustrates a one-line docstring in the multiply() function:
def multiply(x, y): """ Return the product of two numbers """ result = x * y return result
On the other hand, a multi-line docstring can span multiple lines. Such a docstring should document the script’s function and command line syntax, environment variables, and files. Usage messages can be fairly elaborate and should be sufficient for a new user to use the command properly.
def calculate_salary(hours, price=20): """ Return the salary according to the hours worked Arguments: hours: total hours investe price: price of each hours worked. Minimum price is 20 dollars """ salary = hours * price return salary
A more detailed explanation of docstrings can be found in PEP257.
Python documentation best practices
While comments are important for other developers you are working with to understand your code, documentation is intended to help a community of potential users how to use your software. This is a critical step: no matter how good your software is, if the documentation is not good enough, or even worse, missing, people will not use it.
Especially if you are working on a new project, it’s always a good practice to include a README file. These files are normally the main entry point for readers of your code. They include general information to both users and maintainers of the project.
While concise, the README file should explain the project's purpose, the URL of the software's main source, and credit information.
Equally, you should always include a setup.py file, which ensures that the software or library has been packaged and distributed with Distulils, which is the standard for distributing Python modules.
Moreover, if your software requires other dependencies or packages to run, you should include a requirements.txt files, which describes the dependencies needed and their versions.
Finally, you are free to add any other information you may think it’s relevant to potential users. For example, a good practice is to provide examples of how your package or functions work.
Python virtual environment best practices
To ensure order and consistency across your data projects, a good practice is to create a virtual environment for every project you start.
Virtual environmental, also known as virtualenvs, helps decouple and isolate versions of Python and the libraries required for your project, and their related pip versions. This authorizes end-users to install and manage their own set of software and versions that is independent of those provided by the system.
To create virtualenvs, we need to install first the required package, writing the following line of code in your terminal:
pip install virtualenv
To create a new virtualenv, the following code will make the magic. We are calling the new virtualenv “new_venv”
python3 -m venv /path/to/new/virtual/environment/new_venv
Python Naming Conventions
One of the most common mistakes among beginner Python developers is improper naming of variables, classes, functions, etc. Naming conventions are critical elements to ensure the readability of code.
In Python, the standard way of naming objects is explained in PEP 8. According to the guidelines, new modules and packages (including third-party frameworks) should be written to these standards, but where an existing library has a different style, internal consistency is preferred.
In general, you should avoid using names that are too general or too worldly. Also, identifiers used in the standard library must be ASCII compatible as described in PEP 3131.
# Bad naming: data_structure, ríos_españa, dictionary_with_countries_and_capitals # Good naming: user_profile, stop_words, global_emissions_df
Below you can find a list with the naming convention for the most common objects in Python.
- Packages and modules
- Package and modules names should be all lower case
- Underscores can be used in the package and module name only if it improves readability.
- Class names should follow the UpperCaseCamelCase convention
- Python’s built-in classes, however, are typically lowercase words
- Instance Variables, methods, and functions
- Instance variable names should be all lower case
- Use underscore to add new words to instances variables
- Non-public instance variables should begin with a single underscore
- Constant names must be fully capitalized
- Use underscore to add new words to constants
Python Best Practices FAQs
Why following coding best practices is important?
To ensure readability, quality code, and scalability during software development activities
What is a virtual environment?
A virtual environment is a tool that separates the dependencies of different projects by creating a separate isolated environment for each project.
What is PEP 8?
It is a document that provides standard guidelines and best practices on how to write Python code.
What is indentation?
Indentation refers to the spaces at the beginning of a code line. Python uses indentation to indicate a block of code.
What is the Zen of Python?
The Zen of Python is a collection of 19 "guiding principles" for writing computer programs that influence the design of the Python programming language.
What is a docstring?
A Python docstring is a string used to document a Python module, class, function, or method, so programmers can understand what it does without having to read the details of the implementation
How many types of comments are common in Python?
There are two types of comments: block comments and inline comments.
Courses for Python
Writing Efficient Python Code
The 23 Top Python Interview Questions & AnswersEssential Python interview questions with examples for job seekers, final-year students, and data professionals.
Working with Dates and Times in Python Cheat SheetWorking with dates and times is essential when manipulating data in Python. Learn the basics of working with datetime data in this cheat sheet.
DataCamp Team •
Plotly Express Cheat SheetPlotly is one of the most widely used data visualization packages in Python. Learn more about it in this cheat sheet.
DataCamp Team •
Getting started with Python cheat sheetPython is the most popular programming language in data science. Use this cheat sheet to jumpstart your Python learning journey.
DataCamp Team •
Python pandas tutorial: The ultimate guide for beginnersAre you ready to begin your pandas journey? Here’s a step-by-step guide on how to get started. [Updated November 2022]
Vidhi Chugh •