
Pip Python Tutorial for Package Management

Learn about Pip, a powerful tool that helps you properly manage distribution packages in Python.
Updated Feb 16, 2023  · 11 min read

If you are considering becoming a data scientist, the sooner you start learning how to code, the better. Data professionals spend a great deal of their time coding. Programming languages are the key tools that allow data professionals to analyze and extract meaningful insights from vast amounts of data. 

Probably the most popular programming language for data science is Python. Python is an open-source, general-purpose, and powerful programming language, with applications in many software domains, such as web development, game development, and, of course, data science. 

While Python alone is already capable of many things, data professionals (and, more broadly, software developers) often rely on additional packages, also known as libraries, to make their lives easier. A package is a collection of related files, modules, and dependencies that can be used repeatedly in different applications and problems.

One of the key strengths of Python is its wide catalog of well-documented and comprehensive libraries. Where are these libraries hosted? How can you install and manage the packages of your interest? 

In this tutorial, you will be introduced to the world of packages in Python and pip, the standard package installer for Python. Pip is a powerful tool that will allow you to leverage and manage the many Python packages that you will come across as a data professional and a programmer.

Understanding Packages in Python

Let’s use a metaphor to understand what pip is. Imagine Python is a nice and balanced toolbox with the essential items you will need to code. When you buy (install) Python on your computer, it comes with a wide collection of additional tools (packages) that you can use anytime. 

The so-called Python Standard Library is an extensive set of built-in packages that provides standardized solutions for many problems that occur in everyday programming. Since these packages come bundled in modern Python distributions, you can use them without any additional installation required. You just have to “import” them to your working space (more on this coming up later).
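As a small illustration of using Standard Library packages without installing anything, the sketch below imports two of them, statistics and json. The sample scores are made up for the example.

```python
# Both modules ship with Python, so no installation step is needed.
import json
import statistics

scores = [82, 91, 77, 95]  # made-up sample data
mean_score = statistics.mean(scores)

# Serialize a small summary to a JSON string.
summary = json.dumps({"mean": mean_score, "n": len(scores)})
print(summary)
```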

However, sometimes you will not find the tool you are looking for in Python or its Standard Library. In these cases, you will need to get new tools elsewhere. Fortunately, the internet is a huge store where you can find hundreds of thousands of packages developed by Python developers for all kinds of purposes. 

And the best thing? The vast majority of these packages are free to use. If you want to know more about packages in Python and how to develop your own packages, check out our Developing Python Packages Course.

While third-party packages can be hosted in different locations, the most popular and comprehensive repository is the Python Package Index (PyPI). With over 300,000 available Python packages, PyPI is a giant online repository where Python developers publish and share their packages.

Once you have identified the package you are looking for, you will need to download and install it on your computer to use it. How can you do that? This is where package managers come into play.

Understanding package managers: pip

A package manager (also called a package-management system) is a tool that automates the process of installing, upgrading, configuring, and removing packages for a computer in a consistent manner. 

Package managers are designed to eliminate the need for manual installs and updates, and they make sure that a package is installed together with all the dependencies it requires to function. In addition, since package managers fetch packages from well-known repositories, like PyPI and Anaconda, they help maintain the integrity and authenticity of the packages you install.

The most popular package manager for Python is pip. First released in 2008, pip (a recursive acronym for “pip Installs Packages”) is today the standard tool for installing Python packages and their dependencies. Most recent distributions of Python come with pip preinstalled: Python 2.7.9 and later, as well as Python 3.4 and later, include pip by default.

Pip is a powerful and user-friendly tool that allows you to manage Python packages using a handful of commands. Although pip uses PyPI as its default repository for fetching packages, it can also install packages from other sources, including:

  • Version control systems like Git (including repositories hosted on GitHub), Mercurial, Subversion, and Bazaar.
  • Requirements files. Usually, Python projects require multiple packages to run. To install all the necessary packages at once, pip can read a requirements file (conventionally named requirements.txt), which lists the necessary packages and, optionally, the required versions.
  • Distribution files. These are versioned, ready-to-install files containing Python packages, modules, and other resource files necessary for a package to function. They come in two forms:
    • Source distribution (usually shortened to “sdist”)
    • Wheel distribution (usually shortened to “wheel”)
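To make the two distribution formats more concrete, here is a minimal sketch that guesses the format from a file name. It assumes the usual conventions (wheels end in .whl, source distributions are typically .tar.gz archives, and occasionally .zip); the function name and the example file names are hypothetical, and this is not how pip itself inspects files.

```python
def distribution_type(filename: str) -> str:
    """Guess the distribution format from a file name (simplified)."""
    name = filename.lower()
    if name.endswith(".whl"):
        return "wheel"
    if name.endswith((".tar.gz", ".zip")):
        return "sdist"
    return "unknown"


# Hypothetical file names that follow the usual naming patterns.
for candidate in ["pandas-1.4.0.tar.gz", "pandas-1.4.0-cp39-cp39-win_amd64.whl"]:
    print(candidate, "->", distribution_type(candidate))
```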

To use pip and start managing packages, you first need to make sure that it is installed on your computer. To check whether pip is available, run one of the following commands in the command line:

>>pip3 --version

>>pip --version

pip vs pip3 vs pip2

After reading the previous section, you may be wondering what the difference is between pip and pip3. Following the release of Python 3, pip incorporated the new pip3 command, which operates on the Python 3 environment of your computer. The same goes for the pip2 command and Python 2. So, if you want to make sure that pip operates on your Python 3 environment or your Python 2 environment, use the pip3 or pip2 commands, respectively.

By contrast, the plain pip command operates on whichever Python environment it was installed for, which is usually the first one found on your system’s PATH. This matters when you have both Python 2 and Python 3 installed on your computer.

For example, older versions of macOS shipped with Python 2 preinstalled. If you are working in a Python 2 environment, the pip command will install, uninstall, upgrade, or manage Python packages for Python 2. The same applies if you are working in a Python 3 environment.

In these situations, you should be certain of the Python environment you are working in before using the pip command. Otherwise, you may end up modifying packages in the wrong environment.
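One way to remove the ambiguity is to run pip as a module of a specific interpreter (python -m pip), so the packages always go to that interpreter’s environment. The sketch below shows which pip executables are found on your PATH and builds such a command for the interpreter that runs the script; the helper name pip_command is our own, not part of pip.

```python
import shutil
import sys


def pip_command(*args):
    """Build a pip command tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


# Show where each pip executable resolves to, if it is on the PATH at all.
for name in ("pip", "pip2", "pip3"):
    print(f"{name}: {shutil.which(name) or 'not found on PATH'}")

# The resulting list can be passed to subprocess.run() to execute it.
print(" ".join(pip_command("install", "pandas")))
```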

pip in action

Now that you know the basics of pip and have it installed on your computer, let’s see how you can use it!

Installing Packages with pip

The most common use of pip is installing Python packages. For example, if you want to install pandas, the standard package to manipulate data frames in Python, the simplest way to do it is by running the following instruction:

>>pip install pandas
[...]
Successfully installed pandas

You may need to install a package in a certain version. This is pretty easy with pip. You just have to specify the version you want to install. Say you want to install version 1.4.0 of pandas:

>>pip install pandas==1.4.0

If you want to install a package that meets certain version constraints, pip lets you combine version specifiers. For example, to install a pandas version greater than or equal to 1.0.0 and less than 1.5.0 (the quotes prevent the shell from interpreting > and < as redirections):

>>pip install "pandas>=1.0.0,<1.5.0"
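To see how such a combined constraint is evaluated, here is a deliberately simplified sketch. It only handles plain numeric versions such as 1.4.0 and the basic comparison operators; pip’s real version handling covers many more cases (pre-releases, wildcards, and so on). The function names are our own.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
_OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def _parse(version):
    """Turn '1.4.0' into [1, 4, 0]; numeric components only."""
    return [int(part) for part in version.strip().split(".")]


def satisfies(version, specifiers):
    """Return True if version meets every comma-separated specifier."""
    for clause in specifiers.split(","):
        clause = clause.strip()
        for symbol, compare in _OPERATORS.items():
            if clause.startswith(symbol):
                left, right = _parse(version), _parse(clause[len(symbol):])
                # Pad with zeros so that 1.5 and 1.5.0 compare as equal.
                width = max(len(left), len(right))
                left += [0] * (width - len(left))
                right += [0] * (width - len(right))
                if not compare(left, right):
                    return False
                break
        else:
            raise ValueError(f"Unsupported specifier: {clause!r}")
    return True


print(satisfies("1.4.0", ">=1.0.0,<1.5.0"))
print(satisfies("1.5.0", ">=1.0.0,<1.5.0"))
```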

Installing the scikit-learn Package with pip

In the following example, you will install the scikit-learn package. Pip will also install the other packages that scikit-learn needs to work.

>>pip install scikit-learn


You may notice from the logs that more than the scikit-learn package is being installed. This is because pip will install any other packages that scikit-learn depends on. These other packages are called dependencies.
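If you are curious about which dependencies an installed package declares, the Standard Library module importlib.metadata (Python 3.8 and later) can read them. The sketch below is a minimal example; the helper name is our own, and it returns an empty list when the package is not installed in the current environment.

```python
from importlib import metadata


def declared_dependencies(package):
    """List the requirement strings an installed package declares."""
    try:
        requirements = metadata.requires(package)
    except metadata.PackageNotFoundError:
        return []  # the package is not installed in this environment
    return list(requirements or [])


# Prints an empty list if scikit-learn is not installed.
print(declared_dependencies("scikit-learn"))
```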

Installing a List of Packages Using pip Requirements Files

When you work on collaborative projects, it’s important that all members of the team use the same packages with the same versions. The best way to ensure this is to install packages from a requirements file. This is usually a text file that contains all the packages, along with their respective versions, that are used in the project.

Pip allows you to install a list of packages at once using a requirements file. For example, if our project needs the packages numpy, pandas, and TensorFlow, we could list them, along with the desired versions, in a requirements.txt file, with one package per line.

To install the packages listed in a requirements.txt file, we just need to run:

>>pip install -r requirements.txt
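For illustration, the sketch below writes a small requirements.txt to a temporary folder and reads it back as (package, version specifier) pairs. The pinned versions are only examples, not recommendations, and the parser is a simplification that handles plain "name==version" style lines and comments, not every option pip accepts.

```python
import re
import tempfile
from pathlib import Path

# Illustrative content; choose the versions your own project needs.
SAMPLE = """\
# Example requirements file
numpy==1.24.2
pandas==1.5.3
tensorflow==2.11.0
"""


def read_requirements(path):
    """Return (name, specifier) pairs, skipping comments and blank lines."""
    entries = []
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        match = re.match(r"^([A-Za-z0-9._-]+)\s*(.*)$", line)
        if match:
            entries.append((match.group(1), match.group(2)))
    return entries


with tempfile.TemporaryDirectory() as folder:
    requirements_path = Path(folder) / "requirements.txt"
    requirements_path.write_text(SAMPLE)
    parsed = read_requirements(requirements_path)

print(parsed)
```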

If you want to create a requirements file to share with the rest of the team, you can use the following instruction:

>>pip freeze > requirements.txt
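To get a feel for what ends up in that file, the following sketch produces similar "name==version" lines using the Standard Library module importlib.metadata (Python 3.8 and later). It is an approximation for learning purposes, not a replacement for pip freeze, which handles more cases such as packages installed from version control.

```python
from importlib import metadata


def frozen_requirements():
    """Return 'name==version' lines for the packages in this environment."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with incomplete metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


for line in frozen_requirements():
    print(line)
```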

Installing Packages from GitHub with pip

Besides PyPI, there are other places on the Internet where Python packages can be hosted. Code hosting platforms like GitHub, which is built on the Git version control system, let you upload and share packages with the Python community.

Let’s say you want to install a package hosted on GitHub. Pip needs Git to be installed on your computer and the URL of a repository that contains an installable Python project, prefixed with git+. You can optionally append @ followed by a branch, tag, or commit. For example:

>>pip install git+https://github.com/pypa/sampleproject.git@main
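The structure of that argument can be sketched as a small helper that assembles it from a repository URL and an optional branch, tag, or commit. The function name is our own; pip simply receives the resulting string.

```python
def git_requirement(repository_url, ref=""):
    """Build the 'git+<url>@<ref>' string that pip accepts for Git installs."""
    requirement = f"git+{repository_url}"
    if ref:
        requirement += f"@{ref}"
    return requirement


print(git_requirement("https://github.com/pypa/sampleproject.git", "main"))
```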

Upgrading Packages with pip

Sometimes you will need to upgrade to a newer version of a package you have already installed on your computer. Pip makes this process extremely easy. For example, if you want to upgrade pandas to the latest version:

>>pip install --upgrade pandas

If you want to upgrade all the packages listed in a requirements.txt file, you could use the following command (packages pinned to an exact version with == will stay at that version):

>>pip install -r requirements.txt --upgrade

It’s important to note that pip also performs an automatic uninstall of an old version of a package before upgrading to a newer version.

Removing Packages with pip

Removing a package is very easy with pip. If you want to uninstall pandas, you just have to run:

>>pip uninstall pandas

Additional pip Commands 

While installing, upgrading, and removing packages are the most common actions you will do with pip, there are also other commands worth mentioning.

If you need extra information about the different pip commands available and how to use them, run:

>>pip help

To list all the packages installed on your environment:

>>pip list

To see a summary of a package of your interest:

>>pip show [NameOfPackage]
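Part of this information is also available from within Python. The sketch below uses the Standard Library module importlib.metadata (Python 3.8 and later) to collect a few of the fields for an installed package; the helper name is our own, and it returns None when the package is not installed.

```python
from importlib import metadata


def package_summary(package):
    """Return basic metadata for an installed package, or None if missing."""
    try:
        info = metadata.metadata(package)
    except metadata.PackageNotFoundError:
        return None
    return {
        "name": info["Name"],
        "version": info["Version"],
        "summary": info["Summary"],
    }


# Prints None if pandas is not installed in the current environment.
print(package_summary("pandas"))
```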

Troubleshooting pip

Although pip is a fairly simple package, with just a handful of commands, problems can always arise. Here is a list of the most common issues and how to fix them.

Installing pip

Although unusual, it’s possible that pip isn’t installed. In this case, the easiest way to install pip is by running the statement below. This will make Python trigger the built-in package ensurepip, which is designed to install pip in a Python environment. 

>>python3 -m ensurepip --default-pip

In case the problem persists, check out the pip documentation to try alternative solutions.
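Before running ensurepip, it can help to confirm that the module is present in your Python installation, since some operating system distributions package it separately. The sketch below checks this and reports the pip version that ensurepip would install; the helper name is our own.

```python
def bundled_pip_version():
    """Return the pip version bundled with ensurepip, or None if unavailable."""
    try:
        import ensurepip
    except ImportError:
        return None  # ensurepip is not part of this Python installation
    return ensurepip.version()


print(bundled_pip_version())
```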

Pip is not up-to-date

New versions of a package can bring bug fixes, new features, and faster performance. This applies to every Python package, including pip. If you are using an older version of pip, you may experience unexpected behavior. That’s why it’s always recommended to keep pip up-to-date, as well as the setuptools and wheel packages, which are useful to ensure you can also install packages from source archives. 

>>pip install --upgrade pip setuptools wheel

Conclusion

We hope you enjoyed this tutorial! As the standard tool for installing packages, pip is vital for Python developers. It’s fairly easy to use, which makes package management a straightforward process once you get familiar with the commands.


Python pip FAQs

What is a package in Python?

A package is a collection of related files, modules, and dependencies that can be used repeatedly in different applications and problems.

What is a package manager?

A package manager is a tool that automates the process of installing, upgrading, configuring, and removing packages for a computer in a consistent manner.

What is pip?

pip is the standard package installer for Python.

How can you install a package with pip?

Use pip install [NameOfPackage]

How can you upgrade a package with pip?

Use pip install --upgrade [NameOfPackage]

How can you remove a package with pip?

Use pip uninstall [NameOfPackage]
