Python Seaborn Tutorial For Beginners

August 10th, 2017 in Data Visualization

Seaborn: Python's Statistical Data Visualization Library

One of the best but also more challenging ways to get your insights across is to visualize them: that way, you can more easily identify patterns, grasp difficult concepts or draw attention to key elements. When you’re using Python for data science, you will most probably have already used Matplotlib, a 2D plotting library that allows you to create publication-quality figures. Another complementary package that is based on this data visualization library is Seaborn, which provides a high-level interface to draw statistical graphics.

Today’s post will cover some of the most frequently asked questions that users have when they start out working with the Seaborn library. How many of the following questions can you answer correctly?

Interested in a course that covers Matplotlib and Seaborn? Take DataCamp’s Introduction to Data Visualization with Python.

Seaborn vs Matplotlib

As you have just read, Seaborn is complementary to Matplotlib and it specifically targets statistical data visualization. But it goes even further than that: Seaborn extends Matplotlib and that’s why it can address the two biggest frustrations of working with Matplotlib. Or, as Michael Waskom says in the “introduction to Seaborn”: “If matplotlib “tries to make easy things easy and hard things possible”, seaborn tries to make a well-defined set of hard things easy too.”

One of these hard things or frustrations has to do with the default Matplotlib parameters. Seaborn works with different parameters, which undoubtedly speaks to those users who don’t like the default looks of the Matplotlib plots.

Compare the following plots:

[Figures: a violin plot drawn with Matplotlib and the same violin plot drawn with Seaborn]

The Matplotlib defaults that usually don’t speak to users are the colors, the tick marks on the upper and right axes, the style,…

The examples above also make another frustration of users more apparent: the fact that working with DataFrames doesn’t go quite as smoothly with Matplotlib, which can be annoying if you’re doing exploratory analysis with Pandas. And that’s exactly what Seaborn addresses: the plotting functions operate on DataFrames and arrays that contain a whole dataset.

As Seaborn complements and extends Matplotlib, the learning curve is quite gradual: if you know Matplotlib, you’ll already have most of Seaborn down.

If you feel your matplotlib skills are rusty, check out the following resources:

How To Load Data To Construct Seaborn Plots

When you’re working with Seaborn, you can either use one of the built-in data sets that the library itself has to offer or you can load a Pandas DataFrame. In this section, you’ll see how to do both.

Loading A Built-in Seaborn Data Set

To start working with a built-in Seaborn data set, you can make use of the load_dataset() function. To get an overview or inspect all data sets that this function opens up to you, go here. Check out the following example to see how the load_dataset() function works:
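The original example is not reproduced here, so what follows is only a minimal sketch of the call. The wrapper name load_builtin is hypothetical and exists purely to keep the snippet importable without a connection: load_dataset() downloads the data from Seaborn's online example repository the first time you ask for a data set.

```python
import seaborn as sns


def load_builtin(name="tips"):
    """Return one of Seaborn's example data sets as a Pandas DataFrame.

    Hypothetical helper: it only forwards to sns.load_dataset(), which
    needs an internet connection the first time a data set is requested.
    """
    return sns.load_dataset(name)
```

Calling load_builtin("tips") gives you a DataFrame that you can pass straight to the data argument of a Seaborn plotting function.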


As an anecdote, it might be interesting for you to know that the import convention sns comes from the fictional character Samuel Norman “Sam” Seaborn on the television serial drama The West Wing. It’s an inside joke by the core developer of Seaborn, namely, Michael Waskom.

Loading Your Pandas DataFrame

Of course, most of the fun in visualizing data lies in the fact that you would be working with your own data and not the built-in data sets of the Seaborn library. Seaborn works best with Pandas DataFrames and arrays that contain a whole data set.

Remember that DataFrames are a way to store data in rectangular grids that can easily be overviewed. Each row of these grids corresponds to measurements or values of an instance, while each column is a vector containing data for a specific variable. This means that a DataFrame’s rows do not need to contain, but can contain, the same type of values: they can be numeric, character, logical, etc. Specifically for Python, DataFrames come with the Pandas library, and they are defined as two-dimensional labeled data structures with columns of potentially different types.

One reason why Seaborn is so great with DataFrames is that labels from DataFrames are automatically propagated to plots or other data structures, as you saw in the first example of this tutorial, where you plotted a violinplot with Seaborn. There, you saw that the x-axis had the label total_bill, while this was not the case with the Matplotlib plot. This already takes a lot of work away from you.
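To see the label propagation for yourself, here is a small sketch. The DataFrame bills is made up for illustration and only stands in for your own data; the column name is what ends up on the axis.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for your own data set
bills = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88],
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"],
})

# Seaborn looks up the column by name and reuses that name as the axis label
ax = sns.violinplot(x="total_bill", data=bills)
xlabel = ax.get_xlabel()
print(xlabel)
```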

But that doesn’t mean that all the work is done, quite the opposite. In many cases, you’ll still need to manipulate your Pandas DataFrame so that the plot will render correctly. If you want to know more, check out DataCamp’s Pandas Tutorial on DataFrames in Python or the Pandas Foundations course.

How To Show Seaborn Plots

Matplotlib still underlies Seaborn, which means that the anatomy of the plot is still the same and that you’ll need to use plt.show() to make the image appear to you. You might have already seen this from the previous example in this tutorial. In any case, here’s another example where the show() function is used to show the plot:
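The interactive chunk from the original post is not available here, so the snippet below is a stand-in sketch under two assumptions: it uses a small made-up DataFrame instead of a built-in data set so that it runs offline, and it assumes a recent Seaborn release, where the categorical Figure-level function goes by the name catplot() rather than factorplot().

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: two bills per day/sex combination
bills = pd.DataFrame({
    "day": ["Sat"] * 4 + ["Sun"] * 4,
    "sex": ["Male", "Male", "Female", "Female"] * 2,
    "total_bill": [20.65, 17.92, 20.29, 15.77, 16.99, 10.34, 21.01, 24.59],
})

# kind="bar" turns the categorical plot into a bar plot;
# palette picks the colors and legend=False suppresses the legend
g = sns.catplot(x="day", y="total_bill", hue="sex", data=bills,
                kind="bar", palette="muted", legend=False)

# Hand the figure over to Matplotlib to display it
plt.show()
```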


Note that in the code chunk above you work with a built-in Seaborn data set and you create a factorplot with it (this function is named catplot() as of Seaborn 0.9). A factorplot is a categorical plot, which in this case is a bar plot. That’s because you have set the kind argument to "bar". Also, you set which colors should be displayed with the palette argument and you switch the legend off by setting legend to False.

How To Use Seaborn With Matplotlib Defaults

As you read in the introduction, users might not find the Matplotlib defaults as pleasing as the Seaborn defaults. However, there are also many questions in the opposite direction, namely, from those who use Seaborn but want to plot with Matplotlib defaults.

Before, you could solve this question by importing the apionly module from the Seaborn package. This is now deprecated (since July 2017). The default style is no longer applied when Seaborn is imported, so you’ll need to explicitly call set() or one or more of set_style(), set_context(), and set_palette() to get either Seaborn or Matplotlib defaults for plotting.
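As a rough sketch of that switch, the snippet below opts in to the Seaborn look with set() and then goes back to Matplotlib's own defaults. Going back with reset_defaults() is my choice for this illustration, not something the text above prescribes; the axes background color is only used as a convenient setting to inspect.

```python
import matplotlib as mpl
import seaborn as sns

# Explicitly opt in to the Seaborn defaults (darkgrid style, deep palette, ...)
sns.set()
seaborn_face = mpl.rcParams["axes.facecolor"]

# Restore every rc parameter to Matplotlib's built-in defaults
sns.reset_defaults()
matplotlib_face = mpl.rcParams["axes.facecolor"]

print(seaborn_face, matplotlib_face)
```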


How To Use Seaborn’s Colors As A colormap in Matplotlib?

Besides using Seaborn with Matplotlib defaults, there are also questions on how to bring Seaborn colors into Matplotlib plots. You can make use of color_palette() to define the color map that you want to use, and set the number of colors with the argument n_colors. In this case, the example will assume that there are 5 labels assigned to the data points that are defined in data1 and data2, so that’s why you pass 5 to this argument. You also make a list of length N, stored in the variable colors, that assigns one of those 5 integer labels to each point.
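A minimal sketch of this idea could look as follows. The names data1, data2, colors and N come from the description above; the random data, the "muted" palette and the use of ListedColormap to wrap the palette are assumptions made for the sake of a runnable example.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from matplotlib.colors import ListedColormap

N = 500
rng = np.random.default_rng(0)

# Hypothetical data points, each carrying one of 5 integer labels (0 to 4)
data1 = rng.normal(size=N)
data2 = rng.normal(size=N)
colors = rng.integers(0, 5, size=N)

# Take 5 colors from a Seaborn palette and wrap them in a Matplotlib colormap
palette = sns.color_palette("muted", n_colors=5)
cmap = ListedColormap(list(palette))

# Plain Matplotlib scatter plot that is colored with the Seaborn palette
fig, ax = plt.subplots()
points = ax.scatter(data1, data2, c=colors, cmap=cmap)
fig.colorbar(points)
```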


Tip: do you need to revise NumPy? Consider this NumPy Tutorial or the NumPy cheat sheet.

How To Scale Seaborn Plots For Other Contexts

If you need your plots for talks, posters, on paper or in notebooks, you might want to have larger or smaller plots. Seaborn has got you covered on this. You can make use of set_context() to control the plot elements:
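Here is a small sketch of how that can look, assuming a made-up DataFrame in place of a built-in data set. It first resets to the Seaborn defaults with set() and then compares the font size that two of the contexts put in place.

```python
import matplotlib as mpl
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({"total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 8.77]})

# Reset to the default Seaborn parameters first
sns.set()

# "paper" shrinks the plot elements ...
sns.set_context("paper")
paper_font = mpl.rcParams["font.size"]

# ... while "poster" enlarges them
sns.set_context("poster")
poster_font = mpl.rcParams["font.size"]

# This plot is drawn with the poster-sized elements
ax = sns.boxplot(x="total_bill", data=bills)
print(paper_font, poster_font)
```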


The four predefined contexts are "paper", "notebook", "talk" and "poster". Tip: try switching from one context to another to study the effect of the contexts on the plot.

You can also pass more arguments to set_context() to scale more plot elements, such as font_scale or more parameter mappings that can override the values that are preset in the Seaborn context dictionaries. In the following code chunk, you overwrite the values that are set for the parameters font.size and axes.labelsize:
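As an illustration, the sketch below passes both font_scale and an rc dictionary. The concrete values 1.5, 8 and 5 are arbitrary picks for this example; the point is that the rc entries win over the preset and scaled values.

```python
import matplotlib as mpl
import seaborn as sns

# Scale the notebook context, then override two parameters explicitly
sns.set_context("notebook", font_scale=1.5,
                rc={"font.size": 8, "axes.labelsize": 5})

font_size = mpl.rcParams["font.size"]
label_size = mpl.rcParams["axes.labelsize"]
print(font_size, label_size)
```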


Note that in the first code chunk, you first did a reset to get the default Seaborn parameters back. You did this by calling set(). This is extremely handy if you have experimented with setting other parameters before, such as the plot style.

Additionally, it’s good to keep in mind that you can use the higher-level set() function instead of set_context() to adjust other plot elements:
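A short sketch of that alternative, with arbitrary example values: set() takes the style, the context and an rc dictionary in a single call.

```python
import matplotlib as mpl
import seaborn as sns

# One call sets the style, the context and individual rc parameters
sns.set(style="whitegrid", context="notebook",
        rc={"font.size": 8, "axes.labelsize": 5})

label_size = mpl.rcParams["axes.labelsize"]
grid_on = mpl.rcParams["axes.grid"]
print(label_size, grid_on)
```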


One of the hardest things about data visualizations is customizing the graphs further until they meet your expectations and this stays the same when you’re working with Seaborn. That’s why it’s good to keep in mind the anatomy of the Matplotlib plot and also what this means for the Seaborn library.

As for Seaborn, you have two types of functions: axes-level functions and figure-level functions. The ones that operate on the Axes level are, for example, regplot(), boxplot(), kdeplot(), …, while the functions that operate on the Figure level are lmplot(), factorplot(), jointplot() and a couple of others.

This means that the first group is identified by taking an explicit ax argument and returning an Axes object, while the second group of functions create plots that potentially include Axes which are always organized in a “meaningful” way. The Figure-level functions will therefore need to have total control over the figure so you won’t be able to plot an lmplot onto one that already exists. When you call the Figure-level functions, you always initialize a figure and set it up for the specific plot it’s drawing.

You can easily see this when you make a boxplot and an lmplot, for example:

>>> sns.boxplot(x="total_bill", data=tips)
<matplotlib.axes._subplots.AxesSubplot object at 0x117e8da20>

>>> sns.lmplot('x', 'y', data, size=7, truncate=True, scatter_kws={"s": 100})
<seaborn.axisgrid.FacetGrid object at 0x11fa03438>

You see that, once you’ve called lmplot(), it returns an object of the type FacetGrid. This object has some methods for operating on the resulting plot that know a bit about the structure of the plot. It also exposes the underlying figure and array of axes through the FacetGrid.fig and FacetGrid.axes attributes.
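The difference can be sketched with a made-up DataFrame: regplot() draws onto the Axes you hand it, while lmplot() sets up a figure of its own.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12],
})

# Axes-level: draws on an existing Axes and returns that same Axes
fig, (ax_left, ax_right) = plt.subplots(1, 2)
returned = sns.regplot(x="total_bill", y="tip", data=bills, ax=ax_left)

# Figure-level: creates its own figure and returns a FacetGrid
grid = sns.lmplot(x="total_bill", y="tip", data=bills)

print(returned is ax_left)   # regplot() gave back the Axes you passed in
print(grid.fig is fig)       # lmplot() did not reuse your figure
print(grid.axes.shape)       # 2D array of Axes, one per facet
```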

When you’re customizing your plots, this means that you will prefer to make customizations to your regression plot that you constructed with regplot() on Axes level, while you will make customizations for lmplot() on Figure level.

Let’s see how this works in practice by covering some of the following, most frequently asked questions:

How To Temporarily Set The Plot Style

You can use axes_style() in a with statement to temporarily set the plot style. This, in addition to the use of plt.subplot(), will allow you to make figures that have differently-styled axes, like in the example below:
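A minimal sketch, assuming a made-up DataFrame and Matplotlib defaults outside the with block: only the subplot that is created inside the block picks up the temporary style.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({"total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 8.77]})

fig = plt.figure()

# The style only applies to Axes that are created inside the with block
with sns.axes_style("darkgrid"):
    ax1 = plt.subplot(211)
    sns.violinplot(x="total_bill", data=bills, ax=ax1)

# Created outside the block, so this subplot keeps the current defaults
ax2 = plt.subplot(212)
sns.boxplot(x="total_bill", data=bills, ax=ax2)
```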


How To Set The Figure Size in Seaborn

For axes level functions, you can make use of the plt.subplots() function to which you pass the figsize argument.
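For instance, with a made-up DataFrame and an arbitrary size of 10 by 4 inches, this could look like the sketch below.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({"total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 8.77]})

# Create the figure with the size you want (width, height in inches) ...
fig, ax = plt.subplots(figsize=(10, 4))

# ... and tell the Axes-level function to draw on its Axes
sns.boxplot(x="total_bill", data=bills, ax=ax)

width, height = fig.get_size_inches()
```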


For Figure-level functions, you rely on two parameters to set the Figure size, namely, size (renamed to height in Seaborn 0.9) and aspect:
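The sketch below assumes a recent Seaborn release, where the size parameter is called height and factorplot() is called catplot(); the data is made up. Each facet is height inches tall and height * aspect inches wide.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"],
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88],
})

# One facet that is 4 inches tall and 4 * 2 = 8 inches wide
g = sns.catplot(x="day", y="total_bill", data=bills,
                kind="box", height=4, aspect=2)

width, height = g.fig.get_size_inches()
```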


How To Rotate Label Text in Seaborn

To rotate the label text in a Seaborn plot, you will need to work on the Figure level. Note that in the code chunk below, you make use of one of the FacetGrid methods, namely, set_xticklabels, to rotate the text label:
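Since the original chunk is missing here, this is only a sketch with made-up data, again assuming the catplot() name from recent Seaborn releases; the 30 degree angle is an arbitrary choice.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"],
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88],
})

g = sns.catplot(x="day", y="total_bill", data=bills, kind="bar")

# FacetGrid method: rotate the x tick labels on every Axes in the grid
g.set_xticklabels(rotation=30)

rotations = [label.get_rotation() for label in g.axes.flat[0].get_xticklabels()]
```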


How To Set xlim or ylim in Seaborn

For a boxplot, which works at the Axes level, you’ll need to make sure to assign your boxplot to a variable ax, which will be a matplotlib.axes._subplots.AxesSubplot object, as you saw above. With the object at Axes level, you can make use of the set() function to set xlim, ylim,… Just like in the following example:
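A minimal sketch with a made-up DataFrame:

```python
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({"total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 8.77]})

# Keep the Axes that boxplot() returns ...
ax = sns.boxplot(x="total_bill", data=bills)

# ... and set the limits of the x-axis on it
ax.set(xlim=(10, 100))

limits = ax.get_xlim()
```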


Note that alternatively, you could have also used ax.set_xlim(10,100) to limit the x-axis.

Now, for functions at Figure-level, you can access the Axes objects with the help of the axes attribute of the FacetGrid. Let’s see how you can use the axes attribute to your advantage to set the xlim and ylim properties:
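Here is a sketch with made-up data and arbitrary limits. Note that the Axes come back as a two-dimensional array, so a plot without facets has its single Axes at position [0, 0].

```python
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12],
})

g = sns.lmplot(x="total_bill", y="tip", data=bills)

# Pick the only Axes out of the grid and set the limits on it
ax = g.axes[0, 0]
ax.set_xlim(0, 60)
ax.set_ylim(0, 12)
```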


Likewise, FacetGrid exposes the underlying figure with the help of the fig attribute.


How To Set Log Scale

You can modify the scale of your axes to better show trends. That’s why it might be useful in some cases to use the logarithmic scale on one or both axes. For a simple regression with regplot(), you can set the scale with the help of the Axes object.
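As a sketch, with made-up data that spans several orders of magnitude: you create the Axes yourself, switch both of its axes to a log scale and then let regplot() draw on it. Leaving out the confidence band with ci=None is a simplification chosen for this example.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical, strictly positive data spanning several orders of magnitude
data = pd.DataFrame({
    "x": [1, 10, 100, 1000, 10000],
    "y": [2, 15, 180, 1500, 21000],
})

# Set the scales on the Axes object, then draw the regression plot on it
fig, ax = plt.subplots(figsize=(7, 7))
ax.set(xscale="log", yscale="log")
sns.regplot(x="x", y="y", data=data, ax=ax, ci=None, scatter_kws={"s": 100})
```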


When you’re working with Figure level functions, you can set the xscale and yscale properties with the help of the set() method of the FacetGrid object:
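The same idea on the Figure level, sketched with the same kind of made-up data: set() on the FacetGrid forwards the properties to every Axes in the grid.

```python
import pandas as pd
import seaborn as sns

# Hypothetical, strictly positive data spanning several orders of magnitude
data = pd.DataFrame({
    "x": [1, 10, 100, 1000, 10000],
    "y": [2, 15, 180, 1500, 21000],
})

g = sns.lmplot(x="x", y="y", data=data, ci=None)

# Applied to each Axes that the FacetGrid manages
g.set(xscale="log", yscale="log")

scales = (g.axes[0, 0].get_xscale(), g.axes[0, 0].get_yscale())
```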


How To Add A Title

To add titles to your Seaborn plots, you basically follow the same procedure as you have done in the previous sections. For Axes-level functions, you’ll adjust the title on the Axes level itself with the help of set_title(). Just pass in the title that you want to see appear:
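A minimal sketch, with a made-up DataFrame and title text:

```python
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({"total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 8.77]})

ax = sns.boxplot(x="total_bill", data=bills)

# The title is a property of the Axes
ax.set_title("Total bill")
```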


For Figure-level functions, you can go via fig, just like in the factorplot that you have made in one of the previous sections, or you can also work via the Axes:
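Both routes are sketched below with made-up data and title texts, again assuming the catplot() name from recent Seaborn releases: suptitle() puts a title on the figure as a whole, while set_title() works on one Axes of the grid.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for a real data set
bills = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"],
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88],
})

g = sns.catplot(x="day", y="total_bill", data=bills, kind="bar")

# Via the figure: one title for the whole figure
figure_title = g.fig.suptitle("Average bill per day")

# Via the Axes: a title for a single Axes in the grid
g.axes[0, 0].set_title("All customers")
```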


Data Visualization in Python

Congratulations! You have finished this Seaborn tutorial for beginners.

If you are interested in interactive visualizations, check out DataCamp’s Interactive Data Visualization with Bokeh course! In this course, you’ll learn how to create diverse, rich, data-driven, and interactive visualizations with Bryan Van de Ven, developer of Bokeh and software engineer at Continuum Analytics. While you’re at it, also don’t miss out on DataCamp’s Bokeh cheat sheet.

