Storytelling for More Impactful Data Science

The ability to tell a high-quality story about data insights is hugely important for data scientists to thrive in an organization. Find out how to leverage storytelling to scale the impact of data science in your organization.
Apr 2021  · 7 min read

The ability to tell a high-quality story about an analysis or model is a key driver for adopting the solutions data scientists develop. In a recent webinar, Gert De Geyter, machine learning lead at Deloitte, and Bhavya Dwivedi, data scientist at Deloitte, discuss the importance of storytelling for data science and how to do it effectively.

Storytelling Enables More Impactful Data Science

Data scientists investigate, analyze, and communicate data insights in the form of descriptive, predictive, and prescriptive analytics. Broadly, there are four competency areas required to become an effective data scientist:

  1. Data analysis
  2. Machine learning and statistics
  3. Big data technologies
  4. Storytelling and communication

Data scientists with formal education are heavily exposed to the first three but do not receive much education in communicating effectively. In the webinar, Gert mentions that, based on a rough analysis of data science and AI conferences from 2018 to 2020, just 1% of them discussed storytelling.

Despite this underrepresentation of data science storytelling in education, data scientists must learn these skills if they want to become effective in their careers. In Building the AI-Powered Organization, researchers found that 89% of companies with successful A.I. implementations that participated in their survey spent more than half of their analytics budgets on adoption activities, which included communication and training.

Data literacy plays an important part in aligning data science with non-technical stakeholders. This explains why organizations are spending so much on communication and training. Accenture found that 84% of business executives believe A.I. is required to achieve growth objectives. However, many executives do not have formal education in this field, further increasing the need for improved data literacy and effective communication throughout the organization.

As AI becomes increasingly prevalent, understanding data and communicating it correctly to the audience is essential to driving value in business settings with technical and non-technical stakeholders. Accurately describing the potential for A.I. to drive business outcomes in a specific context is crucial if stakeholders are to make correct decisions that lead to positive business outcomes. A very common pitfall is to exaggerate that potential.

Communication is necessary for scaling the impact of data science in an organization, because end-user acceptance and alignment between executives and data teams are essential. The most effective form of communication is storytelling. Storytelling helps the audience connect with the presenter and remember the content, a connection that is supported by neural coupling theory. Stories also play an important role in how we remember the past. Together, these effects lead to a better understanding of how a model or analysis can impact the organization.

Understand Your Audience

The first step to effective storytelling is understanding the audience. A great story needs to be accessible to the people you are communicating with. A strong understanding of the audience’s data fluency, goals for the organization, and relationship to you and the organization should be heavily integrated into the story.

Answering these questions about the audience's data fluency, goals, and relationship to you will help inform how best to tell a story that drives the intended outcomes from the presentation. The answers help us as presenters know what to emphasize, how to communicate so that the audience understands the content, and what types of information will lead to the desired outcome, based on what motivates the audience in making decisions. When the audience is very broad, it is important to identify the specific people in the audience the presentation is intended to reach.

Understanding the audience helps inform the order in which the story is told. There are three key parts of effective storytelling with data:

  1. The goal and message
  2. Context
  3. Arguments and flow

Generally, technical stakeholders prefer the story to be told starting with the arguments and ending with the conclusion. Decision-makers prefer the message to start with the recommendation or conclusion and end with the arguments and support.

Make the Story Transparent and Actionable

Understanding what story is being told is also important. IBM found that 68% of business leaders think that customers will want improved explanations from A.I. in the next three years. Ease of explanation is heavily impacted by the decision to use a black box or white box model.

Black box models typically perform better but are significantly harder, if not impossible, to interpret. White box models are simpler and typically do not perform as well but are a lot easier to explain. When possible, using white box models will lead to better stories. However, some high-dimensional problems cannot be modeled accurately using white box techniques. Thus, the business context is important in making this decision.
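To make the idea of an explainable model concrete, here is a minimal sketch of about the simplest white box model there is: a one-variable least-squares line whose coefficients translate directly into a sentence a stakeholder can read. The data (advertising spend versus sales) and the function names `fit_line` and `explain` are hypothetical and exist only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def explain(slope, intercept, feature, target):
    """Turn the two fitted coefficients into a plain-language sentence."""
    direction = "higher" if slope >= 0 else "lower"
    return (
        f"Each extra unit of {feature} is associated with {abs(slope):.1f} "
        f"{direction} predicted {target}; with no {feature}, the model "
        f"predicts {intercept:.1f}."
    )


# Hypothetical data, invented for this example (both in thousands).
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]

slope, intercept = fit_line(ad_spend, sales)
print(explain(slope, intercept, "ad spend", "sales"))
```

The whole model fits in one sentence, which is what makes it easy to build a story around. A black box model offers no such direct reading of its parameters. Note that the sentence is deliberately phrased as an association, not as a causal claim.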

Accurately describing what a model’s output represents is also very important. Many models describe correlations between the response and predictors, not causal relationships. Describing this distinction to non-technical stakeholders is critical to making the correct data-driven decision that will create value for an organization. A model’s prediction does not necessarily represent the best possible business decision, and this needs to be included in an effective story. If there is a causal relationship, this must be accurately portrayed as well.
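The difference between correlation and causation can be shown to stakeholders with a small simulation. The sketch below assumes a made-up scenario in which a hidden factor (temperature) drives two otherwise unrelated quantities (ice cream sales and pool visits). All variable names and numbers are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hidden confounder: daily temperature drives BOTH observed quantities.
temperature = [rng.uniform(10, 35) for _ in range(1000)]
ice_cream_sales = [2.0 * t + rng.gauss(0, 3) for t in temperature]
pool_visits = [5.0 * t + rng.gauss(0, 8) for t in temperature]

# Observational data: the two series move together.
observed_r = pearson(ice_cream_sales, pool_visits)

# Intervention: set ice cream sales independently of temperature
# (say, through a promotion). Pool visits still depend on temperature only.
forced_sales = [rng.uniform(20, 70) for _ in temperature]
intervened_r = pearson(forced_sales, pool_visits)

print(f"correlation in observed data:  {observed_r:.2f}")
print(f"correlation after intervening: {intervened_r:.2f}")
```

In the observed data the correlation is strong, yet changing ice cream sales directly does nothing to pool visits. A model trained on the observed data would predict well while still being a poor guide for that decision.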

How to Tell an Effective Story

Now that we understand the key considerations about the audience and content of the story, we can effectively tell the story. While there are entire books on this topic, we will cover the basics in this section.

The first thing to consider is what format the presentation will be in: written or live presentation. Written presentations give the presenter less control over how the audience perceives the information while presenters in a live demonstration can point the audience in the intended direction verbally. Understanding how people’s attention typically works clarifies the best placement for content on the page to guide the intended experience.

In her book Storytelling with Data, Cole Nussbaumer Knaflic emphasizes the importance of minimalism in visualizations. Removing the background and unnecessary labels and using color for emphasis help us tell significantly clearer stories by directing our audience’s attention to what matters. In an example from her book, improving the default Excel graph makes it immediately clear that after two employees quit, the organization was receiving more volume than it could process. This story helps HR understand that two more employees need to be hired and the exact reason for this. The visualization tells a powerful story that helps drive a data-driven business decision with very little cognitive effort for the decision-maker.

This is a powerful example that demonstrates what good storytelling looks like. We can significantly change the way someone approaches our visualization through color, annotations, and graph selection.
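A decluttered chart of this kind might be sketched as follows. This is not a reproduction of the figure from the book: the monthly ticket counts are invented, the helper names are hypothetical, and the sketch assumes matplotlib is installed (it is imported only inside `plot_story`).

```python
# Hypothetical monthly ticket counts, invented purely for illustration.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
RECEIVED = [150, 160, 155, 165, 170, 180, 175, 190, 185, 180, 175, 185]
PROCESSED = [150, 158, 155, 163, 168, 150, 140, 145, 140, 135, 140, 145]


def backlog(received, processed):
    """Cumulative count of tickets received but not yet processed."""
    total, out = 0, []
    for r, p in zip(received, processed):
        total += r - p
        out.append(total)
    return out


def first_gap_month(received, processed, threshold=10):
    """Index of the first month where received exceeds processed by more than threshold."""
    for i, (r, p) in enumerate(zip(received, processed)):
        if r - p > threshold:
            return i
    return None


def plot_story(path="tickets.png"):
    """Draw a minimal chart: grey context line, one emphasized line, one annotation."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    x = list(range(len(MONTHS)))
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(x, RECEIVED, color="grey", linewidth=2)        # context
    ax.plot(x, PROCESSED, color="tab:blue", linewidth=3)   # emphasis

    # Remove chart junk: no top/right border, no legend box.
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    ax.set_xticks(x)
    ax.set_xticklabels(MONTHS)
    ax.set_ylabel("Tickets per month")

    # Label the lines directly instead of using a legend.
    ax.text(x[-1] + 0.2, RECEIVED[-1], "Received", color="grey", va="center")
    ax.text(x[-1] + 0.2, PROCESSED[-1], "Processed", color="tab:blue", va="center")

    # Annotate the moment the story turns.
    gap = first_gap_month(RECEIVED, PROCESSED)
    if gap is not None:
        ax.annotate("Two employees quit in May",
                    xy=(gap, PROCESSED[gap]),
                    xytext=(gap + 0.5, min(PROCESSED) - 15),
                    arrowprops=dict(arrowstyle="->", color="grey"))
    ax.set_title("Ticket volume now exceeds what the team can process", loc="left")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
    return path
```

Calling `plot_story()` writes the chart to `tickets.png`. The design choices mirror the points above: grey for context, a single color for the series that matters, direct labels instead of a legend, and a title that states the conclusion. `backlog(RECEIVED, PROCESSED)` gives the running shortfall if the story needs a single headline number.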

Make Data Science Accessible in Organizations

Many large-scale executive surveys show how much emphasis organizations are placing on adopting data science and artificial intelligence as soon as possible. Making these systems accessible begins with improved data literacy in organizations. Data literacy allows non-technical stakeholders to easily access data and make data-driven decisions that create value. The most effective way to present this type of information is through stories that take the audience and context into account while making design decisions that draw attention to the correct places. If you want to dive deeper into best practices for effective storytelling for data science, make sure to watch the full webinar recording.
