How to Build a Winning Data Team

Your organization is generating countless data points, and this data hides many actionable insights that can improve your company’s performance. Learn how to build a winning data-driven team that leverages the right data science skills.
Jul 2021  · 8 min read

Organizations Require Winning Data Teams

Today’s organizations are generating more data than ever before. According to Forbes, more than 2.5 quintillion bytes of data were generated every day in 2018, and over 90 percent of the world’s data had been generated in the two years before the article was published. Country-level estimates are just as striking. In 2018, CNBC reported that China generated 7.6 zettabytes (ZB) of data, a figure expected to grow to 48.6 ZB in 2025. The U.S. generates a similar amount of data.

This data contains many actionable insights that go unused. According to Forrester, 60 to 73 percent of an organization’s data is not leveraged in analytics.

Successful, data-driven organizations are leveraging data at scale to generate value. For example, Uber invested heavily in a platform that efficiently delivers more than 100 petabytes of data to its data teams through a simple interface, and it has scaled to deliver over 1 billion Uber Eats orders covering more than 24 million miles. Netflix uses A/B testing to choose the picture associated with a movie or TV show, which generates 20 to 30 percent more views.

There are clearly high-value insights available to organizations that can successfully navigate their complex and large data landscapes. This type of value generation can only be achieved through a high-performing and comprehensive data team.

Data Roles for a Winning Data Team

Many key roles go into building a successful data team. In a white paper, DataCamp describes eight roles, or personas, that can be found in any data-driven organization. While job titles may differ from one organization to another, we’ll outline five of the roles that make up a strong data team: business analysts, data analysts, data scientists, machine learning scientists, and data engineers.

Business Analyst

Business Analysts use data insights to increase profitability and efficiency. They supplement their deep knowledge of the business domain with data analysis and visualization skills, and they report their insights to data consumers.

Key Skills: Data Manipulation, Data Visualization, Reporting, Basic Statistics

Tools: Spreadsheets (Excel, Google Sheets), Business Intelligence tools (Tableau, PowerBI), SQL

Course Recommendations: Introduction to SQL, Data Analysis in Spreadsheets / Excel, Marketing Analytics in Spreadsheets, Financial Analytics in Spreadsheets

Data Analyst

Data Analysts play a similar role to business analysts: they analyze data and draw insights from it to drive business outcomes, so the skills of the two roles overlap. However, data analysts tackle less well-defined problems that require a deeper understanding of the data analysis workflow, and they use a combination of coding and non-coding tools.

Key Skills: Data Manipulation, Data Visualization, Reporting, Importing and Cleaning Data, Probability, and Statistics

Tools: R or Python, Spreadsheets (Excel, Google Sheets), Business Intelligence tools (Tableau, PowerBI), SQL

Course Recommendations: Data Analyst Career Track (R - 16 courses/Python - 16 courses), Cleaning Data in R/Python, Time Series Analysis in SQL

Data Scientist

Data Scientists play a significantly more technical role in organizations, working mostly with coding tools to investigate, extract, and produce insights and value with data. Data scientists require a strong understanding of data analysis and machine learning workflows and the ability to work with non-standard data types and big data tools.

Key Skills: Data Manipulation, Data Visualization, Reporting, Importing and Cleaning Data, Probability and Statistics, Machine Learning

Tools: R, Python, Scala, Big data tools (Airflow, Spark), SQL, Command-line tools (Git, Shell)

Course Recommendations: Data Scientist Career Track (R - 22 courses/Python - 23 courses), Sentiment Analysis in Python, Introduction to Git

Machine Learning Scientist

Machine Learning Scientists are responsible for developing machine learning systems at scale. They derive predictions from data using machine learning models of all types to solve problems like predicting churn and customer lifetime value, and are responsible for deploying these models for the organization to use.

Key Skills: Data Manipulation, Data Visualization, Importing and Cleaning Data, Probability and Statistics, Machine Learning, Data Engineering

Tools: R, Python, Scala, Big data tools (Airflow, Spark), SQL, Command-line tools (Git, Shell)

Course Recommendations: Machine Learning Scientists Career Track (R - 14 courses/Python - 23 courses), Image Processing in Python, Machine Learning with PySpark

Data Engineer

Data engineers are responsible for creating data pipelines that help organizations get the correct data to the right people. They combine large amounts of data from different sources into one centralized location, enabling the various data roles to work with clean, relevant, compliant, and actionable data.

Key Skills: Data Manipulation, Importing and Cleaning Data, Data Engineering, Advanced Programming

Tools: Python, Scala, Big data tools (Airflow, Spark), SQL, Command-line tools (Git, Shell), Cloud Platforms (e.g., AWS)

Course Recommendations: Data Engineer with Python Career Track (25 courses), Introduction to Airflow in Python, Streaming Data with AWS Kinesis and Lambda

How to Create a Winning Data Team

Case Study

With a high-level understanding of each role’s responsibilities, let’s examine how these roles might interact in a business setting to drive value. We’ll use a real-world example in which a data team extracts value from a customer churn model.

In this context, data engineers ensure that data scientists and machine learning scientists have access to the high-quality data they need to develop and operationalize the model. They ensure the quality of the data and enforce the correct permissions for each dataset. They also deliver the data in an easily accessible form, with the metadata and variables needed for an effective analysis.
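To make the data quality step concrete, here is a minimal sketch of the kind of check a data engineer might run before handing customer data to the modeling team. The field names (customer_id, monthly_spend, months_active) and the rules are assumptions made up for illustration; a real pipeline would define its own schema and would typically rely on a dedicated validation tool.

```python
# Hypothetical required fields for one customer record (illustrative only).
REQUIRED_FIELDS = ("customer_id", "monthly_spend", "months_active")


def find_quality_issues(rows):
    """Return (row_index, problem) pairs for a batch of customer records."""
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Flag any required field that is absent or empty.
        for field in REQUIRED_FIELDS:
            if row.get(field) is None:
                issues.append((i, f"missing {field}"))
        # Flag customer IDs that appear more than once.
        cid = row.get("customer_id")
        if cid is not None:
            if cid in seen_ids:
                issues.append((i, "duplicate customer_id"))
            seen_ids.add(cid)
        # Flag values that are impossible for the business.
        spend = row.get("monthly_spend")
        if spend is not None and spend < 0:
            issues.append((i, "negative monthly_spend"))
    return issues


# Made-up sample batch with three deliberate problems.
rows = [
    {"customer_id": 1, "monthly_spend": 42.0, "months_active": 12},
    {"customer_id": 2, "monthly_spend": None, "months_active": 3},
    {"customer_id": 2, "monthly_spend": -5.0, "months_active": 7},
]
issues = find_quality_issues(rows)
for index, problem in issues:
    print(f"row {index}: {problem}")
```

Returning a list of issues, rather than failing on the first one, lets the team see the overall state of a batch before deciding whether to fix, filter, or reject it.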

Next, data scientists and machine learning scientists work together to build a model that predicts customer churn. They need to ensure the model is accurate, interpretable, and deployable within a business process, and that it remains accurate on unseen data once it is deployed.
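The idea of checking accuracy on unseen data can be illustrated with a deliberately simple sketch. It assumes a single made-up feature (days since a customer’s last order) and a one-threshold rule as the model; real churn models use many features and established machine learning libraries, so treat this only as an illustration of fitting on one set of customers and evaluating on a held-out set.

```python
def fit_threshold(train):
    """Pick the inactivity threshold that maximizes training accuracy.

    Each example is (days_since_last_order, churned). The rule predicts
    churn when days_since_last_order >= threshold.
    """
    candidates = sorted({days for days, _ in train})
    best_threshold, best_correct = candidates[0], -1
    for t in candidates:
        correct = sum((days >= t) == churned for days, churned in train)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


def accuracy(threshold, examples):
    """Share of examples for which the threshold rule is correct."""
    correct = sum((days >= threshold) == churned for days, churned in examples)
    return correct / len(examples)


# Made-up data: the model is fit on `train` and never sees `holdout`.
train = [(3, False), (10, False), (25, False), (40, True), (55, True), (70, True)]
holdout = [(5, False), (30, False), (38, True), (60, True)]

threshold = fit_threshold(train)
train_accuracy = accuracy(threshold, train)
holdout_accuracy = accuracy(threshold, holdout)
print(threshold, train_accuracy, holdout_accuracy)  # 40 1.0 0.75
```

The gap between the training and held-out scores is the point: a model that looks perfect on the data it was fit to can still miss customers it has not seen, which is why the team keeps monitoring accuracy after deployment.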

Finally, data analysts and business analysts work together to turn the model’s outputs into decisions that drive business value. They can allocate marketing spend based on which customers are more likely to churn, and they can give decision makers fact-based reasoning for why a certain customer segment is more likely to churn.
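As a rough sketch of how churn scores could feed a spending decision, the snippet below splits a retention budget across at-risk customers in proportion to their predicted churn probability. The 0.5 risk cutoff, the proportional rule, and the sample scores are all assumptions for illustration; an actual allocation would also weigh factors such as customer value and campaign cost.

```python
def allocate_retention_budget(churn_scores, budget, min_risk=0.5):
    """Split a budget across at-risk customers in proportion to churn risk.

    churn_scores maps a customer ID to a predicted churn probability.
    Customers below min_risk receive nothing.
    """
    at_risk = {cid: p for cid, p in churn_scores.items() if p >= min_risk}
    total_risk = sum(at_risk.values())
    if total_risk == 0:
        return {}  # No customer meets the cutoff, so nothing is allocated.
    return {cid: budget * p / total_risk for cid, p in at_risk.items()}


# Made-up churn probabilities for three customers.
scores = {"A": 0.75, "B": 0.25, "C": 0.5}
allocation = allocate_retention_budget(scores, budget=1000.0)
print(allocation)
```

Here customer B falls below the cutoff and gets no budget, while A receives more than C because its predicted risk is higher.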

As in most data projects, every role plays a part in extracting value from data.

The Path Towards an Effective Data Team Starts with Your People

Despite the recent decrease in hiring caused by the pandemic, data positions remain in high demand, ranking twice in the top three of LinkedIn’s annual emerging jobs report. At the end of 2020, Deloitte reported that 23% of organizations had a major or extreme gap between their AI needs and their current abilities. In 2021, Deloitte published a report arguing that, to address this shortage of data talent, organizations must practice selective hiring and targeted upskilling.

Given the data talent shortage, organizations must combine selective hiring with upskilling to build highly skilled data teams that create value. They can do this by providing their people with personalized learning paths tailored to these roles. At DataCamp, we provide learners with personalized learning journeys based on their skill set and desired learning outcomes, giving them the tools to assess, learn, practice, and apply new data skills. With DataCamp for Business, organizations can use custom learning tracks to tailor learning programs to their specific goals and challenges. Learn more about how you can transform your talent with DataCamp.

DataCamp for Business provides an interactive learning platform for companies that need to upskill and reskill their people on data skills. Its topics range from data literacy and data science to data engineering and machine learning, and over 1,600 companies trust DataCamp for Business to upskill their talent.
