How we switched to Workspace for our internal analytics

In this article, we will explore why and how Workspace has become the tool of choice for internal analytics at DataCamp.
Oct 2022  · 8 min read

When DataCamp Workspace first launched, we worked hard to ensure that it was the best possible experience for performing data science and data analytics work. This meant building a fast, lightweight editor, enabling real-time collaboration, and producing professional read-only reports.

At the same time, we relied on third-party tools for performing our internal analyses. Product metrics such as the number of users using Workspace and time spent on the platform were tracked using other tools. We realized that if we wanted to be serious about Workspace as a tool for professional work, we would need to use it ourselves.

This article will dive into how we conducted internal analytics in the past and how we transitioned to Workspace. It will also cover the advantages and learnings we gained from the transition.

Our pre-Workspace analytics

Before Workspace, our primary tool for internal analytics on the product was Metabase. Metabase is a business intelligence tool connected to a data lake containing information on Workspace content and activity. With Metabase, we could quickly run SQL queries, generate reports, and even create dynamic dashboards.

Metabase was used widely across the team. We used it to monitor how engaged users were with different types of content, how much time our users spent in Workspace, and how many users were interacting with Workspace on a daily, weekly, and monthly basis. 

At this point, we rarely used Workspace for our own analytics. It only happened when we bumped into the limits of SQL or Metabase’s no-code visualization capabilities. In these instances, we would export the raw data, manually add it to a new workspace, and then continue the analysis in Python. However, this process was cumbersome, and every data update meant repeating the export by hand.

Why we transitioned to Workspace

As Workspace has grown, the questions we want to answer about our users and their time on Workspace have become increasingly complex. As convenient as Metabase can be, many of our questions have required the additional tools provided by Python and R.

We also knew that if we wanted to provide the best possible user experience, we would need to use Workspace ourselves, a practice called “dogfooding”. Indeed, what better way to understand the limitations and frustrations of a product than to rely on it for our own work?

For these two reasons, we set a goal of moving the bulk of our analytics from Metabase to Workspace, beginning in the summer of 2022.

The transition 

The transition began when we launched SQL cells inside of Workspace. From then on, we could query the same data we accessed in Metabase, but with the query results instantly transformed into a Python or R DataFrame (depending on the language of the workspace). We could easily switch between querying our database and analyzing the results with the tools of our choice. Within weeks, we migrated most of our existing queries from Metabase into Workspace.
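To illustrate the query-then-analyze pattern outside of Workspace, here is a minimal sketch. It is not our actual setup: the `activity` table, its columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the data lake. A Workspace SQL cell hands the result over as a DataFrame; this sketch keeps plain Python rows so that it runs with the standard library alone.

```python
import sqlite3

# Hypothetical stand-in for the data lake: an in-memory SQLite database
# with a made-up `activity` table (one row per user per day).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (user_id INTEGER, day TEXT, minutes REAL)")
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?)",
    [
        (1, "2022-10-01", 30.0),
        (2, "2022-10-01", 12.5),
        (1, "2022-10-02", 45.0),
        (3, "2022-10-02", 20.0),
        (2, "2022-10-03", 5.0),
    ],
)

# Step 1: the SQL part aggregates inside the database.
query = """
    SELECT day,
           COUNT(DISTINCT user_id) AS active_users,
           SUM(minutes) AS total_minutes
    FROM activity
    GROUP BY day
    ORDER BY day
"""
rows = conn.execute(query).fetchall()

# Step 2: the Python part continues the analysis on the query result,
# here by deriving the average minutes per active user for each day.
daily = [
    {"day": day, "active_users": users, "minutes_per_user": total / users}
    for day, users, total in rows
]

for record in daily:
    print(record)
```

The point of the split is that the database does the heavy aggregation while Python handles whatever is awkward to express in SQL.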

As of the time of writing, 141 different DataCamp users have spent 30 minutes or more inside of Workspace. As the chart shows, adoption has grown steadily!

DataCamp Workspace at DataCamp

We are continually recruiting new DataCampers into Workspace!

The advantages of switching to Workspace

Since switching to Workspace, we have found several key advantages in our day-to-day work:

  1. Fully customizable visualizations: While Metabase offers many visualizations and customizable attributes, the possibilities are not endless. With Workspace, the results of our SQL queries are returned as DataFrames. This means we can use Python libraries such as Plotly to create charts with annotations, custom themes, and unique plot types.

DataCamp Workspace Usage

An example of one of the visualizations we have created inside Workspace by accessing our internal data.

  2. Advanced analytics: By combining SQL with Python or R, we can instantly analyze our data with techniques that are impractical in SQL alone. This has allowed us to do things such as segmenting our users and training classification models to predict how users are working inside of Workspace.
  3. Descriptive reporting: A dashboard is excellent for users familiar with the product. However, a dashboard can be challenging to interpret for newcomers or people outside of the team. With Workspace, we can embed written summaries throughout our reports that can be read later by relevant stakeholders.
  4. Shareable and social publications: The ability to share a link to a published workspace has been incredibly valuable. Published analyses have ended up in the hands of our CEO and other executives, who then add feedback in the form of comments and questions. Combined with the descriptive nature of published workspaces, this means management can read through analytics work at their leisure without an in-person briefing.
  5. No-code charts: Most DataCampers have SQL skills, but not everyone is as comfortable in Python or R. Chart cells enable less technical users to communicate their insights using high-quality visualizations without writing a line of code.

DataCamp Workspace Visual Cells

Our no-code chart cells in action!
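As a small illustration of the "advanced analytics" advantage listed above, the sketch below segments users by weekly usage. It is a simple rule-based segmentation, not the classification models we mention, and the user IDs, hours, and thresholds are all made up for the example.

```python
from collections import Counter

# Hypothetical per-user weekly hours, as might come back from a SQL query.
weekly_hours = {"u1": 0.5, "u2": 3.0, "u3": 12.0, "u4": 1.5, "u5": 8.0}


def segment(hours: float) -> str:
    """Assign a usage segment based on illustrative (made-up) thresholds."""
    if hours < 1:
        return "light"
    if hours < 5:
        return "regular"
    return "heavy"


# Label every user, then count how many users fall into each segment.
segments = {user: segment(hours) for user, hours in weekly_hours.items()}
segment_sizes = Counter(segments.values())

print(segments)
print(segment_sizes)
```

Logic like this is clumsy to maintain as nested SQL `CASE` expressions but takes a few lines once the query result is available in Python.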

What we’ve learned so far 

As of mid-October, DataCampers have spent a cumulative 4,132 hours inside Workspace! We have now migrated all central reporting related to quarterly targets and product usage to Workspace. Below is a summary of our key learnings and how they have shaped Workspace as a product:

  1. Speed is critical: It's annoying to wait longer than expected for your workspace to load, code to execute, or a query to return results. The experience should be at least as fast as, and ideally faster than, working in a locally running Jupyter Notebook. Our team has made considerable strides in reducing notebook and publication loading times. They have also worked to ensure that code execution speed is comparable to a local notebook.
  2. Not everyone wants to see the code: In the past, our publications would contain large cells full of complicated SQL queries and dense Python code. While this was useful for colleagues reviewing the technical aspects of a report, it often got in the way of the report's true insights. You could hide cells in publications by switching to the JupyterLab editor in Workspace, but this wasn’t reflected in the DataCamp Notebook editor. Knowing this, we have released a way to effortlessly hide code and SQL cells in your workspace. This lets you produce a beautiful read-only report you can share with any stakeholder. Our users love it.
  3. Re-running notebooks can be frustrating: For much of our tracking, we rely heavily upon daily updates to the number of users, activity levels, and feature adoption. Unlike a dynamic dashboard that refreshes when new data is available, workspaces must be re-run every time. Based on our own experiences, we are planning to support workspace execution scheduling. This will ensure that reports always contain the latest insights.
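Until scheduling is available, the re-run problem in the last point comes down to a freshness check. The sketch below shows that check under stated assumptions: `needs_rerun` is a hypothetical helper, not a Workspace feature, and both timestamps are invented for the example.

```python
from datetime import datetime


def needs_rerun(last_run: datetime, latest_data: datetime) -> bool:
    """Return True when data has arrived since the report was last executed."""
    return latest_data > last_run


# Hypothetical timestamps: the report last ran on the morning of October 14,
# and the newest data landed early on October 15.
last_run = datetime(2022, 10, 14, 9, 0)
latest_data = datetime(2022, 10, 15, 2, 30)

stale = needs_rerun(last_run, latest_data)
print("Report is stale, re-run needed:", stale)
```

A scheduler would effectively automate this comparison and trigger the re-run, so readers never see a report older than its data.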

These learnings don't include the many minor tweaks we have made to the editor since we switched to Workspace. The user interface, dashboard, and features such as chart cells are undergoing continuous improvement, driven by feedback from users, both internal and external.

Going forward

While the process of using our own product has revealed many ways in which we can improve, we know that is only half the story. In combination with feedback we’ve received internally, our team has been hard at work interviewing Workspace users. These interviews complement our findings and help us catch what we may miss through internal testing.

We also launched a user survey in late September as part of our biweekly newsletter. This survey allowed us to collect quantitative information from our users. These results are helping us identify high-priority initiatives for Workspace and understand how our product is used. And, of course, the survey data was analyzed and visualized with Workspace!

DataCamp Workspace Use Cases

The primary reason our users reported using Workspace was to try out code samples quickly. How do you use Workspace?

In the coming months, we plan to move even further towards a Workspace-exclusive analytics environment. This transition becomes easier with each new feature and improvement our engineering team makes to the product.

If you’re interested in using Workspace for your own internal data analytics and data science needs, you can read more about it here. Or better yet, jump into an empty Python or R workspace and get coding now!
