What’s one thing that successful tech companies like Amazon, Netflix, Google, and Airbnb have in common? They’ve all scaled and operationalized their data science function. This means they’ve built top-notch data teams, and they’ve procured or built their own data tools. This required designing powerful internal processes and architectures that leverage data to move their business forward.
In our webinar last month, Maksim Pecherskiy, Data Engineer at The World Bank and former Chief Data Officer of the City of San Diego, explained that these companies were well positioned to do so because they grew up with data. “Older organizations, like financial institutions, healthcare institutions, and governments, didn't have the privilege [of building their business around data]. The city of San Diego was founded in the 1800s, and they're still learning to adjust and they're still learning how to effectively align data to their operations.”
Let’s unpack how large organizations can build an effective data science function. In recent years, many organizations have hired chief data officers, built teams of data scientists and data engineers, and found a way to monetize and share data. But a NewVantage Partners survey in January 2020 revealed that although 98.8% of the 70 firms surveyed had been investing in big data and AI initiatives, only 37.8% were able to claim that they’d created a data-driven organization.
The most common barriers to becoming data-driven were people and process challenges—namely, operationalizing and building repeatable processes, creating a functional data infrastructure, and bridging the data skills gap.
The following visual summarizes what makes data usable and trustworthy:
First, data must be collected at the right time, in the right manner, and with a purpose. Then, it must be discoverable, meaning all data users across the organization must know that this data has been collected and be able to find it. Next, the data must prove to be reliable, with no gaps or inconsistencies. It must also be easily understood—for instance, it should be correctly structured and labeled. Being compliant is critical—the proper security protocols must be in place to control access to sensitive data, and some industries have regulatory standards. And lastly, data must be actionable, meaning data users have the technology, training, and ethical frameworks to use data correctly.
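As a rough illustration of the "reliable" and "easily understood" requirements above, here is a minimal Python sketch of two checks a team might run on a dataset: one for gaps in a daily series and one for labels that differ only by case or stray whitespace. The record layout, the field names (`day`, `department`), and both helper functions are hypothetical and are not from the webinar.

```python
from datetime import date, timedelta


def find_date_gaps(dates):
    """Return the dates missing between the earliest and latest observed date."""
    observed = set(dates)
    if not observed:
        return []
    start, end = min(observed), max(observed)
    all_days = (start + timedelta(days=i) for i in range((end - start).days + 1))
    return [day for day in all_days if day not in observed]


def find_inconsistent_labels(records, field):
    """Group raw values of `field` that differ only by case or surrounding whitespace."""
    variants = {}
    for record in records:
        raw = record[field]
        variants.setdefault(raw.strip().lower(), set()).add(raw)
    return {key: sorted(raws) for key, raws in variants.items() if len(raws) > 1}


# Hypothetical sample records, invented for illustration only.
records = [
    {"day": date(2020, 1, 1), "department": "Parks"},
    {"day": date(2020, 1, 2), "department": "parks "},
    {"day": date(2020, 1, 4), "department": "Transit"},
]

gaps = find_date_gaps(r["day"] for r in records)
label_issues = find_inconsistent_labels(records, "department")
print("Missing days:", gaps)
print("Inconsistent labels:", label_issues)
```

Checks like these only cover a small part of reliability. Discoverability, compliance, and actionability depend on process and tooling more than on a script.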
Challenges to becoming data-driven
There are three buckets of challenges to meeting these requirements of data: organizational, cultural, and technical.
Large organizations often have silos in which teams may have different objectives. These silos may be in place due to legal requirements, as is common in finance, or they may exist simply because teams may not have worked collaboratively in the past. Whatever the cause, silos cause miscommunication and misalignment, which harms the overall effectiveness of the organization.
A symptom of this could be managers who want to build bigger teams or the creation of more than one AI center of excellence.
Organizational data literacy, in which everyone has the data skills to be successful, is often thought of as a lofty goal, and some organizations may be too risk-averse to tackle it. This is why DataCamp tries to demystify topics like machine learning to help organizations understand how data can be of practical use. Empowering employees and making it easy to build skills is key.
Cultural challenges can also manifest in different incentives across the organization. Management may want eye-catching dashboards to present to their superiors, engineers want to use the latest and greatest tools, and business users may just want to answer a specific question or streamline processes. It’s not always possible to make every stakeholder happy. This ties into the organizational challenges mentioned previously—the goal is to align data strategy with overall business strategy.
Maksim says that large organizations often have “a patchwork of legacy systems” as the core systems of record for data. These systems can be difficult to pull data out of or to scale, but department leads often don’t want to take on a big replacement project. They’re afraid of incurring switching costs, potential project failure costs, additional retraining costs, and a short-term hit to productivity. These shortsighted concerns result from the lack of a unified data strategy.
Additional technical challenges include compliance and security standards, like HIPAA, FERPA, GDPR, and CCPA. Companies must adhere to these standards to avoid unethical data use.
Falling prey to these challenges results in what Brian Balfour describes as the Data Wheel of Death, shown below. It illustrates that data that isn’t consistently maintained becomes irrelevant or flawed, causing people to lose trust in the data and, ultimately, to use it less.
Solutions for becoming data-driven
So, how can well-established enterprises become data-driven like Netflix and Airbnb? They must rebuild around data. Here’s how.
Understand the landscape
Large organizations must understand the different silos that exist and how they communicate with each other. They must cultivate champions at the executive level who will support this journey. And they must communicate the data objectives with all stakeholders.
Identify your users
Next, to facilitate a successful data-driven transformation, you must understand your personas. Empathizing deeply with each persona—whether they’re a data consumer, leader, data analyst, or data scientist—will help you to understand how best to help them and leverage them in your mission.
Start small and keep it simple
Maksim says that it’s not possible to fix everything at once, so it’s really important to be disciplined about focusing on impact. To begin, pick a project that is visible and impactful across the organization and that doesn’t require a ton of stakeholders or coordination. Getting a series of easy wins to start with will catalyze buy-in and ensure the visibility of data projects. You’ll also be able to iterate on data work and leverage learnings for future data projects.
Align data strategy with business strategy
What does it mean to align data strategy with business strategy? Maksim’s practical advice is to choose data projects that expand and test your infrastructure, serve your business users, and align with company goals and results. Make sure every project is documented, reproducible, and follows best practices. Then, amplify your wins company-wide with executive support so you can build on those wins and keep iterating.
What being data-driven looks like
Being data-driven requires being able to measure the success of data initiatives. Maksim says it’s tempting to use easily accessible metrics like total number of projects, the number of completed datasets, or the number of repo followers.
“Those are all good metrics, but they don't really address the core of your success,” Maksim says. “I would think more about the number of decisions that get made with data. Also, how long it takes someone to find a set of data, or the time to get data. You should also look at how many goals, objectives, and OKRs are set and tracked with data, including how they’re reviewed.”
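To make these metrics concrete, here is a small Python sketch of how a team might compute three of them from its own tracking records: the number of decisions made with data, the median time to find a dataset, and the share of OKRs tracked with data. The record layouts, the field names (`used_data`, `minutes_to_find`, `tracked_with_data`), and the sample values are all hypothetical. They are not something Maksim prescribes.

```python
from statistics import median


def count_data_backed_decisions(decisions):
    """Count logged decisions that cite at least one dataset."""
    return sum(1 for decision in decisions if decision["used_data"])


def median_minutes_to_find(requests):
    """Median time, in minutes, that it took someone to locate a dataset."""
    return median(request["minutes_to_find"] for request in requests)


def share_tracked_with_data(okrs):
    """Fraction of OKRs tracked against data (0.0 when there are none)."""
    if not okrs:
        return 0.0
    return sum(1 for okr in okrs if okr["tracked_with_data"]) / len(okrs)


# Hypothetical tracking records, invented for illustration only.
decisions = [{"used_data": True}, {"used_data": False}, {"used_data": True}]
requests = [{"minutes_to_find": 5}, {"minutes_to_find": 45}, {"minutes_to_find": 20}]
okrs = [
    {"tracked_with_data": True},
    {"tracked_with_data": True},
    {"tracked_with_data": False},
    {"tracked_with_data": True},
]

data_backed = count_data_backed_decisions(decisions)
typical_minutes = median_minutes_to_find(requests)
okr_share = share_tracked_with_data(okrs)
print(data_backed, typical_minutes, okr_share)
```

The median is used here rather than the mean so that one unusually slow search does not dominate the figure. That is a design choice for this sketch, not a recommendation from the webinar.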
For more on making data operable within large organizations with complex processes, watch Maksim’s webinar on Operationalizing Data Within Large Organizations.