The Infrastructure Supporting the Data Revolution with Saad Siddiqui, General Partner at Titanium Ventures
Saad Siddiqui is a venture capitalist at Titanium Ventures. Titanium focuses on enterprise technology investments, particularly next-generation enterprise infrastructure and applications. In his career, Saad has deployed over $100M in venture capital in over a dozen companies. In previous roles as a corporate development executive, he executed M&A transactions valued at over $7 billion in aggregate. Prior to Titanium Ventures, he was in corporate development at Informatica and was a member of Cisco's venture investing and acquisitions team covering cloud, big data, and virtualization.

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.
Key Quotes
In some ways you do need to find the best solution for your data as it is today, but I think if you don't think about what's gonna happen the next five years from now, you're gonna be making platform shifts every couple of years that are really costly and just slows down the organization in a massive way.
All the data that was generated from the beginning of time till 2020, we're generating in a year now. Right? So it is quite tremendous how quickly we're collecting data across the enterprise, across functional units.
And it's not just the IT side. We're also seeing that sort of happening on the security side, where everyone's getting alerted on all these different vulnerabilities that they have. It's also happening on business performance as well, right? So every time someone clicks on a button, how it interacts with an application informs product decisions, and that means collecting all of that data. Those interactions are going to increase as the UI changes over time. What are the kinds of questions that people are asking? What are the kinds of answers we're getting? How long is someone spending with a video that is surfaced? That is being stored now more than ever before, so you almost have to have an eye for scale.
Key Takeaways
Before embarking on any data infrastructure project, conduct a thorough audit of your existing data assets to understand where data resides, how it's being used, and what gaps or redundancies exist.
Successful data infrastructure projects require collaboration across various C-suite roles, including CIO, CFO, and CISO. Align their priorities to avoid conflicts and ensure smooth implementation.
Ensure your teams stay updated on the latest data and AI technologies. This continuous learning approach will help your organization adapt to rapidly evolving market conditions and stay competitive.
Transcript
Richie Cotton: Hi, Saad. Welcome to the show.
Saad Siddiqui: Thanks, Richie. It's good to be here.
Richie Cotton: Cool. So, just to begin with, what does data infrastructure mean?
Saad Siddiqui: So data infrastructure generally means collecting a lot of data and getting insights and actions out of that data. And during that entire process of collecting all the data, getting insights, and building an actual plan to act on that data, what are the different components that you need to make that data as useful as possible?
Going a little bit deeper, this can cover everything from how the data is transformed, how you secure that data, and who has control over changing that data, to understanding that change over time and finding unique ways to get insights that can impact the business outcome or life outcome that you may be working on.
Richie Cotton: Okay, that's kind of interesting. I guess in my mind I sort of thought, well, infrastructure means all that really low-level hardware stuff, like, oh, something's happening in the cloud. But actually there are a lot more layers of infrastructure that go right up to the software stack as well.
So there's a lot more depth to this. And I like that you talked about business impact. So maybe can you just talk me through the business problems that you're trying to solve when you are going about improving your data infrastructure?
Saad Siddiqui: ... So across functional units, and also across the different sectors that are getting impacted by the more recent trends in data. So it touches every single organization within an enterprise, and also different sectors, as this ecosystem is evolving.
Richie Cotton: That's kind of cool. That is basically every different sort of team. Like, you mentioned marketing, you mentioned sales, and whatever industry you're in, more data and better data infrastructure is going to be beneficial. I'd like to talk a bit about how you get started. So suppose your CEO says, we need better data infrastructure.
What do you do first?
Saad Siddiqui: I think it is really important to first get an understanding of where everything sits today, right? That is one of the more important things, because there's massive data sprawl within the organization. There's a lot of shadow IT. People are using some of the new technologies, like OpenAI. So it's about trying to understand: okay, where does all the data sit today?
And how are people using it? And what are the outcomes that the CEO and the management team are looking for? Then you go down the list and figure out: okay, what are the platforms that the enterprise needs to bet on that can allow for those outcomes and that are built for scale?
So as the organization scales, they have a better understanding of the data usage, not just today but five years from now. Ten years is hard, because the data doesn't scale linearly; it scales exponentially, and sometimes logarithmically. So I think having an eye on scaling is really critical.
And then securing that data set, right? More and more, in my opinion, data is becoming the core IP for every enterprise across the entire enterprise sector. So just make sure that the crown jewels are protected and people know who has access to what, and protect against those breaches where data gets leaked.
There's a bunch of issues that we've seen in more recent years around that, and around bad data getting infused into the data sets, which can corrupt outcomes. And that carries over to the language learning model stuff that we're going to talk about today.
Richie Cotton: That sounds slightly terrifying. Like the idea that, well, okay, maybe your data is going to grow exponentially over 10 years. Do you need to plan for that when you start building stuff? Do you have to try and make a prediction about what's going to happen in 10 years, or can you work with a shorter timescale to get started?
Saad Siddiqui: I think it's important to make sure that you can deliver on those outcomes, but you almost have to think through the fact that your data is going to grow pretty substantially, especially now more than ever before. All the data that was generated from the beginning of time till 2020, we're generating in a year now. It is quite tremendous how quickly we're collecting data across the enterprise, across functional units. And it's not just the IT side, where they're getting pinged with alerts around application performance. We're also seeing that happening on the security side, where everyone's getting alerted on all these different vulnerabilities that they have.
It's also happening on business performance as well, right? So every time someone clicks on a button, how it interacts with an application informs product decisions, and that means collecting all of that data. And those interactions are going to increase as the UI changes over time. What are the kinds of questions that people are asking?
What are the kinds of answers we're getting? How long is someone spending with a video that is surfaced? That sort of data is being stored now more than ever before, so you almost have to have an eye for scale and an eye for what the key components are that you need to extract from that data.
So in some ways you do need to do the job today, but I think if you don't think about what's going to happen the next five years from now, you're going to be making platform shifts every couple of years that are really costly and just slows down the organization in a massive way.
Richie Cotton: Okay, so maybe you don't need to be able to predict exactly how much data you're going to have in 10 years, but you want to be aware that there's going to be some sort of limit to how far you can scale with whatever technology you choose. And you want to be mindful of like when that's going to happen, so you're not constantly switching.
Okay. So can you talk me through who needs to be involved in this? So, if you've got a data infrastructure project what sort of teams or roles are going to be involved?
Saad Siddiqui: I think in general, there's a bunch of teams now being involved that historically haven't been involved. Traditionally, it's the CIO team, the Chief Information Officer's team that controls all the data, that is typically involved, right? Now, more and more, we're seeing a lot of different business units having their own internal teams around data as well.
So the CFO may have data engineers on his or her team, and the product teams have their own data teams. And so for the CIO's central organization, the job has gone from "let's control all the data" to empowering all the business units to get insights out of a centralized data set.
So that is becoming more clear. And that goes from product engineering to marketing to sales, across the board. In addition to that, we're also seeing that the CISO and the security teams are more involved now than ever before. And that's because they need to have better insight into who has access to these data sets, like understanding which accounts are becoming vulnerable and who the most critical people are, the ones who have access to the most data or the most critical data. That becomes incredibly important as well.
So the CISOs are becoming more involved in decisioning as well. In some ways it doesn't matter if you have the best data architecture. If the security team doesn't bless it, it becomes really hard to make a bet on that platform.
Richie Cotton: So actually you mentioned quite a lot of the C-suite there. You mentioned the Chief Information Officer, who is perhaps responsible for the IT side of things; you've got the Chief Information Security Officer, who is going to be responsible for, I guess, data security; and then you mentioned the Chief Financial Officer, who is going to want to weigh in on the cost of things; and then the product managers. And that's before we even get to, like, the Chief Data and Analytics Officer, the Chief AI Officer, maybe even the Chief Technology Officer.
So it feels like a lot of the C-suite could be involved here. And that could get tricky. So can you talk me through who needs to be accountable for which bits of this, then, and how they go about doing that? Or, well, how do you stop a C-suite fight?
Saad Siddiqui: Yeah, 100%. I think that's something that we're seeing across the board, where honestly enterprises are trying to figure this out because, to your point, there are a lot of cooks in the kitchen. So making a bet on a specific platform that satisfies everyone's needs becomes really, really hard.
In my mind, the most important factor, and this is based off some of the conversations you've had with some of the guests in the past, is that we've gone from a centralized CIO org, in the world of BusinessObjects and some of the old legacy BI tools, to complete ownership of data across the enterprise, with companies like Tableau that had a lot of data across every business unit.
And in some ways, centralized control is coming back because of a lot of the high-profile security events that we've had over the last couple of years, and people need more control. So in some ways, the security and the data platform decisions are still being made by the CIO and CISO.
And over time, some of the end-application decisions can be made by the functional units. So the product teams may leverage the data set that the centralized teams have, but they may decide on a different BI and analytics tool compared to everyone else, and some of the other teams may not be involved in those decisions.
So as long as an enterprise has a really good understanding of data assets that they have, that is one of the most critical things. And on top of that, the teams can sort of enable self service uses to the other teams that need access to it.
Richie Cotton: That just seemed like an incredibly important point worth mentioning: it's a very good idea to understand what data assets you have before you start doing anything, because otherwise everything's gonna be a mess. Okay. So you talked about business impact, and it's important for data to have some kind of impact on the business.
Now it feels like the further you get away from commercial teams, and the further down the technology stack you go, the further away you are from these obvious business metrics for success. So how might you go about pitching a change to data infrastructure when it's not just like, hey, here's a direct impact on revenue or costs?
Saad Siddiqui: I think that's a very important point. I think as an organization grows, people just get hired to do a specific job within an IT team or software engineering teams, and they don't understand necessarily the impact of the actions that they're taking. So I think it is really critical before any sort of decision is made around the company objectives, it is important that the entire team understands why they're taking the steps that they are, right?
So, for example, if you're moving off of a legacy platform onto Snowflake or Databricks, why is that an important thing? Is it on the cost-saving side? Is it speed of execution? Is it getting better insights out of the product? Like, what is the reasoning behind that platform shift?
That's not an easy thing to do. So I think what we're seeing is that the best organizations are actually connecting these steps to business outcomes and communicating to the entire team why those steps are really important. Because some of these decisions aren't easy, right? When you migrate an entire data architecture from a legacy platform onto a newer platform, it takes months, sometimes even years, especially for some of the largest enterprises.
So I think it is really important for the entire team to understand why they're taking the steps that they are, and to connect them to those specific business outcomes. And generally, the business outcomes that people best get behind are around cost savings and revenue drivers. So they need to see that, hey, we're taking these steps for X, Y, and Z reasons.
Richie Cotton: Okay. That does seem really important, that you really can tie any decision back to some kind of business metric, even if it's slightly more abstract than a purely monetary thing. I guess one thing I've seen is that any time there's a technology change, it's going to break some of my workflows.
And then I'm sort of silently cursing whoever made the decision to do this, even though I know there's probably a good reason for it. Can you talk me through how you go about bringing people on board with these technology changes? How do you pitch it to the rest of the organization?
Saad Siddiqui: The most important pitch ends up being, and this goes back to the initial point: if your CEO comes to you and asks you, hey, make this data infrastructure better, you tie it back to that initial conversation we had. Now that we have a better understanding of all the different data assets that we have, here are all the strengths and weaknesses in our platform. It could be speed, it could be cost, it could be a lot of different things.
And here are the changes that are being made as part of it, right? I think sometimes engineering teams tend to go down the rabbit hole of, hey, this is the super cool new technology and we need to migrate off of everything to, let's say, an Iceberg-based architecture. So if the CEO and the engineering team made a shift a bit ago, maybe two or three years ago, onto a GCP- or Azure-based architecture, they're just going to question everything that you're doing: we just made the shift a couple of years ago, why do we need to do this again?
So unless it is clear that there's meaningful ROI around performance, driving revenue, or cost savings, it becomes really hard to get some of these decisions across. And in my mind, that's one of the most important things. Also, to our conversation earlier, it is really critical to think about scale over time, because data infrastructure moves pretty quickly.
And I suspect with the new language learning model architecture that's coming up, it's going to move even faster.
Richie Cotton: You mentioned the idea of moving from, say, a legacy data warehouse to Databricks or Snowflake or one of these kinds of modern cloud platforms. Quite often, just changing the data warehouse isn't enough. You might want to say, okay, let's change our BI tools, let's change some of the other infrastructure around this.
Is it better to try and change everything in your technology stack at once, or would you recommend doing one piece at a time?
Saad Siddiqui: I think it really depends on the organization itself. And it depends on where the pain is being felt. Moving to a different cloud platform allows you to scale much more easily than some of the legacy systems, so if the pain is being felt around scaling, then that's what you need to do.
And then you see how the BI tool is reacting to the new influx of data volume. If it doesn't scale, then you need to make a decision around how to change the interface by which data is being absorbed by the organization. So, to be honest, I'm not giving you an answer here, but I think it really depends on the organization itself: what are the key priorities that they have, and where can they get the biggest ROI from the data infrastructure?
Richie Cotton: Okay, that was artfully avoiding the answer to the question, I like that. Yeah, it depends on the organization. So I certainly can see how you probably want to think about what your goals are and then pick the technology based on your goals, rather than just making a decision outright. So I'd like to talk a little bit about timescales.
So you mentioned the idea, okay, if you just change your data warehouse two years ago, you're probably not gonna find much favor if you suggest changing it again. What sort of timescales are you looking out for overhauling data infrastructure? And like how often do you want to change things?
Saad Siddiqui: I would have told you, like, it is probably every three to five years, but as language learning models have become the number one priority for every organization, if you made a decision two years ago or a year ago, and your infrastructure doesn't allow for the utilization of some of these newer technologies, you probably need to move faster and make sure that your organization can leverage some of these newer technologies because your competition is not slowing down because you've made a decision just a year ago.
You almost have to have an eye on the market, see what's happening in the broader ecosystem, and make a decision that way. So in my mind, it is really critical, from a timescale perspective, to marry that with the shifts that are happening in the market. Because you can make a shift now, and then if that doesn't keep up with some of the innovation that's coming out, your competition is going to catch up to you much faster and may even start beating you in some ways.
Richie Cotton: Okay. Yeah. So that just seemed like a tricky thing. Suddenly generative AI is everywhere and your infrastructure is not keeping up. How can you keep your organization or your infrastructure flexible enough to deal with changes in market conditions?
Saad Siddiqui: I think in some ways you almost have to reassess. This is such a big trend. This doesn't happen every day. This happens once every 20 years. So I think if your architecture today is not conducive to leveraging some of these language learning models, you have to rethink your entire strategy.
And I personally think that with language learning models in general, if you think about it, salespeople will be able to qualify sales leads significantly better than before. So what is the value of that? You're actually narrowing in on a specific customer that has a need that you can sell to today, versus canvassing a hundred customers in the past.
In the case of customer success, we're seeing a lot of companies that are reducing costs in call centers and getting better results out of the call centers than they've ever had before.
So in some ways you almost need to think: okay, with this shift of magnitude that's happening, if you don't get on now, you are missing the boat in a meaningful way. So how do you make sure that you're ahead of these shifts? I think it's honestly tied to continuous education and just making sure that you're not falling behind compared to your competition.
Because this is one of those things that happens once every 20 years or so.
Richie Cotton: So that's good that you're not gonna have to react to some crazy change in the market conditions every quarter or something. It's a bit less frequent than that. Do you have any tips around project management or processes that are going to enable you to be more reactive?
Saad Siddiqui: I think one of the most important things in my mind is making sure that you're collecting as much data as possible, but not necessarily noisy data. The data quality piece of it is going to be really critical, and collecting the best quality of data and then tying it to language learning models is going to become really important in the next generation as well.
I think it's going to start informing your language learning models in a more meaningful way. And it is also going to make sure that you're able to service your customers significantly better than before. So data quality becomes really critical: trying to understand what the data set is that you're actually collecting to train these language learning models.
And tuning those models to the next generation of problems that are coming up is going to become really, really important as well for the next couple of years, I think.
Richie Cotton: Related to that, are there any skills that you think you ought to have in house in order to be able to work with better data infrastructure or create better data infrastructure?
Saad Siddiqui: I kind of go back to the continuous education side of things, right? So I think it is really important to be on top of it. In the past, you would assume that if you had a strong data team within an organization, that should be good enough. In my opinion, that's not good enough anymore.
The impact of data is being felt throughout the entire organization. And so you almost have to have your marketing team, your sales teams, your sales operations teams, your product teams, and your product operations teams get exposed to the next-generation technologies in some ways, so that they're not very siloed. They can see the impact of the new innovation that's coming up, but also how they can leverage data in a more meaningful way to impact their functions, basically.
Right.
Richie Cotton: Okay, that's wonderful. I love that you mentioned continuing education; that's very dear to our hearts at DataCamp. And the idea that sales and marketing people, all these commercial teams, ought to have some exposure to data and all the technologies around it. Do you have any examples of this?
Like, where you've seen commercial people have that level of data literacy and they've had some sort of benefit?
Saad Siddiqui: Yeah, we're seeing that actually in a bunch of our portfolio. A lot of our portfolio companies have educational programs for their product teams and their finance organizations as well, where they're making sure that they're across some of these newer innovations.
And what we're also seeing is that there are a lot of newer applications coming up that are reinventing some of these categories. We're seeing organizations that are coming up in FP&A. We're seeing that sales and marketing has a ton of companies coming up. So the education is also in some ways happening through where the vendors and startups are going.
So in some ways, you almost have to make sure that you're across the newer technology companies that are coming up. You don't need to make a buying decision now, to be honest, though that might actually help people like me. But I think folks need to be across some of these newer technologies, either in an educational fashion or by being exposed to the startup ecosystem.
And even if you think a startup's tool is a toy, it is really important to understand why the founders have conviction around this idea. Why do they feel that it's going to have an impact in the use case that they're focused on?
Richie Cotton: I'd like to know a bit about some trends that are happening within data infrastructure. So just at a high level, what do you think are the most important trends at the moment?
Saad Siddiqui: I think there's a few trends that are really interesting to us here. The first one is around data security. I think as more and more organizations are leveraging language learning models, we're seeing that data security is becoming really critical for that. People are exposing really confidential data to companies like ChatGPT, and having control over that within their organization is really critical.
So we're seeing a lot of startup activity in that ecosystem around making sure that your private, sensitive data isn't exposed to a third party where you may not have control. The second thing that we're seeing is that data quality and transformation, some of those capabilities in the middleware of data, are becoming incredibly critical.
Because now that you have your data secured, we're seeing a lot of issues around hallucination: people making decisions on outcomes that may not be correct, as the models are hallucinating. So one of the biggest problems there is around data transformation and observability, and making sure that everything that you're collecting and training your models on is as close to real as it possibly can be,
and that the models aren't being trained on a data set that may change the outcomes that you're looking for. And in some ways, what ties together data quality, lineage, and security is access control. So we're also seeing a lot of startup activity around who has the right access to the data sets that are critical to the company.
And as you're changing those pipelines, it becomes really important to make sure that there's no foul data entered into that data set. So, yeah, I think those are some of the more important trends that we're seeing in the broader data ecosystem.
Richie Cotton: Okay, so I guess there's three trends really. So bigger focus on data security, bigger focus on data manipulation and transformation, and then also around data quality. And I suppose governance is the broader idea around that. Interesting. So of those, which would you start with? Like if you want to try and improve your data infrastructure, which of those three things would you care about first?
Saad Siddiqui: I think it depends on the outcomes you're looking for. In some ways, I think it would be naive to believe that your employees aren't using some of these technologies. So I would probably start with the security piece of things.
And then once you figure that out, you move on to data lineage and quality. I think all of these are incredibly important. Quality and lineage impact outcomes in a more meaningful way, and data security
makes sure that you're not on the front page of the Wall Street Journal because there was a leak due to an employee mishap.
Richie Cotton: All those three things are actually, they're slightly removed from those sort of business impact things. So I'm just wondering, are there any ways to change your data infrastructure that's going to have a direct impact on the customer experience?
Saad Siddiqui: I think there's a bunch of really interesting companies coming out that have an impact on specific customer-related use cases, right? So around product analytics, there's a bunch of companies coming out that are gonna have an impact around sales and customer success as well. And this is maybe for the startup ecosystem here.
If you can build a product that leverages language learning models, and you're able to collect a data set around a specific use case, let's say sales performance, you can use that to forecast your sales in a more meaningful way. There are companies like Gong and some of the more established guys that are basically using some of these technologies to inform their users on the best way to deliver outcomes and the best way to compete against their competitors.
Something like that is really critical. So yeah, I think we're seeing a lot of really interesting technologies coming up for specific use cases that touch different customers as well.
Richie Cotton: I really like the idea of just basically using data in order to compete better. So that's a kind of, I guess, offensive use of data. And the flip side to that is defensive use, which is about complying with regulations. I think particularly in a lot of regulated industries, we're talking about finance, we're talking about healthcare, there are so many regulations you have to deal with around data.
Can you talk me through how better data infrastructure can help there?
Saad Siddiqui: What we're seeing is that there's a bunch of companies that came up almost a decade ago around process automation. These are companies like UiPath, Automation Anywhere, a bunch of these RPA tools. That's an example of a use case, and a lot of their early customers were in the insurance space, and they were looking to improve their operations, whereby a customer, for example, uploads their documents for an insurance claim, and it goes from initial information to claim reimbursement.
There's all these different pieces where an organization is collecting data and making decisions on that data. Stuff like that gets supercharged with some of these language learning models, right? So, now you can kind of figure out exactly what part of the organization or what part of that process is the big bottleneck, and what is the part of the process that costs the organization the most amount of money.
You get visibility on the entire process, and you can now build a plan around how to mitigate some of those bottlenecks and cost overruns in specific parts of the process as well. So that's an example of how AI is impacting, let's say, the financial services industry.
We're seeing similar things happen in health care. We're seeing similar things happening in a bunch of different verticals, even in construction technology, right? We've made an investment in the construction tech space, where, let's say, you have an HVAC outage.
How do you understand what was done to the HVAC before this? Now you can actually take a picture of that HVAC machine. You can get insights into what the last maintenance tasks were. You can understand exactly what to do to fix that HVAC machine, or which types of issues are common.
You can basically get a better understanding in real time using computer vision and AI, and then it's connected to the back end around the cost of these operations. So you need this sort of pipe, or this sort of cog, to fix that HVAC machine, and you get insights, or get an outcome, significantly faster than you would have in the past, where you'd wait over a week for someone to come and then they'd tell you, "Hey, we're going to be back next week to try to fix this thing."
You're able to see some of these impacts happen in a much faster way.
Richie Cotton: Okay, I like the idea that you're getting the impact on the customers by being able to diagnose problems quicker and then avoid some of the admin work of trying to get stuff fixed. So yeah, that's going to speed things up and make everyone happy. On that note, it sounds like a lot of the use cases here are around automating things.
So have you seen any other examples where automation has been possible through better infrastructure?
Saad Siddiqui: Are there areas that you'd like to kind of cover? Because in some ways, like everything is being reinvented.
Richie Cotton: Okay. I love that. It's just everything is being automated. That's kind of brilliant. Okay. Yeah, I don't know whether there's any specific examples you've seen where it's just like, there's been a clunky process that you've seen automated.
Saad Siddiqui: The area that people are going to the most is the RPA use case, and I'm happy to describe it again, Rich, if you'd like.
Richie Cotton: Maybe let's talk a little bit about venture capital. So since you're an investor, can you just talk me through how you make use of data during the investment process?
Saad Siddiqui: This is an area that we've made a massive amount of investment in over the last year or so. And I think the way we think about data within our practice starts with the identification of interesting startups on the front end. These are based off of 60-plus signals that we track, everything from web traffic to employee growth to funding, a bunch of different metrics.
In addition to that, in our platform technology we have a concept of thematic search, where you can type in "data transformation" and it will pop out a list of all the data transformation companies, and then also rank all those companies around who's doing well and who's not doing super well, based off of these signals that we have. And then once we've made an investment, we have benchmarking tools, again with over 80 different metrics, that show the health of the company across the entire organization.
Everything from revenue growth to how efficient a company is around cash burn, around all these different facets that are critical for the business. And as these companies scale and look for more funding, we're trying to understand what the company needs to fix over time. We built this benchmarking tool to do that as well.
So in some ways we've got everything from initial identification to helping our companies operate better. Across the entire process, we're using data.
Richie Cotton: Okay. That's cool. The data is being used throughout the whole investment process. And I like the idea of using data to be able to rank companies and see how well they're doing. Actually, the other thing you mentioned was about using data in order to help companies operationally. Are you able to share any of the details on what sort of help you might be able to provide, and what the data indications are that a company needs help in some area?
Saad Siddiqui: So I think one of the most important things is to first identify. When you're operating a company as a founder, you're spending a lot of time in your own world, right? So you're like, "Hey, this organization may be rough. This organization has room for improvement." As VCs, we're talking to over 1000 companies a year, and what we are able to do is help the founders and the management team understand what best in class looks like. And even within our portfolio, understanding who the best companies are. So, for example, let's say a company that we're working with has a customer churn issue. They do a really good job in terms of selling their product, but over time, the customers don't stay with them for too long.
Based off that, we can identify companies that have had the most improvement over the last 12 to 18 months, or have had the highest customer retention rates in our portfolio, and connect those organizations together and be like, "Hey, do you have 30 minutes of your time? We're just trying to understand this problem and fix this."
And we're collecting a lot of that data on our side and building a data set that can inform those companies in a more meaningful way. So it's not just, like, advice that we're telling our employees, "Hey, you need to be growing at 100 percent or 120 percent."
It's not that. It's actual insight that we're collecting, and connecting our portfolio companies together based on the challenges that they're facing.
Richie Cotton: So, before we wrap up, can you tell me, are there any particular companies that you're most excited about right now?
Saad Siddiqui: Yeah, there's a few companies. So we've invested in a company called Incorta. It's one of the most interesting data query platforms out there. They've got some of the largest Fortune 10, Fortune 50 companies as customers, and they're helping them on everything from supply chain to FP&A and a bunch of different parts of the organization, and getting insights faster than any other tool that you can find on the market.
And we also recently made an investment in a company called Coalesce, which is a data transformation tool for Snowflake. So if you're a Snowflake customer, you probably know of Coalesce. And they have hundreds of customers using the platform today.
And using some of their metadata technology, those customers are able to manipulate their data significantly better than with any other tool out in the market. So yeah, those are two companies that we're super excited about and that are doing super well.
Richie Cotton: I've definitely seen Coalesce around at a few conferences, a lot of advertising going on there. So that's interesting. All right. So just finally, do you have any advice for any companies who are wanting to make better use of data?
Saad Siddiqui: Yeah, I think at this moment in time, it is very important to just keep your eyes and ears open. The technology stack is changing by the day, by the minute, by the hour. It is really important to make sure that you're building a flexible process and infrastructure that can scale better and leverage some of these newer technologies in a more meaningful way.
You don't want to be locked into a solution that may be outdated in a couple of years.
Richie Cotton: Okay. Yeah. It does seem tricky. Like the technology platforms, they're always moving. So you've got to keep building and keep doing things to keep up. All right. Super. Thank you so much for your time, Saad.
Saad Siddiqui: Thanks so much, Richie.