Vishnu V Ram is the VP of Data Science and Engineering at Credit Karma. Vishnu has been with Credit Karma for nearly seven years and led the development of the company’s deep learning-based systems and its internally built machine learning platform that helps Credit Karma transform and manage data at scale.
Adel is a Data Science educator, speaker, and Evangelist at DataCamp where he has released various courses and live training on data analysis, machine learning, and data engineering. He is passionate about spreading data skills and data literacy throughout organizations and the intersection of technology and society. He has an MSc in Data Science and Business Analytics. In his free time, you can find him hanging out with his cat Louis.
Data science is a way to provide certainty: Credit Karma's recommendation engine is at the heart of its business models. With the use of data science, Credit Karma is able to increase financial inclusion by providing members a financial roadmap that takes them from where they are, to where they want to be.
Empathy is foundational to delivering data products that scale: Empathizing with members and their different backgrounds, credit scores, and socio-economic status was the key to scaling to more than 100M members.
As you scale a data team, hire more specialists: As your data team matures, consider having a blend of generalist and specialist data scientists on your team. Moreover, provide upskilling pathways for your data scientists to specialize in one area of your tech stack.
Let's say you open up Google Maps and you are trying to get somewhere. The first thing you’ll do is provide a starting point, and a target destination to go to. The certainty of Google Maps leaves no room for confusion. For many Credit Karma members, the starting point is where they stand with respect to their credit history, credit report, and credit score. That's just such a powerful data point when you put it in front of an individual because you are no longer at the mercy of credit providers. You have a good sense of what you're allowed to do at that point in time, and with data science, we’re able to provide them with a roadmap to reach their destination.
It's really important to stay humble and curious and to understand how users experience the data products you develop. If your recommendation engine sends out an occasional bad recommendation, you might think that "Oh, of course, I'm doing this for 100 million users, you can expect me to get a few users here and there wrong." But if you lead with empathy with the user, you’ll be able to eliminate a lot of these wrong recommendations.
Adel Nehme: Hello. This is Adel Nehme from DataCamp and welcome to DataFramed, a podcast covering all things data and its impact on organizations across the world. One of the most exciting aspects of data science and technology in general is how it reduces the barrier to previously difficult-to-acquire information and services, whether it's Google Maps providing navigation, the ability to immediately watch a tutorial on YouTube, or ordering a taxi via Uber. In that light, the spaces where data science can have the most impact in lowering the barrier to information and services are high-stakes industries like finance and healthcare. This is why I'm excited to speak to Vishnu Ram, Vice President of Data Science and Engineering at Credit Karma. Credit Karma is a FinTech startup founded in 2007 with the aim of providing financial inclusion to individuals by allowing them to see their credit scores for free.
Adel Nehme: Ever since, it has been producing a suite of products that leverage data science and that provide users more certainty around their financial future. Throughout the episode, Vishnu talks about his background, Credit Karma's mission, the inner workings of the data science products powering Credit Karma, how he led his data team through a phase of growth from 20 million to 120 million users, building data cultures, the skills data teams need to have and more. Now speaking of the skills data teams need to have, we're also happy to announce the launch of a 14 day free trial for DataCamp Professional. It's designed for teams of any size with...
Vishnu V Ram: Thanks a lot for having me here, Adel.
Adel Nehme: I'm excited to discuss with you your experience managing data science and engineering at Credit Karma, the best practices you've developed leading data teams and operationalizing data science at scale. But before we begin, can you give us a brief background about yourself and how you got into the data space?
Vishnu V Ram: I've heard a lot of your podcasts and I know a lot of people have also talked about this: the way I got into data science was a little bit roundabout. In my undergrad, I did a couple of projects related to using fuzzy logic for controlling the dynamics of systems. And that's how I kind of got into AI. After my undergrad, I had an opportunity to actually work in neural networks, but I ended up choosing a different path. If I had chosen neural networks, I might have come into data science and the data world much earlier, is the way I think about it. But my path ended up being more that of an early-stage startup engineer: I built things from scratch and then moved into taking on a few early-stage startup CTO roles, and ended up wearing a bunch of different hats, as you do when you're doing early-stage startups.
Vishnu V Ram: And I think in the beginning stages in these startups, it was more about leveraging data for analytics, leveraging data for making make-or-break business decisions. And then over a period of time, because these were more consumer tech startups, I started doing more data pipelines for understanding user behavior and making big product decisions. And then along the way I got into e-commerce product recommendations in one of the startups. And then finally at Credit Karma, I feel like it's come full circle, even though I didn't join the company when it was in its early stage. I feel like I've been involved in data across all of these aspects: analytics, data science, building recommendation systems, user behavior understanding, machine learning, all of the above.
Adel Nehme: That's awesome. And for those who are not aware as well, can you give a brief background about Credit Karma and how it works?
Vishnu V Ram: Yeah. I have to go back to, say, 2007, when our founders started the company. At that time, it was primarily about making credit scores from TransUnion free and open for everyone. That's how it started out. And then there's 2015 or so, which is around the time I joined, so I'm going to bring that up as a big landmark year. It was also a landmark year because, yeah, of Equifax as well as TransUnion. And then if you look at where we are today, we have done like 4 billion credit scores and credit reports across the US, UK, and Canada.
Vishnu V Ram: So the journey for Credit Karma has been it started out as a credit score provider for our members and then it started helping our members understand their credit reports really well. But along the way, we've really added some really important features to our product like identity monitoring. And today our members use Credit Karma as a platform for shopping all their financial products, whether it's credit cards or personal loans or auto insurance or auto loans or mortgages. And over the last couple of years, we've also added free banking products like checking and saving accounts, which allows our members to really leverage their Credit Karma for everything that they need to do as far as their personal finances are concerned.
Adel Nehme: That's great. If I'm not mistaken, you have about 120 million customers, right?
Vishnu V Ram: That's right. And between 2014 and 2019, we added around 70 million users. And it's just amazing to see the growth that we have had. Being part of that growth has just been really, really amazing for me.
Adel Nehme: That must have been very exciting and really sets the stage for today's conversation. What's really exciting about Credit Karma's mission is really the use of data to democratize financial information and to equip everyone with the ability to make healthy financial decisions. So I want to set the stage for today's conversation by really trying to understand how central data is for Credit Karma success and operations. Do you mind giving us an overview of different ways data science is used at Credit Karma and some of the use cases you've worked on personally?
Vishnu V Ram: So I'll probably take something that is more relevant to all of us. Let's say you open up Google Maps and you are trying to get somewhere. The first thing that you will probably need to enter into the system, if you do not have your GPS tracking turned on, is where are you right now? What's your starting address? So for a lot of our members, the starting point really is where are they today in the eyes of all the financial institutions that they go to to get their financial products from. And it's not just financial institutions: if you want to get a home to rent, if you want to get the new Apple iPhone from Verizon or T-Mobile or whichever is your provider, you need to know where you stand. I think that's been the starting point for all our Credit Karma members as well, to understand where they stand with respect to their credit history, credit report, and credit score. I would say that single data point, what is my credit score, is kind of the genesis of the entire company.
Vishnu V Ram: So I would say that's just such a powerful data point when you put it in front of an individual. When they have a strong sense of it, then there's no more thinking about it, right? In terms of like, hey, do I have a good credit score? Do I have a bad credit score? Can I afford to get the latest iPhone? Can I afford to go and upgrade my rental? Can I afford to go get a new home loan? What home loan rate would I get? You are no longer at the mercy of any of these people. You have a good sense of what you're allowed to do at that point in time. Right? So I think that is kind of like the starting point for all of our members. And then along the way you get into using data for more automation, using data for more advanced use cases like leveraging machine learning to understand users better, understand the products better, understand the user-product interactions better, and then you put it all together.
Vishnu V Ram: So to come down to actual applications of data and machine learning in Credit Karma, the way we think about things, the way things are split up, is we have something called certainty. When a user is applying for a financial product, a lot of times they are putting themselves in a vulnerable position of potentially getting declined for that product. So there is an uncertainty in their mind whether they're going to get approved for it or not. So one of the biggest, biggest use cases that we have is providing certainty to our members. What are the chances that you're going to get approved for this product? Once you're able to provide that certainty to our members, it just provides them with complete relief. It's the kind of certainty that you know when you're leaving home and you want to go to your friend's place or you want to go to a new location. You just type in the destination address in your Google Maps or Apple Maps and you're going to get the information back: this is the route, follow this route.
Vishnu V Ram: You're going to reach that in 25 minutes. And more often than not, it doesn't matter whether you switch lanes along the way, whether you're going at 70 miles per hour or 75 miles per hour on the highway, you get that certainty in terms of when you're going to reach and that you're going to reach there. So that is a big application of data that we have within Credit Karma. And then certainty is great, but I kind of want to have control over what financial products I want to get. I might want to get a financial product from A, B, C versus X, Y, Z. What is my own control over that? What is my own intent? Do I want to pay my home loan over a 15-year timeframe?
Vishnu V Ram: Or do I want to pay a home loan over a 30 year timeframe? Those are things that I want control over. And it's something that me as a user, I want to understand. I want to be able to do that well. What is the data that the system has and the company has to help order things or rank things in a way where you are taking into account what the user wants and what they really want to get out of Credit Karma when they're using Credit Karma, that is where a ranking application comes from. And then the third thing is like, we all have needs where we want a credit card or a personal loan or auto or a home loan at various points of time. But the various points of time is the key phrase there. The timing of when you are getting something and when you are not looking to get something else is where our propensity models come into place.
Vishnu V Ram: And then when you put all of these things together is when you have an opportunity to provide really strong, relevant recommendations to our members. And when we provide those relevant recommendations to our members is when you are able to take all the things that the user has provided you in terms of information and data and be able to help them get what they want out of Credit Karma and out of all the other partners that Credit Karma works with. So that I would say is like our bread and butter in terms of what we use data for. But apart from that, we also leverage data for doing important things like anomaly detection in our systems, anomaly detection in our business metrics. And then we also want to understand how well are we doing? How well are we going to do? We want to ask and answer what-if questions, which is where things like forecasting come into place. But I would say my bread and butter and a lot of the bread and butter for Credit Karma as far as data has been in the recommendation space.
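To make the combination described above more concrete, here is a minimal Python sketch of how three such signals (approval certainty, fit with the member's intent, and timing propensity) could be blended into one ranking. The `Offer` fields, the multiplicative scoring rule, and the example numbers are all hypothetical illustrations; the transcript does not describe how Credit Karma actually combines its models.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Offer:
    name: str
    approval_odds: float  # "certainty": estimated chance of approval, 0..1
    intent_match: float   # how well the offer fits what the member wants, 0..1
    propensity: float     # how likely the member is shopping for this now, 0..1


def score(offer: Offer) -> float:
    # Hypothetical rule: multiply the signals so that a weak value on any
    # one of them pulls the overall score down.
    return offer.approval_odds * offer.intent_match * offer.propensity


def recommend(offers: List[Offer], top_k: int = 2) -> List[str]:
    # Rank offers from highest to lowest combined score and keep the top few.
    ranked = sorted(offers, key=score, reverse=True)
    return [o.name for o in ranked[:top_k]]


# Made-up example data, for illustration only.
offers = [
    Offer("credit card A", approval_odds=0.9, intent_match=0.5, propensity=0.8),
    Offer("personal loan B", approval_odds=0.6, intent_match=0.9, propensity=0.2),
    Offer("auto loan C", approval_odds=0.8, intent_match=0.8, propensity=0.7),
]
print(recommend(offers))
```

A product of scores is only one possible design; a weighted sum or a learned ranking model would fit the same description.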
Adel Nehme: That's absolutely fantastic in the way you frame it, especially when you mention the ability to provide relief for people. I don't think a lot of people consider the psychological stress, especially underprivileged folks experience when they try to interact with a bank or a financial institution. So the ability to provide a financial compass is so important.
Adel Nehme: Obviously recommendation engines are at the heart of Credit Karma's business model and you're one of the key people building this. So in a nutshell, and please correct me if I'm wrong, you provide distilled insights from people's credit data. They get provided recommendations of financial products from different banks and lenders based on their credit insights. And if they take that recommendation, Credit Karma gets paid by the lender. So the customer always gets free insights and the inclusion that you discussed. Do you mind going into detail over the first iteration of the recommendation engine and how has it evolved ever since?
Vishnu V Ram: Yeah, I think you got it almost right there, Adel, so I'm just going to add some fine nuances here. So when I joined the company in 2014 and before, there were a few things that were already set and I don't think we've really changed any of that even now. One thing that was set was more of a win, win, win business model where we wanted to make sure that anytime Credit Karma gets any incentives, it gets incentives in a way where the user has benefited from the interaction with Credit Karma. And then on the other side, a bank or some other financial institution has also got benefit from the interaction of the member with Credit Karma. So we have this construct of win, win, win for our members, for us, as well as for our partners. We think that's how the ecosystem is really well self-sustaining and also sets all of us up on a growth path: our members, us, as well as our partners.
Vishnu V Ram: But at the same time, the ways we have gone about it, the ways we have approached it, have changed over a period of time. When I started, I would say there were rules everywhere. There was limited data collection in terms of what the user wants. And there was a lot of rule-based decision making. And then I would say the first year or so, the job was to centralize the rules in one place.
Vishnu V Ram: Keep in mind that the company was doing really well. Our members were getting a lot of really strong benefits. The certainty models that I talked about were already in place in their earliest avatar. There were a lot of things that were working really well at the point of time when I joined the company. But there was an opportunity for us to look at an operation that was serving more than 20 million members at that point of time and to be able to make it far, far better than what was there.
Vishnu V Ram: So I would say the first couple of years, it was more about centralizing all the rules in one place, looking at models as the centerpiece and have an approach where we were able to replace a lot of these analytic insights, which were more point-in-time based, and might also be specific to certain populations and then replace them with models that could be evolving and learning.
Vishnu V Ram: And then to be able to support that, we had to make really important investments in our data collection infrastructure. And also, we made a big bet of moving on to BigQuery. At that point in time, very, very early on, when GCP was evolving and was at a very early stage, we made a big bet of getting into BigQuery. I still remember a time when we were sitting in our CTO's room, where we were thinking about how BigQuery compared to other products in the market and about why we would want to go with BigQuery.
Vishnu V Ram: I think BigQuery was a huge, huge win for us when we made that investment. What it allowed our data scientists was certainty, actually, in a way where they knew that when they come in on Monday morning, the data set that they need to operate on, to be able to start training their models, as well as the data that they need to look at to understand if the models that are currently running in production are working well or not, is there, they can access it, and they don't need to worry about being able to access it.
Vishnu V Ram: I think four years or so back, around 2017 or so, our models started to take over. We realized that we needed to make a major improvement in our experimentation system, which was just a little archaic and was making a lot of weird decisions. So we wanted to make sure that we were able to build out a robust experimentation service. And that's something that we made an investment in, in 2016.
Vishnu V Ram: And that's the year we also started building out our early ML infra team, based out in AWS initially, which we later moved into Google Cloud.
Vishnu V Ram: I think at that point in time, we were also looking at whether we wanted to be an all-Scala shop, because a lot of the engineering, or a lot of the systems, were all based in Scala, and we were thinking, "Hey, there's more engineers than data scientists maybe, and data scientists write fewer lines of code. Maybe we can get our data scientists to learn Scala." Because some of them are coming from R, some of them coming from some other ecosystems. So we said, "Hey, why don't we just figure out if we can get them to do Scala." That didn't last long.
Vishnu V Ram: And I would say 2018 is when we moved more towards Scala for engineering and Python for data science, allowing our data scientists to really bring in everything that they want, and also be part of the community. The entire data science community is based on Python, so allow them to also interact with the community, learn from the community and then bring it into a [inaudible 00:18:21] in easy fashion, while allowing our engineering systems to operate at scale on a Scala stack.
Vishnu V Ram: That's also the time where we had been following TensorFlow since its 0.1 days. And we definitely thought that it was going to be something that we could leverage at some point of time in future. And 2018 is when we said, "Okay, let's start moving towards TensorFlow." And in 2018 or so is when Google also published the TensorFlow Extended paper. And then when we looked at the Extended paper, we knew that this was exactly the way we were thinking about our own machine learning infrastructure. It made a lot of sense for us to go big on TensorFlow and continue to go big on GCP, especially given our early success with BigQuery.
Vishnu V Ram: And today I would say we do 35 billion model predictions a day. We collect all the data about what the user's seeing. The recommendation system drives everything that the user sees when they get a notification. It just has a hand in every interaction with the user.
Vishnu V Ram: I'll go back and reiterate the point that throughout all of this evolution, the constant has been the win/win/win business model. Our objective functions have not really changed. Our objective functions have stayed true to what the business really wants us to deliver.
Adel Nehme: I'd love to unpack some of this. So there were some infrastructure level improvements that were done, data collection improvements that were done, and more. In terms of the data being used to power this recommendation engine, obviously key partners are credit bureaus like Equifax and TransUnion. But do you also use external data to supplement your solution? If so, what type of data are you using to supplement the recommendation engine?
Vishnu V Ram: I think definitely the data from our bureaus is the main driver for everything that we do. But along the way, we've also realized that while it's important for us to provide a lot of certainty to our members, and also to understand what they want, the process of applying for different financial products is complex, and it requires going through a lot of steps and long forms. And a lot of that is also repetitive. There are a few data points that we realized were just required in all of these applications. And keep in mind that when someone finds it easy to apply for a product, and if they have certainty, then that's just going to provide that flywheel for our win/win/win.
Vishnu V Ram: So with that thought process, we have definitely brought in a few data points to help our members. And some of the data points are actually what the members themselves provide us. Just make sure that if you're asked for an income when applying for a particular product, and then two weeks later you're going to apply for some other product, guess what, that same income is used in the next application flow also. It's just like, make it easy for the user to go through the process.
Vishnu V Ram: And then there are other things that might seem trivial from a data perspective, but from a member ease-of-use perspective, it just makes it so much better for us. And we all take this for granted nowadays, but you still need to do the work to make it happen. It's just like when I'm entering my address, am I getting my address right, have I screwed up something? And a lot of this, if you are just putting this in DoorDash or putting it on Uber, if you miss it, you miss a ride or you miss a food delivery. But if you do it in a financial product, you have the opportunity to waste a lot of time. You might be applying for a home loan. You might have put a bid and you want to get the home loan. You don't want to get this wrong, you want to get it right.
Vishnu V Ram: So to help members get things right almost all the time, we invest in some of these areas where it allows us to just correct the address that the users are entering, and that way they don't get it wrong. And it allows us to just provide that ease of use for our members.
Vishnu V Ram: And with a lot of our vendors adopting cloud as well, and us being in the cloud as well, a lot of these data integrations have just become cheaper and easier. So it's a no-brainer to do some of these things to help our members get a much better ease of use as far as their experience is concerned.
Adel Nehme: That's great. And you've mentioned so far the win/win/win model and how you want to optimize for this. Now, there's a lot of angles by which we can approach this discussion, but given how central empowering users is for Credit Karma's mission, how do you ensure you're delivering recommendations and products that align with the key principles of the company, which is ultimately providing users with the ability to make better, healthier financial decisions?
Vishnu V Ram: I think that's a really hard problem. Great question. It's a really, really hard problem of being able to just consistently do this on an ongoing basis. And anytime you need to consistently do something that's really hard, you have to start with what do you do with your culture?
Vishnu V Ram: And I think as part of our culture, as an organization, as teams, as individuals within the organization, there is a lot of collaboration that we have. By definition, while we are doing things in data, we are collaborating really strongly with business to understand what the business needs are, to understand what the business constraints are better. We also collaborate very heavily with legal and marketing to understand what are the regulations that we need to be able to follow properly. And make sure that we are understanding our user needs better and understand what's working well and what's not working well.
Vishnu V Ram: Throughout all of this, it's also just having the right empathy for our end users, to really understand who our end users are, what they get out of using the product, and what feedback they have that we can leverage to get better. We've done a lot of different things here.
Vishnu V Ram: But I would say as far as culture is concerned, and what's really worked for us is just staying humble and curious through the entire process when you're talking to any of your stakeholders. Because especially when you're leveraging data to make a decision for 100 million people, you're working at very, very large scale.
Vishnu V Ram: So it's really important to stay humble in this process and curious in this process, and be able to get into a lot of these conversations where someone might ask you the question like, "Hey, why am I seeing this? Why are you sending this push notification to me? Why are you sending this email to me?" And the way they may ask it, it might hurt you a little bit. You might think that, "Oh, of course I'm doing this for 100 million users, you can expect me to get a few users here and there wrong." But if you stay humble and curious, you want to get something out of that conversation. That's just part of how you are able to inculcate a culture of staying humble and curious and listening with empathy. That goes a long way.
Vishnu V Ram: And then to be able to support it, you really want to have the right processes, and you keep setting up the right processes along the way. Things like set up everything as an experiment, measure everything, make sure that your launches and ramps are well-designed, and you know that you have looked at these metrics before launch.
Vishnu V Ram: And even after launch, you are looking at the right metrics to make sure. There are times when we have said, "Hey, if it's revenue neutral, we want to launch." But if the revenue ends up being negative, then you're still being thoughtful about, "Hey, fine, the revenue is negative, but there are other important metrics. As far as users are concerned, it became better. So let's just go launch it."
Vishnu V Ram: And then, again, you can talk about all of this, but there's change all around us. Especially in COVID times, we have seen all of this change come through and hit us in a big way. You have to build systems in a way where you are protecting the downside well, so that you are able to support innovation by the team all the time, to be able to keep up with the changes, to keep working with the changes, to get more value out of the changes. I would say to deal with a hard problem like that, you need to have strong investments in culture, processes, and systems, is how I would put it.
Adel Nehme: And how do you make sure you're able to understand what the downside is? Do you employ any explainability techniques on your models currently, or do you do any retroactive analysis on how users have been impacted by a particular model, for example?
Vishnu V Ram: Great question there. I think starting 2017, when we ended up rebuilding our recommendation systems from scratch, we were very clear that we had embarked on a process of building out a very complex system. And the system has models, the system has rules, the system has other business constraints that might be applied outside of the recommender system. We might be running hundreds of experiments at the same time. So we knew that we really needed to invest again in data collection. And in this case, what we really did was we invested in data collection at scale of the inner workings of the system.
Vishnu V Ram: So what that allowed us to do was to be able to diagnose and troubleshoot if we see any anomalies or discrepancies, if someone reports a problem or if some segment of users were seeing something that they should not be seeing or not seeing something that they should be seeing. We really wanted to make that investment available for our analysts and for our data scientists.
Vishnu V Ram: So that, I would say, was a very, very big driving factor for us to be able to understand how the end-to-end system works because, when you build models, more often than not, you want to say, "Hey, I'm building certainty models. Are my models doing a good job of providing certainty or not?" But you could have the most highly certain models for financial products, but if you never show the product to the member, they're never going to apply for it or interact with it, and you don't know what's happening under the hood.
Vishnu V Ram: So we knew that we had different sets of models, different rules, different business constraints, as well as new data flowing in. So when you put all of that together, it's a very complex system. And when you think about explainability, you really want to think about end-to-end system explainability, as far as the final outcome for our members is concerned, rather than looking at it piecemeal.
Vishnu V Ram: The piecemeal is also something you get out of it, but you have to start from the member outcome and then work downwards. So the biggest investment that we made there was to just make sure that we are collecting a lot of data about how the system is working. And that allows us to then keep building on top of it, to be able to ... because explainability techniques, they also keep changing so that you have an opportunity to be able to keep building on top of what you already have.
How to create data solutions that add value to customers?
Adel Nehme: I think this is a great segue to discuss your overall experiences, being a data leader and keeping data teams on track at Credit Karma. I'd love to dive deeper into the ways data drives success at Credit Karma. And obviously, steering the overall direction of the data team is a key component of it. How do you ensure that, at any given time, you're always creating data solutions that are beneficial to Credit Karma customers and that add value?
Vishnu V Ram: I think it starts out by understanding the value that you're creating for the customers and the business. If you have a really good understanding of the value that you're creating, then you are able to do a good job. And the way I look at things is that you need a lot of different things to come together to create that value. You need really good data, high-quality data. You need to be able to collect all of that at scale and make it available in a timely manner for model training, as well as analytics, as well as model scoring. Then you need to be able to bring the right modeling techniques into place, ones which are appropriate for the problem you're solving.
Vishnu V Ram: Over the years, we've gone through various iterations. We still have a variety of different modeling techniques that we leverage, both neural networks, as well as trees and other models that we use. And then you need systems and processes. When you have data, the right data, the right modeling techniques, the right systems and processes, and all of that combines really well, that's when you create the best end user value. And if you're thinking about all of these aspects and you think about what are the gaps in each of these areas, and then you are able to fill the right gap at the right time, then you know that you have the capability to consistently create that end user value.
Vishnu V Ram: And you're going to get it wrong and how do you catch when you're going to get it wrong? You just have to reflect on what went well and what can be improved because there have been times at which we have over-invested in data or we've over-invested in systems or we've over-invested in model techniques. All of these things, we've got it wrong. So the idea, really, is to make sure that you have all of these things coming together. And if you get one of these things wrong, guess what? You've probably not delivered the user value and you have a real opportunity of just completely killing that initiative.
Vishnu V Ram: So if you are being very thoughtful, if you reflect on these properly, then you know, "Hey, if I got my system right, then these models would've worked. This, we would've solved. We would've done a much better job of solving this problem for our users." So it's thinking about each of these areas on its own, and how they come together, and how do you reflect on it, and how you can improve on that. That's how you really consistently provide value for our customers.
Adel Nehme: So, of course, there's a lot of priorities that can compete here. How do you go about prioritizing different tasks and projects over time? Of course, expected ROI is important when prioritizing projects. And how do you measure that ROI and how does this feed into your prioritization framework, overall?
Vishnu V Ram: There really is no silver bullet to this. So I think you kind of want to do the things that you mentioned, for sure. The other thing that I feel that we have learned and I feel like we do a good job of it is just ... I'll just use a data science term. Just be patient about it. So it's really valuable in having sound priors, in terms of which project is going to provide you with an ROI and which project is not going to provide you with an ROI. And again, some of it is also timing. So it's not just getting the ROI right, it's also getting the timing for the ROI right.
Vishnu V Ram: So one of the things that I feel that we've really done a good job of is ... Historically, Credit Karma has always done a great job of attracting industry veterans into the company. And what can they help with? They can help us in coming up with sound priors. And then you are able to build out a set of iterations that are cheap. So then, when you have sound priors and when your iterations are cheap, you can afford to get the ROI decision-making process wrong. Once you get it wrong, you run through a few iterations, the iterations are cheap, and that allows us to revert back and then take a different decision, go on a different path.
Vishnu V Ram: And some of this also comes from our ability to be able to build and assemble platforms that can help reduce the denominator of ROI, so it's cheaper to make some of these investments. And again, that's where you also have to have a good mix of build and buy. If you have a good mix of build versus buy, then you have an opportunity to reduce the investments. So your ROI decisions are not set in stone. It's cheaper to revert back.
Vishnu V Ram: Saying all of that, I did talk about some of the data collection that we have done in the past of how the systems have worked. It allows for us to be able to kind of ask interesting what-if questions, like, "What if I had this data? What if we were able to launch this model or build this model? Would we have been able to do a better job of making better recommendations?" So there is some science to it. There is also a lot of art to it.
Adel Nehme: Now, given that you've been at Credit Karma for about seven years now, I'm sure you've seen massive explosive growth of the company over that time. And what that means is also making sure that you scale the data and engineering teams, as well as the infrastructure that supports data analytics at scale. Going beyond that, though, how have you been able, as a data leader, to not only scale data science for the data scientists, but to scale the ability for everyone in the organization to make data-driven decisions at scale? What was the role of the data team in enabling that?
Vishnu V Ram: When you operate at 20 million users and then when you grow towards 120 million users, the team size naturally grows along with that. And when the team size grows along with that, you have to constantly keep evolving the processes, constantly have to be thinking about, "How is my organization structured? How is the organization structured for delivering what the future is going to ask out of Credit Karma and what the future is going to ask Credit Karma to deliver for the members?" rather than staying focused too much on historical context. And then you're also constantly evolving the infrastructure to be able to support all the data. And it's an ongoing challenge, each of these areas; how we evolve the process, how we evolve the org structure, how we evolve the infrastructure.
Vishnu V Ram: For example, when I joined in 2014, we had a few data scientists who were spread across a few teams. There was no well-defined data science team at that point in time. And then, until 2017, we didn't have a machine learning infrastructure team. So you have to be able to pick the time when you're going to create these new teams. Basically, you're asking a few people to stop doing whatever they're doing in different parts of the organization and force them to come together and become a cohesive unit and then you are creating a charter for them.
Vishnu V Ram: And then you also have to be thinking about, how is the business evolving over this timeframe. Our business has evolved in a way where we move towards more of a verticalized structure. And when you move towards a verticalized structure, each of these verticals have really important, strong goals that they're going after. And as a data organization, you are in a position where you can help each of them achieve their goals. So that, again, comes down to, how is our org and how are our systems evolving with the business needs?
Vishnu V Ram: Then you have to be able to think about, what do successful partnerships look like? You have to understand which are the partnerships where you need to be spending more time on to help them become more successful and which are the partnerships that you can afford to kind of let it go in its status quo, at least for a certain amount of time. And then sometimes you want to look at something and say, "Hey, that's a project," and sometimes you're going to look at something and say, "That's not a project. That's a long-standing team that needs to go through seven or eight different iterations, in terms of building the system to get it to a good place." And then, even after that, they've got to be able to keep adding more nuances and keep making things more complex and more effective.
Vishnu V Ram: And throughout all of this, how we communicate, how we prioritize, everything changes. We've probably changed our prioritization, planning, and execution mechanisms multiple times over the years. And honestly speaking, there was a point in time when I was focused very, very heavily on execution and less on planning, especially when we were putting together the initial building blocks of our systems. It really required us to be able to execute strongly. But once you have the base systems in place, then you get into more of a rhythm. You need to work more with the verticals to plan properly how they're going to leverage the systems to get more value out of the investments that you have made.
Vishnu V Ram: So I think it's like you are going to keep evolving your org structure and processes along the way so that you start getting really strong in execution and you're getting really strong in planning. And the planning process allows you to have these successful partnerships where you are helping multiple business units succeed by leveraging data.
Scaling the Data Team
Adel Nehme: Now, connecting back to scaling the data team, how have you made sure to evolve your data scientist skill sets? So in terms of direction, do you value more generalist data scientists or more specialists? And what are the ways in which you've made sure to evolve the data team's skillset?
Vishnu V Ram: I talked about the blooper of trying to get our data scientists to start doing Scala at one point in time. So I think it also goes back to what we want to deliver for the business and for our members to have a good sense of what are the current gaps, to have a good sense of how we can leverage our data really well, how we can model our business problems really well. When we started out, we never had the systems to be able to help our data scientists leverage something like deep learning. Along the way, once we made our investments in TensorFlow and GCP, then we were able to get into a place where our data scientists were able to look at deep learning as another tool that they added to their toolbox.
Vishnu V Ram: So I think a lot of these modeling techniques and being able to look at data, being able to explore data, a lot of these skillsets are ... I would say you keep adding it along the way as you face the right business problems. Sometimes people are going to go and do, like, "Hey, I want to do [inaudible] or I want to do GPT-3," or whatever, but you might not really have the use case in the business to be able to leverage some of those things.
Vishnu V Ram: I think when we started out, we definitely needed a lot more generalist data scientists, especially when we were building out our systems. We need our data scientists to also get into the trenches with our engineers to make sure that our systems are designed well. Over a period of time, we moved into more of a world where we need our data scientists to understand, in a much deeper way, the problems that they're solving and the implications of some low level decisions that they may be making, which can change the way in which you're solving the problems. And then also try to understand how are these problems evolving?
Vishnu V Ram: So the partnership with more on the business side, the partnership with product understanding in specific areas, how we are doing things, I'll give you an example to kind of understand it better. Historically, credit cards and personal loans have been a predominant driver of revenue and transactions for the business and for our members. And there are a lot more credit card transactions that can happen than home loan transactions. So what that means is you have a lot of data there. Whereas when it comes to home loan, you have a lot less data there.
Vishnu V Ram: So to be able to build your skill sets, not just dealing with large-scale credit card transaction data, but also being able to deal with relatively smaller-scale home loan and home purchase data, becomes very, very valuable. So a data scientist needs to be able to understand what techniques they're going to bring into play. They can't take the same technique and plunk it over here. They need to be able to understand what techniques will apply better here. And then they have to continuously keep understanding the problem better and try to simplify it so that maybe some of their existing techniques can still work out here.
Adel Nehme: That's really awesome. And kind of expanding it into the organizational culture and skillset. How would you describe data literacy at Credit Karma and what are the steps you take to sustain it? And what are the systems that enable data access for everyone and the ability to make decisions for themselves?
Vishnu V Ram: Yeah. I think ensuring that when you're building out an organization, you are consciously deliberately thinking about how to make your team a diverse and inclusive team. I think being able to build a diverse and inclusive team allows you to be able to have a much broader understanding of some of the problems that our members face, and also really hit it out of the park in terms of how you can have successful partnerships with your stakeholders. Because believe it or not, you want to be working with a diverse and inclusive set of stakeholders. You really need to be able to engage with them. You really need to be able to understand their problems well. And we've had, I would say especially in the data science team side, we've had a really early success in hiring and developing key women leaders who have really helped us be very strong in how we operate here.
Vishnu V Ram: And that goes back to the data culture of setting up these key people, setting them up for ... hiring these key people, setting them up for ongoing growth. Growth is something that we hold really, really dearly. I can tell you that honestly, because this is the longest I've been in any company. And for me, I just feel like every year, every six months, every quarter, every month, I feel like there's always some new thing going on that is helping me grow. And when you want to ... When growth is a big part of how you operate, then automatically what happens is you can't hold on to all the decision making, you can't hold on to all the prioritizations. You need to be able to allow your teams to also do bottom-up thinking in terms of what data is important for us to be able to solve our problems really well.
Vishnu V Ram: And when you go through that process, then the plan that you end up getting out of that process is like a much, much richer plan. And I won't say we are perfect at this. I still feel like I'm constantly finding ways in which we can improve that. And in terms of data literacy at Credit Karma, I think it's ... I mean, like when I told you. When I started, there were a lot of rules, rules all over the place. And the fact that there were rules all over the place meant that we were already looking at data and bringing these rules in and which were like more analytic driven insights where we said like, "Hey, this kind of a product works really well for people in this credit score range. That kind of a product really works for people in that credit score range." Based on data, there were a lot of analytic insights that we had captured in our rules.
Vishnu V Ram: I would say one of the biggest challenges that we have had is to be able to slowly kind of remove some of those rules and allow models to learn, allow models to evolve. And that's a really hard process because models can't do everything for you. Models are going to get some things really, really wrong. And rules are the protective mechanism that you've built into place to make sure that models and rules work together really well.
Vishnu V Ram: So if you want to take one of those rules, which has been such an important factor of delivering the right value to our members, and you want to replace that with models, what you really need to be able to do is to really, really invest in transparency of how the models work and make sure that all your stakeholders have a strong sense of how do these models work? When you iterate on these models, are you following the right process? Are you getting to a better place every single time you go out and do those iterations? And if you are not doing a good thing, if you are doing something which is kind of in a gray zone, do you come and have a conversation with us before you go and ramp these models up?
Vishnu V Ram: So I would say those have been the things that continue to evolve and continue to keep our lives interesting and challenging. But at the same time, we know that's what's really required for us to deliver what's best for our members and not just what's best for our models or the data-driven system, but what's best for our members.
Adel Nehme: I think that's a fantastic take on data literacy, specifically in how it arms people to have difficult conversations about machine learning systems and production. I'd love to pivot to discuss a bit more around the future of data science and FinTech and what you're most excited about. What are some of the most exciting things that you're looking forward to in data science, and specifically, what are the most exciting things that you're looking forward to that will provide value for Credit Karma customers?
Vishnu V Ram: Yeah. I think our CEO and leadership have talked quite a bit about autonomous finance, where we are able to help our members do the right thing. Finally, that's what we are about, right? We want to make sure that our members can kind of leverage Credit Karma to be mostly on autopilot on most of the mundane, day-to-day pieces of their financial journey and be able to take the right decisions at the right time all the time. Right?
Vishnu V Ram: I think if you need to renew your auto insurance, if you just had a kid recently in the family, you might want to change what you have in the auto insurance. It's not just lowering your rates. It's also about making sure that you have the right protection for your family. To be able to use the data to optimize our consumers' lives in a way where it's rich and a lot of the mundane stuff gets taken care of for them. And then, for a lot of the important things, they're still kind of in control of their journey, in control of their destiny. How we go about solving those problems is probably going to take the next five years, as far as I'm concerned in the company.
Vishnu V Ram: And then, that's when you get into things like, hey, how can I leverage causality here? Because if you are not able to do things like causality, you're not going to be able to solve these problems well. How do I, and do we need to make investments in something like a knowledge graph to be able to understand what the users want out of certain financial products or what are their motivations, what are their goals? What do they get out of these financial products to solve their regular day-to-day things in their life?
Vishnu V Ram: So we need to make the right investments along the way in things like causality, things like knowledge graph, to be able to understand our users better, to be able to understand what value financial products provide. And then, that will allow us to get into more of, execute more on our vision for autonomous finance for our members.
Call to Action
Adel Nehme: That's really exciting and really awesome. So Vishnu, before we wrap up, do you have any final words before we finish today's episode?
Vishnu V Ram: Yeah. I think operating in the finance space and FinTech space, historically the space has been really opaque. But I would say there are a lot of innovations. There is a lot of transparency, a lot of new players who are definitely embracing transparency. And I would really want to make sure that all of us are doubling down on that. It's always easy to just take all this data and make more funny money for the business, but let's keep in mind that the data that we leverage actually belongs to the users and belongs to our members. And we really want to be able to make it work more toward helping them succeed. And the business success would just naturally follow from that, is what I would want to say.
Adel Nehme: Thank you so much for coming on DataFramed, Vishnu, and for sharing your insights.
Vishnu V Ram: Thanks a lot, Adel.
Adel Nehme: That's it for today's episode of DataFramed. Thanks for being with us. I really enjoyed Vishnu's insights on the data science powering Credit Karma. If you enjoyed this episode, make sure to leave a review on iTunes and we'll see you next time on DataFramed.