Bilal Zia is currently the Head of Data Science & Analytics at Duolingo, an EdTech company whose mission is to develop the best education in the world and make it universally available. Previously, he spent two years helping to build and lead an interdisciplinary Central Science team at Amazon, comprising economists, data and applied scientists, survey specialists, user researchers, and engineers. Before that, he spent fifteen years in the Research Department of the World Bank in Washington, D.C., pursuing an applied academic career. He holds a Ph.D. in Economics from the Massachusetts Institute of Technology, and his interests span economics, data science, machine learning/AI, psychology, and user research.

Richie helps individuals and organizations get better at using data and AI. He's been a data scientist since before it was called data science, and has written two books and created many DataCamp courses on the subject. He is a host of the DataFramed podcast, and runs DataCamp's webinar program.
Key Quotes
The biggest success I had as a leader was investing in the people, not in ideas. That order is really important. Ideas don't become successful if you don't have good people in the team.
Finding that sweet spot between being able to communicate technical expertise and technical concepts in a simple intuitive way is probably the most valued skill at Duolingo.
Key Takeaways
Focus on building trust with leadership by aligning data science efforts with the company's biggest challenges, ensuring that data initiatives are directly addressing key business problems.
Adopt a hub and spoke model for data teams, embedding data scientists within product teams while maintaining centralized reporting to encourage cross-pillar collaboration and innovation.
Enhance data science productivity by leveraging AI to automate repetitive tasks, such as anomaly detection and database querying, freeing up time for more strategic work.
Transcript
Richie Cotton: Hi Bilal, welcome to the show.
Bilal Zia: Hi, Richie. Nice to see you. Happy to be here.
Richie Cotton: Yeah, great to speak to you. Now to begin with, I know when you joined Duolingo you told me you inherited an underperforming data team. So I'm curious to begin, like what had gone wrong?
Bilal Zia: Yeah, great question. Typically, when a new leader joins and inherits a team, they can either start at the ground level or build up a team from scratch.
I like to joke that what I inherited was not the ground level but the sub-basement. The reason I say that is because there were a few data scientists at Duolingo when I joined, but there was no head of data science, so the team was a little bit rudderless. It's not like the leadership wasn't aware.
They were aware of this deficit and they had been looking for a head of data science for a while. Obviously their standards are very high and they were looking for a very specific type of person, and they were just looking and looking. What ended up happening was that during this time there was no advocate for these data scientists at the leadership level.
And hence, there was a big wedge between what the data scientists themselves believed they were working on and the contribution they were making, and what leadership believed they were contributing. As a result, there was basically this lack of trust in what data scientists did.
So in that sense, building up that trust with leadership and giving the team motivation were the number one priorities for me when I joined. I had to get up to the ground level first before building the team up from there. Happy to go into more details of that, but that's the high level of what I inherited.
Richie Cotton: That seems like a common problem: you have data scientists who think, oh, I'm working really hard, and then the managers wonder what the data scientists are actually doing. There's often this communication gap. You mentioned that solving the trust problem was the first thing you had to do.
Can you talk me through how you got started building that trust?
Bilal Zia: Yeah. So when I joined, there was essentially a need to understand what the biggest data-related problems the company was facing were, or what big problems company leaders were facing that data could help solve.
At times it wasn't even on the table that data science, data scientists, or economists could help solve a certain problem. So the first thing I did was to go to all the business leaders, all possible stakeholders, from finance to the leads of growth, to monetization, to language learning, et cetera.
Just to understand the biggest problems they're facing, understand the constraints they're facing, and then try to match the types of work that we can do to help solve those problems. So that matching exercise was priority number one for me. And that alone did two things. One, it actually helped me direct resources to the right places.
At least I tried; at least the probability of success was higher because I wasn't just doing something that nobody cared about. And the second thing is that it gave the leaders I was talking to something to appreciate: look, there's this new leader, and he's not just coming in and doing whatever he wants.
He's actually listening to us. This is a bottom-up approach and it's a collaborative approach. I think both of those things were super important, and some of the things we did initially ended up being quite successful. That built the trust battery with leaders, and then ultimately we were able to take some big swings as well.
And most of them were also successful because they were all informed by what the biggest constraints are, what the likelihood of success is, and what investment is needed. So ultimately, there isn't a magic bullet for how to do this; basically just listening to the stakeholders and bringing them along was, I think, the key ingredient to building trust.
Richie Cotton: It sounds so simple when you say it out loud: just talk to all the people who had business problems, find out what those problems were, find out how data could help them. It's matching up what's needed with what you can actually do to help. Fantastic stuff. We're gonna get into a lot of the details of how you improved the team, I think, but I wanna skip to the good bit just for a bit of motivation. Can you talk me through some success stories you've had?
Bilal Zia: Yeah, for sure. I think the first thing that I will say is that the biggest success I had was investing in the people, not in ideas, because I think that order is really important. Ideas don't become successful if you don't have good people in the team.
As I mentioned, I inherited a smattering of data scientists. They were reporting into different parts of the engineering org, and their managers were great engineers, but they weren't data science managers and they weren't coordinated among each other. It was acknowledged that this was a stopgap measure. So the first thing was to bring everybody under one umbrella and then to build some team structure. What that means is building some level of management layers so that junior data scientists have a reporting layer, have some people to look up to, have mentors, have people they can brainstorm with, et cetera.
So: identifying who the right folks are to be leaders within the org, moving people around, and letting a few people go, because I think people get demotivated by colleagues who are not performing well, and that actually does have a negative externality on everyone else. So maintaining a high quality bar, hiring people where there was a gap once this initial internal audit was completed, and then building a leadership layer.
So ultimately where we've ended up is that although I lead the data science org, under me I have people leaders who own every single vertical that we work in, and then the org grows under them. That's how the tree is gonna grow. It's scalable and sustainable, and each of those leaders is fully invested in their particular pillar, and then they talk to each other as well.
That system, I believe, is much, much better because it gives the blueprint from which impact can actually materialize. So I think the thing I'm proudest of is building that structure from which impact can be generated. To give you an example of where we had the impact:
I think it's an example I've shared in other forums, but I can repeat it here. The first thing we did was to pick something where the likelihood of success was high. We're a public company, and we put out a user forecast: we have daily active users on the platform, and every quarter we have to forecast what the growth is gonna be for the next quarter and for the next year.
Underlying this user forecast is a scientific model, a forecasting model. When I joined, there was a model, but there was a lot of discretion added on top of it, partly because there wasn't a lot of trust in the science. And because of this discretion, the forecast was measurably off actuals; it was about a % discrepancy between where we forecast we would be and where we ended up.
As you can imagine, the CEO didn't really have a lot of trust in this forecast. There would be jokes in their meetings that, oh, this is the forecast, nobody really cares. So the first step, from my perspective, was: this is a science problem. We can solve this by just doing better science.
And so we actually worked really hard to dial up the rigor on the scientific model that generates the forecast. We did back-simulations on data from past quarters to see what the results would have been had we applied this methodology, and we showed all the leadership in the company those results, which were that we would've been very close to actuals.
That sort of built an initial trust battery: okay, let's try this. And since then we have been very accurate in our reporting. We have been within zero to % accuracy for the last four or five quarters. The last quarter was a little bit of an oddity, we had some unexpected shocks, but prior to that we've been between zero to %, which is the gold standard of accuracy.
The result of that has been that now all the folks who contributed to changing the scientific model and adding the discretion are doing other things. They have other things to do: they're the chief of product, the chief engineering officer, the chief technology officer.
They have things to do that are more important than tinkering with the forecast, and they're busy doing that. They trust the forecast. I think that's been a really big win.
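The back-simulation Bilal describes, refitting the model only on data available before each past quarter and scoring the resulting forecasts against actuals, is commonly called a rolling-origin backtest. The sketch below shows the idea with an illustrative linear-trend model and synthetic DAU data; Duolingo's actual forecasting model is not public, and `fit_linear_trend`, `fold_size`, and the data are assumptions for illustration only.

```python
# Rolling-origin backtest: refit on history up to each cutoff, forecast the
# next fold, and measure percentage error against the held-out actuals.
# Model and data are illustrative, not Duolingo's real methodology.

def fit_linear_trend(series):
    """Least-squares fit of y = a + b*t over the observed history."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    var = sum((t - t_mean) ** 2 for t in range(n))
    b = cov / var
    a = y_mean - b * t_mean
    return a, b

def forecast(series, horizon):
    """Project the fitted trend `horizon` steps past the end of the series."""
    a, b = fit_linear_trend(series)
    n = len(series)
    return [a + b * (n + h) for h in range(horizon)]

def backtest(series, fold_size):
    """Mean absolute percentage error across rolling-origin folds."""
    errors = []
    for cutoff in range(fold_size * 2, len(series), fold_size):
        preds = forecast(series[:cutoff], fold_size)
        actuals = series[cutoff:cutoff + fold_size]
        errors += [abs(p - a) / a for p, a in zip(preds, actuals)]
    return 100 * sum(errors) / len(errors)

if __name__ == "__main__":
    # Synthetic DAU history: steady growth plus a mild weekly wobble.
    dau = [1000 + 25 * t + (t % 7) * 3 for t in range(60)]
    print(f"backtest MAPE: {backtest(dau, fold_size=10):.2f}%")
```

Reporting backtest error to leadership, rather than asking them to trust the model on faith, is exactly the "show the results on past quarters" move described above.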
Richie Cotton: That's pretty amazing. And I have to say, you can tinker around with all the other bits of your business, but if the data science isn't right and your forecast is off, then nothing else is gonna work.
So I love that you focused on getting the fundamental analytics right. The easiest way to build trust is to get the right answer. Okay. Beyond that, you mentioned that the organizational structure is something you're very proud of, and I feel like
in a lot of organizations, they're pushing data analysts and data scientists to be more embedded within business units. But you said you actually wanted to bring everyone together, so all the data scientists could communicate with each other and have mentors and things like that.
Do you wanna talk me through some of your decisions around the organizational structure there?
Bilal Zia: Yeah, I see advantages of both of those models. What we have is a hub-and-spoke model. Essentially, the data scientists are embedded in individual product teams across the company. The company itself is divided up into key pillars.
There's the growth pillar that focuses on user growth. There's the monetization pillar that focuses on making money through subscriptions, ads, and in-app purchases. And then there's the learning pillar that focuses on our content, on teaching better. Within each of these pillars are several teams that are focused on individual parts of the mandate.
Those teams are typically staffed by engineers, product managers, and designers, and now data scientists are embedded in that team structure as well. I think that is super important, because without context on what the product teams are thinking, what their direction is, and what their successes are, we can't be successful.
We can't be successful as a sort of ivory-tower team sitting in a corner doing our own thing. Nobody will trust that; nobody will think we're thought partners. So from my perspective, being embedded in the teams, being included, and ourselves thinking that we are part of the team is really important.
And that has been a key part of how we have been successful and how we've built trust. At the same time, I feel like the hub part of the hub and spoke is really important as well: there needs to be centralized reporting within data science, because yes, you are a really good thought partner to product, engineering, and design, but you also need folks with the same technical know-how as you across pillars to understand how the other pillars' ideas and constraints interact with the work that you are doing, because then you can come up with better solutions. And at the same time, you can brainstorm ideas. So for example, one of the things that our team has pioneered over the last several years is the use of machine learning in product.
This started off in monetization, where we developed a model to predict the likelihood of a user subscribing to our product. That model was very successful in monetization, and we have since migrated that same mentality, that same model structure, over to user growth, where we now have models that predict the likelihood of a user churning, so that we can act before they churn and try to convince them to stay.
So that's an example of cross-pillar pollination that happened because we have a hub structure. And I feel that's really important. It's also important for culture; it's important for team spirit. We actually are a really fun bunch. We're not just colleagues, we're friends. We make fun of each other.
Our Slack channel is super active. We have a meme wall in our New York City office. We just make fun of ourselves, basically. So I feel that culture aspect is not really possible if you're scattered across different small teams. I feel like a central home is important, where you come to be with your people, and then you go be with the other type of your people, which is the teams that you're embedded in.
So I feel like that dual structure is quite important.
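The churn-propensity model mentioned above can be illustrated with a minimal sketch: score each user's probability of churning from engagement features, so the riskiest users can be targeted with a retention intervention before they leave. Everything here is an assumption for illustration — the features, the hand-rolled logistic regression, and the synthetic data; Duolingo's real models are not public.

```python
# Toy churn-propensity model: plain gradient-descent logistic regression,
# no libraries. Features and data are invented for illustration.
import math
import random

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=300):
    """Fit weights w and bias b by stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def churn_probability(w, b, features):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, features)) + b)

if __name__ == "__main__":
    random.seed(0)
    # Hypothetical features: [days idle (scaled), current streak (scaled)].
    # More idle time -> more likely to churn; longer streak -> less likely.
    X, y = [], []
    for _ in range(500):
        idle, streak = random.random(), random.random()
        X.append([idle, streak])
        y.append(1 if idle - streak + random.gauss(0, 0.2) > 0 else 0)
    w, b = train_logistic(X, y)
    print("P(churn | idle user):   %.2f" % churn_probability(w, b, [0.9, 0.1]))
    print("P(churn | active user): %.2f" % churn_probability(w, b, [0.1, 0.9]))
```

The "act before they churn" step is then just a threshold on these scores: users above it get a nudge, a streak-freeze offer, or whatever retention lever the product team owns.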
Richie Cotton: Absolutely. Yeah, certainly I see how, if you need the domain knowledge, you want to really be solving business problems and you've gotta be close to those commercial teams. But also, yeah, you don't want every data scientist to be inventing stuff on their own.
You do need to share with other data scientists and, yeah, nerd out a little bit together. You mentioned this happy ending with all the team being friends, but earlier you said when you took over you had to get rid of some people, and I'm sure those were very tough management decisions.
Can you talk me through how you decide who to keep, who to get rid of, and who needs to be worked on? What are the criteria for the people you wanna keep?
Bilal Zia: Yeah. I don't think there is just hard data on this. A lot of this is subjective.
A lot of this is based on feedback from stakeholders and also just talking to the data scientists themselves. I'll come to the specific question about who to keep and who to let go in a minute, but even before that, what was important is how to decide whether someone is capable of managing.
Typically in engineering, and even in data science, there is a tendency to treat management as just a route to seniority: everybody starts as an IC, and at some point in your career you have to start managing because that's the only way to progress. I'm a complete disbeliever in that methodology. I think ICs can grow to very senior levels and remain ICs if that's what they're passionate about, and management should not be seen as an "I must do this." You must be passionate about leading people and being a good mentor if you want to do that. So: identifying those types of people. For example, one of the people leaders I have on my team leads our user growth, forecasting, and BI teams.
Even when I joined, she was actually just an IC. I think she was managing one person, and that was also a stopgap. But she was informally mentoring a bunch of people. People would just go to her and say, hey, I need help with this, and she would make time, sit down with them, get them through the problem, et cetera.
To me, that already said: she's ready, this is what she wants to do, and this is what she's good at. And now she's thriving. She's one of the best managers we have at Duolingo, not just in data science. In terms of how to decide who to let go:
I think stakeholder feedback is very important, and then just the output of their work. How long are they taking? What is their mentality around working around constraints? One thing I've noticed is that really good data scientists can be very scrappy.
If our data is not in perfect order, if there are other constraints, if they're facing some other sort of blockages to their work, they'll find a way to deliver, or they'll be very clear and crisp in their communication. The people who don't do well are the people who complain all the time and come to me to solve their problems.
I think that's fine to do; that's part of my job. But if that's the majority of our discussion, then I think there is a bit of a problem, a lack of effort. There was a bit of that. So just letting go of a few people, and then using the backfill to bring on people who were significantly more aligned with the way the rest of the data science org was working,
I think that was quite a big game changer for us.
Richie Cotton: Yeah, certainly. I can see how, if you're complaining about problems all the time... data science is full of endless problems. You need to have that level of autonomy and self-motivation in order to try and solve stuff, or it's not gonna work.
And I love the idea that if you are informally mentoring colleagues, that's a really good sign that you're ready for management. That seems like a good indicator. You talked about bringing on new people. Are there any particular skills you were looking for when you were hiring?
What do you think are the most important skills for data scientists at the moment?
Bilal Zia: Yeah, this is something that I believe I have changed my mind about over the course of my tenure at Duolingo. When I first came, I thought that if we hire the best technical folks, they're gonna be successful at Duolingo.
And there's a reason why I thought that: I came from Amazon, and Amazon is at a very different place in its growth trajectory than Duolingo. Amazon's already pretty well established; they're on the flat part of their S-curve, if you wanna think about it in business-cycle terms.
So in order for a data science team or a science team to have an impact there, the level of rigor needs to be super-duper high. And who are the folks who can do that? People with really high technical skills. Everything else doesn't matter so much. Even if they're terrible at communicating, that's okay.
We're never gonna put them in front of a stakeholder, et cetera. That's fairly standard in some of the big tech companies. At Duolingo, things are very different. Even our most junior team member gets a chance to present to the CEO. Technical skills will get you through the door.
But what will make you successful is how well you can communicate your ideas in simple, intuitive terms. And that's where people were just really struggling. We had so many candidates who came in as stellar analytical brains and aced our technical tests. But then we asked them to do a presentation on any topic they wanted to talk about.
They completely failed. Either their talk was so overly technical that they lost everybody in the room, including the data scientists, let alone the product managers who were in the room, or their talk was so simplistic that the data scientists in the room were like, I don't really understand the contribution you made here.
So finding that sweet spot between being able to communicate technical expertise and technical concepts in a simple, intuitive way is probably the most valued skill at Duolingo. We are very selective in our interview process to find those types of people. It is painful, because we have to go through a lot of interviews.
Most of the people who come through that pipeline stumble at some point. But the people who make it through, and obviously people do make it through because we do hire, have been very successful at Duolingo. So I feel like technical skills, for sure, yes, but being able to communicate to stakeholders, partner with them closely, and bring them along are equally important priorities for us.
Richie Cotton: Okay. Yeah, certainly finding that communication sweet spot is something I've spent my whole career looking for. Incredibly important stuff. And it's interesting: I guess the smaller the company you're at, the broader the skill set you need. So you need the technical skills and those communication skills,
whereas at a larger company, you can get away with being a bit more narrowly focused. Okay. We've gone minutes without mentioning AI, which may be, I think, a record for this year. But I'm curious as to whether the rise of generative AI has changed the profile you're looking for in data scientists.
Bilal Zia: Let me just speak to it at a company level first. I think Duolingo has been an adopter of AI since before generative AI became such a big thing in the last couple of years. We've been using AI in the way that we generate our content. We have used AI in tweaking the difficulty of exercises that a user sees based on the exercises they've just done.
And most recently, we have a conversation practice tool where one of our world characters has a live conversation with you, so you can practice speaking in the language that you're learning. That's been a pretty big game changer for us, because speaking has always been the hardest part of learning a language.
But now, thanks to the advances in LLMs, we can actually have an intelligent conversation at a very low cost with our users in their learning journey. So we are definitely pretty high up in terms of AI adoption. For data science in particular, my belief is that AI is not gonna replace what data scientists do; I believe AI is gonna augment what we do.
I come from a research background; I spent some time in academia before I switched careers. And one of the really important productivity drivers in academia is research assistants. These are poor graduate students whom faculty members really torture, get a lot out of, and pay no money at all.
So I think of AI as basically an army of research assistants. Every single data scientist can have access to an army of research assistants, and these research assistants can go and do simple tasks, maybe even more complicated tasks, but definitely simple tasks that are repetitive and predictable. AI can go do this.
From my perspective, there is still a strong need for a human in the loop. There needs to be a supervisor sitting on top of these RAs, making sure that they're doing good work, auditing their work, and being the final quality screen before the work actually gets to stakeholders.
That's the way that we have been thinking about AI, and that's the way I think it's gonna evolve for us over time as well.
Richie Cotton: Yeah, that's interesting. And certainly I've been a long-time Duolingo user, so yeah, it's always been very good for the reading and writing side of things, but having that actual conversation feature, that's a big game changer compared to just saying something out loud to myself.
Fantastic technology and innovation there. Going back to your data science examples of being able to automate repetitive tasks: have you got some specific examples of things you have automated using AI?
Bilal Zia: So one of the things that we're working on, and we're still playing around with this tech, is being able to query our database in a simple, intuitive way.
Typically, the way that you query a database is to write a SQL query, and SQL is a specialized language; not everybody knows how to write good SQL. Typically, data scientists would get tasked with it every time. So we have some internal tooling that allows us to query some of our data, and it's pretty good.
But what gen AI has enabled is being able to query that data in a conversational style. You can basically go to the interface and say, show me the average DAUs in India from this date to this date, or whatever you want. So literally like you're having a conversation. That has the potential to unlock a lot more usage from across the company.
For anybody who wants access to that, the way that we benefit is that those people are not coming to us anymore. So there's an opportunity cost, right? We save time, and then we can spend our time on building the next big thing. So we are investing in that technology, building that interface, and we believe that's gonna be a pretty big help for the rest of the company.
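To make the "conversational querying" idea concrete, here is a deliberately tiny sketch: one natural-language pattern mapped to parameterized SQL and run against a toy DAU table. The production system described above uses generative AI rather than a regex; the table name, columns, and question pattern here are invented for illustration.

```python
# Minimal natural-language-to-SQL sketch. A real system would hand the
# question to an LLM; this template matcher only illustrates the interface.
import re
import sqlite3

PATTERN = re.compile(
    r"show me the average daus in (\w+) from (\d{4}-\d{2}-\d{2}) to (\d{4}-\d{2}-\d{2})",
    re.IGNORECASE,
)

def to_sql(question):
    """Translate one supported question shape into (sql, params)."""
    m = PATTERN.match(question.strip())
    if not m:
        raise ValueError("unsupported question")
    country, start, end = m.groups()
    # Parameterized SQL -- never interpolate user text into the query string.
    sql = "SELECT AVG(dau) FROM daily_users WHERE country = ? AND day BETWEEN ? AND ?"
    return sql, (country, start, end)

if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE daily_users (day TEXT, country TEXT, dau INTEGER)")
    db.executemany(
        "INSERT INTO daily_users VALUES (?, ?, ?)",
        [("2024-01-01", "India", 100), ("2024-01-02", "India", 120),
         ("2024-01-01", "Brazil", 80)],
    )
    sql, params = to_sql("Show me the average DAUs in India from 2024-01-01 to 2024-01-31")
    print(db.execute(sql, params).fetchone()[0])  # prints 110.0
```

The payoff is the one Bilal names: a stakeholder can self-serve this answer, so the question never lands in a data scientist's queue.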
Another place where we're investing heavily is in our business intelligence capability. One aspect of business intelligence, for example, is anomaly detection. We have metrics that we track, and if those metrics at some point in a day or in a week deviate from the normal trend, that typically is a red flag that something has gone wrong, or that something has gone really well.
It depends on which side the deviation is on. And we wanna know what it is, why it is, and what we can do about it. All of those questions a data scientist would typically handle, including detecting the anomaly in the first place. What we are investing in automating is moving away from a world where a data scientist literally has to open dashboards every morning,
to an AI doing that, so it's automated and we can get more sleep in the morning. That's great. And not only that, but the future we hope to get to is a place where not only does the AI detect the anomaly, but it actually even runs the first set of SQL queries that a data scientist would run anyway.
Okay, here's the deviation; let me run a few data queries to understand what the key drivers are, which segments this is affecting, and which geographies this is affecting the most. And then it presents that data to the data scientists when they log in. So it's not just: there was an anomaly, okay, let's figure it out.
It's: there was an anomaly, here's what's prevalent, here's the way it showed up in our DAUs, and then it's up to the data scientists to do the next step. And by that time, like %, % of the work's already done. So I feel like that's a productivity accelerant.
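The two automated steps described above — flag a metric that deviates from its recent trend, then run the first-pass breakdown a data scientist would run anyway — can be sketched as follows. The window size, z-score threshold, and data are illustrative assumptions, not Duolingo's actual alerting configuration.

```python
# Sketch of the automated anomaly workflow: detect a deviation from the
# trailing trend, then break the metric down by segment to surface drivers.
import statistics

def detect_anomaly(series, window=14, threshold=3.0):
    """Return the index of the first point more than `threshold` standard
    deviations from the trailing-window mean, or None if nothing deviates."""
    for i in range(window, len(series)):
        recent = series[i - window:i]
        mean = statistics.fmean(recent)
        std = statistics.stdev(recent)
        if std > 0 and abs(series[i] - mean) / std > threshold:
            return i
    return None

def segment_breakdown(today, baseline):
    """Compare per-segment totals against a baseline day, largest absolute
    delta first -- the triage query a data scientist would run anyway."""
    deltas = {seg: dau - baseline.get(seg, 0) for seg, dau in today.items()}
    return sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True)

if __name__ == "__main__":
    dau = [1000, 1005, 998, 1002, 1001, 999, 1003, 1000, 1004, 997,
           1001, 1002, 1000, 999, 1350]  # spike on the last day
    print("anomaly at index:", detect_anomaly(dau))
    today = {"India": 600, "Brazil": 400, "US": 350}
    baseline = {"India": 320, "Brazil": 390, "US": 340}
    print("top driver:", segment_breakdown(today, baseline)[0])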
Richie Cotton: Yeah, I have to say I love that your main metric for success is: did I get a lie-in, am I sleeping better?
It feels like an incredibly important thing. And I guess, yeah, you solve the problems that keep you awake at night. That's a good decision-making process.
Bilal Zia: Sleep is super important, I'm realizing as I'm getting older. Absolutely.
Richie Cotton: Alright. Yeah, I like the second example about automating monitoring, because I think this is something that software teams in general do very well, particularly infrastructure teams, but it's not really been that pervasive in data science until maybe recently.
So yeah, that seems like a very good use case. On your first use case, this is about getting better self-service analytics, I think. I always thought of this as something that helps other teams, your commercial teams or less technical teams, but actually you're saying it's really a benefit to the data science team, because you're not being bugged all the time to jump in and do stuff.
I've just found the hard part is getting other teams to adopt it and have the confidence to do things themselves. Can you talk me through how you've persuaded all these less technical teams to actually go and do their own analytics?
Bilal Zia: Yeah, I think it's still a work in progress, and it's a good thing that you're pointing your finger at it. The way that we've done it is we have these pilot sessions, or pilot periods, where we invite the more data-savvy, non-technical users into the pilot, and they volunteer because they are very interested in this capability. We have them play around with it first and then ask a ton of questions, and that helps us make the prompts better, that helps us make the interface better, et cetera.
We're at that stage right now, where we have the more data-savvy folks helping us make the product better. Ultimately, the goal is that once they're satisfied with it, they can help evangelize it with their teams and their direct reports. A product manager who, say, is not so tech savvy is more likely to listen to their manager, who is a PM, rather than to a data scientist.
So we believe that way of evangelizing is likely to be more successful.
Richie Cotton: Okay, so I like this order of who gets to trial these new ideas: it starts off with the most technical people, your top data scientists, then it gradually moves on to product managers, and then it's everyone else.
So yeah, an interesting flow. I'm a big believer that you can do the best data science in the world, but if nobody uses your output, then it's useless.
Of course, you've got zero impact, but you're doing fun stuff by yourself. Okay, so in terms of who the data team has to collaborate with: would you say product managers are the most important people from your perspective, or are there other teams or other roles where you think, okay, we really need to get on their good side?
Bilal Zia: it's essentially the way that.
Decisions get made at Duolingo or historically have been made are basically the EPD trifecta, so that's engineering, product and design. So you have engineers, product managers, and designers. And then now increasingly we are the fourth leg in that stool, so it'll be EPDD, so it. Engineering product design and data science.
Many teams across these pillars, we are already considered a integral fourth pillar, a fourth stool, a fourth leg in the stool. And in some ways, the way to summarize the way I think about. The role of a data scientist is that a data scientist is basically a product manager with a technical hat.
They think like product managers. They think as owners of the product; they're invested in its success. They're constantly problem-solving, and they're bringing data to come up with better hypotheses for what to do next.
Richie Cotton: That's very cool. I've not really thought of data scientists as a type of product manager.
Suppose you're a data scientist and you want to get more product-focused. What's step one in doing that?
Bilal Zia: Just investing more time in the context. I can give you my own example. When I joined, I really do credit our CFO, Matt Skaruppa. I joined from a culture where basically you join, you hit go, and you go.
You have no time to ramp up, for better or for worse; it's just a very fast-paced culture. I really enjoyed it when I was there. When I joined Duolingo, the first piece of advice that he gave me was: Bilal, you've just joined. We have a very rigorous selection process.
Obviously, we've been looking for a head for two years, and now finally we have you. So I don't have any expectations of you for the first month. I want you to sit, be a sponge, and get context: what does Duolingo do, what do our product teams work on, how do they work on things, what have we tried in the past?
Why have things failed, if they have failed, and why have things succeeded? Build that mental model, and then augment it. That advice really resonated with me, and I pass it along to everybody who joins the company and talks to me, because I feel that's really important.
Without the right context, you can do the best data science in the world, you can build the best economic models in the world, but you're just gonna be shooting in the dark, and probably misfiring most of the time. And if you have the context right, you still get an A for effort even if your solution falls short.
Richie Cotton: I do love the idea of making sure that new hires get to soak up a bit of the company culture. What is needed? What are the problems you're actually trying to solve? Having that thorough onboarding seems like a good way to set people up for success.
Bilal Zia: Yeah, we now invest a lot in preparing the onboarding documentation. We have a master doc that provides, depending on which pillar you're joining: a history of what the teams have been working on, the types of problems, links to the experiment dashboard, the types of experiments they run, and a list of people the new hire should meet in their first day, their first week, and their first month, just to give them a head start.
To start developing that context and those relationships.
Richie Cotton: That seems incredibly useful. Alright, so I'd love to zoom out a bit to the company level. At a lot of companies there are a few buzzwords being thrown about: they're trying to become AI-first, or AI-ready, or AI-enabled.
What does that mean for you as a manager at Duolingo?
Bilal Zia: I think the way that we think about AI is: how can it actually enhance our productivity? I believe a lot of the hype around AI, at least initially, has been around cost-cutting. How can AI cut costs? I think that's important, but more important, or more exciting for us as a company, is: how can it actually make our product better?
So the example that I gave you, the video call feature where you can actually have a conversation, is a case where it's enhancing our product, unlocking a capability for providing speech practice which never existed before. The only way to do that before AI was for us to hire human tutors and have them interact with users.
And when we have millions of DAUs, that's just not possible; it's not feasible. So AI has unlocked that. So our stance towards AI is: we are invested. We are invested in making AI improve our product, and that's where we see the most promise.
Richie Cotton: I love that. Because certainly cost-cutting is important, but only your CFO is gonna be really excited about it.
Whereas if you're actually improving the customer experience, you're making millions of people happy.
Bilal Zia: And if we sit and do nothing, AI costs are gonna go down because of everything that's happening in the industry. The costs have been coming down because new models keep coming out, things keep getting cheaper, and the foundation labs are making better and cheaper models that run faster with less compute.
So the costs are gonna come down on their own through market dynamics. Even if we do nothing, that cost component is gonna help us.
Richie Cotton: Okay. And do you have any advice for other data leaders on what they should be doing to encourage their data teams to make more use of AI?
Bilal Zia: I think definitely creating an environment where they have the incentives to try different AI tools,
and not penalizing when there is a productivity drop. A lot of times we use AI tools and they really suck, they're bad. And then we evangelize that learning among everyone else, or we trial-and-error more to see what works and what doesn't. Just making room in the team to do that,
and for that not to be punitive, I think, is really important, because that's how you discover what's useful. An example of how it's been really helpful for me is custom GPTs. I have a custom GPT that I call something like a leadership ally. It's been a game changer for me because it's like having an executive coach.
It's not perfect. A lot of times it's very agreeable, which I don't like, and I've tried to prompt it to be very critical, and at times it does a really good job. But it's been very helpful in the way I develop communication with the rest of the company, the way I communicate with my direct reports, and the way I communicate with the team.
And I find that super duper helpful. It's been a very big game changer for me.
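A "leadership ally" custom GPT of this kind is shaped mostly by its instructions. The wording below is a hypothetical sketch of such a prompt, aimed at the agreeableness problem Bilal mentions; it is not Duolingo's actual configuration.

```text
You are a leadership ally and executive coach. When I share a draft
message, plan, or decision:
1. Summarize what I am trying to achieve in one sentence.
2. List the two or three strongest objections a skeptical senior
   colleague would raise. Do not soften them.
3. Only after the critique, suggest concrete improvements.
Never open with praise, and never agree simply to be agreeable;
if you think I am right, say so briefly and explain why.
```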
Richie Cotton: I do love the idea of having something to second-guess, or at least critique, your decisions. But yeah, AI being too agreeable is certainly a common problem. It's like, I want you to tell me the things that are wrong,
and it's like, oh yeah, you've done a great job here. Okay. I think one of the big fears is around job security with AI: are you gonna automate yourself away? I know you talked a bit about psychological safety. Do you have any tips on how you can ensure that your team feels comfortable around that?
Bilal Zia: It's an interesting question around job security. If you believe my hypothesis about AI augmenting, not replacing, productivity, then you can imagine that the demand for data scientists is actually gonna go up rather than down. Because if we can do more, the marginal cost of the output that a data scientist produces goes down, because each data scientist can do more.
And if that's the case, every company is gonna want more data scientists, because they can do more with less. I believe in that hypothesis. Giving the teams the right incentives to play around with AI, get comfortable with it, and use it in their work is really important, and so is data scientists being open-minded about it, not closed-minded, because AI is definitely here.
It's not going away. It's a productivity enhancer. They should think about it that way, and they should find the ways in which it's super helpful for them.
Richie Cotton: Absolutely. Yeah, I feel like in most companies it's not like there's a shortage of data science problems to be solved. So productivity enhancers, productivity boosters, are gonna be incredibly useful.
There may be some companies where they're like, okay, the data team is just too expensive, we'll get rid of them. But if you're showing your value and you've got that trust with management, then it ought to be an overall benefit, I think. Okay, wonderful. Alright, can you talk me through what you're most excited about in the world of data and AI?
Bilal Zia: Yeah, it's a good question. This is aspirational, I believe you're asking an aspirational question. What I'm gonna discuss is something that I believe is the future of how we think about experiments and testing, and A/B testing in particular.
So we are very big into A/B testing. We run hundreds of A/B tests every quarter at Duolingo. Every single product feature that gets shipped out is A/B tested first, and this is true for a lot of companies, not just Duolingo. Now, one of the perils of A/B testing is that there typically is a blast radius around A/B tests.
I think on average around half of A/B tests are not successful; they don't get launched. So our capability for generating great hypotheses is not very good: we test things, and half the time we're wrong, it doesn't work out. And with some of the things that don't work out, things actually get worse,
and we do more harm than good in some experiments. So there's a blast radius. What if there were a way for us to pre-assess the blast radius and not run experiments on our users that would not benefit them? I believe there is. There's a lot of research going on in this area, but there is definitely a world where we will be doing synthetic A/B testing.
So instead of real users, we can actually use LLMs to mimic the behavior and decision-making of actual humans, and then be able to run hundreds of A/B tests, not in a quarter, but overnight. Now, there are a lot of problems to solve around this. There are problems around how best to mimic human decision-making,
there are problems of cost, but I believe this is the future, and I'm pretty excited about it.
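The synthetic A/B test Bilal describes can be sketched in a few lines. In the minimal sketch below, `synthetic_user_response` is a stand-in for an LLM persona call: the function name and the fixed conversion probabilities are assumptions for illustration, not Duolingo's method. The two arms are then compared with a standard two-proportion z-test.

```python
import random
from math import sqrt, erf

def synthetic_user_response(variant: str, rng: random.Random) -> bool:
    """Stand-in for an LLM-simulated user. In a real system this would
    prompt a language model acting as a user persona with the variant's
    experience and parse whether it 'converts'. Here we fake it with
    fixed probabilities so the sketch is runnable."""
    p = 0.12 if variant == "treatment" else 0.10
    return rng.random() < p

def run_synthetic_ab_test(n_per_arm: int, seed: int = 0):
    """Simulate n_per_arm synthetic users per arm and compare
    conversion rates with a two-sided two-proportion z-test."""
    rng = random.Random(seed)
    conversions = {"control": 0, "treatment": 0}
    for variant in conversions:
        conversions[variant] = sum(
            synthetic_user_response(variant, rng) for _ in range(n_per_arm)
        )
    p_c = conversions["control"] / n_per_arm
    p_t = conversions["treatment"] / n_per_arm
    # Pooled standard error for the difference in proportions.
    p_pool = (conversions["control"] + conversions["treatment"]) / (2 * n_per_arm)
    se = sqrt(2 * p_pool * (1 - p_pool) / n_per_arm)
    z = (p_t - p_c) / se if se else 0.0
    # Two-sided p-value from the normal CDF: Phi(x) = 0.5*(1+erf(x/sqrt(2))).
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p_c, p_t, z, p_value

p_c, p_t, z, p_value = run_synthetic_ab_test(n_per_arm=2000, seed=42)
print(f"control={p_c:.3f} treatment={p_t:.3f} z={z:.2f} p={p_value:.3f}")
```

Swapping the stub for a real LLM call is exactly the hard part Bilal flags: getting the simulated users to behave like actual humans, at an acceptable cost per call.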
Richie Cotton: That's a pretty radical idea: you've got agents or bots instead of real people, you can run your A/B tests against those and see what performs best, and you're doing optimization in a safe way before you're exposing your users to sketchy new features.
I really like that as an idea. I guess fingers crossed we can make it happen, or someone can make it happen. Finally, I'm always looking for new people to follow. Whose work are you most excited about at the moment?
Bilal Zia: Along the lines of this type of synthetic A/B testing,
there is a team at Amazon, my former team actually, that is working on this, and they're collaborating with some really top academics. There is an economist at MIT, Victor Chernozhukov, a very famous econometrician. I follow his work, and he's involved in some of this; they just released a paper.
I believe it's called Agentic Economic Modeling. That's basically an economics-based framework for thinking about synthetic A/B testing. So I do follow his work quite a bit. That would be something, or at least a person, that I'm following currently.
Richie Cotton: Okay, wonderful. It just seems like such a fascinating topic, definitely something to dive into in more depth, maybe in a future DataFramed episode. Wonderful. Alright, thank you so much for your time, Bilal. Thank you.
