How an Always-Learning Culture Drives Innovation at Shopify

Ella Hilal, VP of Data Science and Engineering at Shopify, shares insights on the biggest mistakes data scientists make, being an effective data leader, how to create cohesion between teams, and much more.

Aug 28, 2022

Guest

Ella Hilal

Host

Adel Nehme

Key Quotes

Being understood is more important than being precise when speaking about data initiatives to key internal stakeholders. If you talk to a stakeholder using complex data language, even with the best intentions, what you’re saying won’t register for them, putting both of you on the losing side of the conversation. But if you simplify it to what really matters, and they are able to act on your discoveries because they understood you, that puts you both on the winning side.

One of the biggest mistakes data scientists make is always trying to be at the cutting edge of technology. They run to the shiniest thing right away, but the shiniest thing isn’t always the most important or practical thing. To be effective and to attain mastery in your craft, you need to understand what exact tool to pull out of your toolbox to solve the problem in the most impactful way. To do this, you need to build a sense of iteration and incremental shipping. As you gain experience, you get better and you can use more sophisticated techniques at will.

Key Takeaways

To be effective and to attain mastery in their craft, data teams need to understand what exact tool to pull out of their toolbox to solve the problem in the most impactful way.

Being understood is more important than being precise when speaking about data initiatives to key internal stakeholders.

As a data leader, you create space where you empower your team members to learn and grow, even if you believe you can do something faster or better.

Transcript

Adel Nehme: Hello everyone. This is Adel, data science educator and evangelist at DataCamp. One thing we keep thinking about here at DataCamp is how important a culture of continuous learning is for data teams. Data science is still relatively nascent compared to other technology disciplines like software engineering.

And all too often now, we see new frameworks, new tools, and new ways of working with data. This definitely requires a culture of continuous learning for data scientists. And this is why I'm so excited for today's episode. Ella Hilal is the VP of data science and engineering at Shopify's commercial and service lines division. She is a well-seasoned data leader with an extensive resume that I'm not doing justice with this short introduction. She's led a variety of projects and is an expert in areas such as data analysis, machine learning, autonomous systems, and IoT, to name a few. She's also an incredible learning advocate for the data scientists that she leads.

Throughout the episode, we speak about her experience leading a data team at Shopify, how data scientists can develop a continuous learning mindset, how data leaders can create space for innovation within their teams, some of the use cases that she's worked on and her biggest learnings from them, and much more. If you enjoy this episode, make sure to rate, comment, and subscribe, but only if you liked it. Now onto today's episode. Ella, it's great to have you on the show.

Ella Hilal: Thank you. I'm really excited to be here.

Adel Nehme: Awesome. So I'm excited to discuss with you the data science powering Shopify, how you approach an always-learning mindset, how you lead data teams, and more. But before we begin, I'd love to talk about your background and how you got to where you are today. So can you briefly walk us through your journey and how you joined Shopify?

Ella Hilal: So I would start with, I'm a girl from the Middle East. I actually went to university in Cairo and I studied computer engineering. Then I traveled to do my master's. I did my master's jointly between Cairo and a university in Germany, which was amazing. I got to learn a lot. I did take some courses from Stuttgart University.

I got to visit the different campuses around Germany. And then, at the time, I had full scholarships from Fulbright, from DAAD, and in Canada, too, from OGS and NSERC. And I ended up landing at the University of Waterloo, where I studied machine learning and AI. And then I graduated. I'm not gonna take you through my whole career, but maybe I'll give you a couple of highlights.

I have a Ph.D. in pattern analysis and machine intelligence. I started my career as a developer, because when I started, data science was not a thing. From there, I started leading innovation teams, then started data teams, and then grew into leading the data science organization at a company called Intelligent Mechatronic Systems.

Then I moved to Shopify as a director of data for Plus, which is our large-size merchants, and International. Plus is a lot of our big merchants, like Tesla, General Electric, and some of the Kardashians. You name it. Anybody who is anybody is on there. We have lots of very amazing, very talented merchants building their own brands.

With International, we started with the mission of making Shopify a perfect market fit for all the markets we're in. We're already in 175 markets, but we started with the intention of making it feel like it's solving local needs, not just a global platform that operates regardless of the needs of the merchants. From there, I grew into leading the growth and revenue organizations. And now I'm the VP of data science heading all the commercial and service data science.

An Always Learning Mindset

Adel Nehme: Given your extensive experience as a data leader at Shopify, one thing that I've seen you speak about, and that is definitely required to succeed in such a role, is developing an always-learning mindset, which is really what I wanna center today's episode on. So I'd love to set the stage for our conversation today with how you define an always-learning mindset and why you think it's so important for data scientists to progress within their careers.

Ella Hilal: I think this is the most important superpower a data scientist can have. I talk to a lot of leaders in the data science world, and they would say, "Oh my goodness, we need somebody with a Ph.D., and we need somebody with a master's, or X or Y or Z." And I was like, don't get me wrong, I do have a master's, I do have a Ph.D., but I don't think that's what makes a good data scientist. I actually think the good data scientist is the one who has this learner's mindset. And a learner's mindset, I define it as the person who is able to go back and learn, who doesn't get stuck in what they know. They are actually able to continue collecting additional tools, additional ways of thinking about data, philosophies, and mindsets along their journey to add to their toolbox. The data science craft in general is evolving rapidly, and it's at a relatively early stage compared to engineering and other crafts, which have been there for many, many more years. And because of that, things are evolving fast. Frameworks are evolving fast, and we're coming up with new techniques and new approaches all the time.

The trick is not knowing the latest all the time. The trick is being able to learn new techniques when the right questions and the right setups come. So it's not about the shiny new thing. It's about picking the right tool out of your toolbox, and if it's not there, going and finding it, getting it, adding it, and learning how to use it.

Mistakes Data Scientists Make

Adel Nehme: That's really great. I'm excited to expand with you on the methodology for learning that you've acquired along the way. One thing that you mentioned is that data science is relatively immature compared to other fields, but data science is also inherently multidisciplinary. Data scientists are required to blend two broad skill sets to be able to deliver value. One of them is business acumen, right? Knowing the product that you're working on, having communication skills, being able to work with collaborators on business problems. The other is technical skills, the nuts and bolts of data science, as we say. So starting off maybe with the technical skill set, because that's arguably the more comfortable one to grow in as a data scientist: a lot of the growth data scientists have on the technical side comes from actually learning new tools and experimenting on the job, as you said.

However, given the importance of delivering value in the near term, how should data scientists maneuver the trade-off between applying tried and tested techniques to solve problems and learning and experimenting with new tools that may not pan out in the short term?

Ella Hilal: So I will divide this answer into two. I always believe in focusing on impact. And by focusing on impact, you can always iterate. I believe in incremental shipping. So let's start with a simple thing. You're asked to do forecasting for a top-line metric. You can go with the latest, coolest, fanciest paper that was published about this neural network that allows you to optimize a hyper model with many parameters.

And then it takes that to do something with some form of hyper tuning for some regression or whatever. You can go with these very complicated, multilayered techniques right away. And don't get me wrong, yes, you learned something cool. But did you really solve the business problem? Did you really know how to use it effectively?

Did you effectively know your baseline? I don't think so. I think the right way to go about it is to start with your simplest: you know what, let's do a line fit. Start from that, and then take it one step further. Let's apply some linear regression, and maybe, you know what, let's do some logistic regression. And as you iterate, you understand the progress, you understand your data, you understand your different parameters, you understand the levers that you're pulling.

And then as you iterate, you're actually finding more and more, learning more and more, and getting a better understanding of why you are leveraging and using this. One of the biggest mistakes that I see data scientists make is to try to be at the cutting edge of technology. They run to the shiniest thing right away, and the problem is that the shiniest thing isn't necessarily the most important or practical thing. To be effective, to be successful, and to reach this mastery in your craft, you need to understand what exact tool to pull out of your toolbox to solve the problem in the most impactful way. It's not the fanciest, it's not the shiniest tool, it is the appropriate, right-sized tool. And to do this, you need to build this sense of iteration and incremental shipping. As you iterate over time, you get better, and accordingly you can use more sophisticated techniques. The more sophisticated techniques actually sometimes blind you to why it's operating this way, because it's a black box.

And you spend a lot of time trying to throw things at it, but the truth is you're throwing things at the wall versus really trying to understand what levers you are pulling. At the end of the day, any machine learning model is literally line fitting, or like hyperplane fitting, across multiple dimensions.

That's it. It's linear math, guys. It's math. It's not rocket science, it's math. And if you understand that, then any new technique is not shiny. You need to understand the underlying math to choose, and to understand the underlying math, you can't start with the most complicated equation. You need to start from the simple one and progress through it. So with that mindset, I think you tend to iterate and learn. The second thing that I always do with my teams is we use something called BLT time, or hack days. Whether it's pair programming time, which is a great way to learn new things, data digests, which are places where people can present their work and teach each other, or hack days, where you get to experiment with new things.

So you always need to have some scheduled space to pick up new things and experiment. But in the day-to-day job, you can also learn through iteration, through pushing the boundary, and through experimentation. But don't start with the thing that you cannot debug and where you can't understand why it's working or not. Start simple and iterate.
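The start-simple progression described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Shopify's method: the weekly metric values are made up, and the helper names (`naive_forecast`, `fit_line`, `line_forecast`, `mean_abs_error`) are hypothetical. It first establishes a naive baseline, then takes one step up to a least-squares line fit, and compares both on held-out data before anything fancier is considered.

```python
# Hypothetical weekly values of a top-line metric (made-up numbers).
history = [100.0, 104.0, 109.0, 113.0, 118.0, 122.0, 127.0, 131.0]
train, holdout = history[:6], history[6:]


def naive_forecast(series, steps):
    """Baseline: repeat the last observed value for every future step."""
    return [series[-1]] * steps


def fit_line(series):
    """Ordinary least-squares fit of y = slope * t + intercept, t = 0, 1, 2, ..."""
    n = len(series)
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    return slope, mean_y - slope * mean_t


def line_forecast(series, steps):
    """One step up from the baseline: extrapolate the fitted line."""
    slope, intercept = fit_line(series)
    n = len(series)
    return [slope * (n + k) + intercept for k in range(steps)]


def mean_abs_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


baseline_mae = mean_abs_error(holdout, naive_forecast(train, len(holdout)))
line_mae = mean_abs_error(holdout, line_forecast(train, len(holdout)))
print(f"naive baseline MAE: {baseline_mae:.2f}")
print(f"line fit MAE:       {line_mae:.2f}")
```

Any more sophisticated model would then have to beat both numbers on the same holdout to justify its added complexity, which keeps each iteration debuggable.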

Adel Nehme: So there are two frameworks at play here. The first framework is constant iteration, starting off from simple solutions and avoiding that shiny-toy temptation, cuz I think a lot of data scientists fall into that resume-driven development pathway. And the second framework is, as a leader, creating that space for teams to share knowledge, experiment with new tools that may or may not be shiny, and present their work along the way. Is that correct?

Ella Hilal: Yeah, totally. And regardless, shiny or not, if you understand the why underneath, if you understand the distribution of your data sets, you can iterate. You can find even better ways to enhance existing algorithms, or even come up with new algorithms.

Framework for Continuous Improvement

Adel Nehme: Now for the business acumen side. I think that's gonna be a more challenging skill set for data scientists because it blends communication skills, collaboration, product sense, and more. It's not something that you necessarily learn in a data science education, and it's potentially not something a technically minded person would be geared towards. What are the frameworks, mental models, and, similarly here as well, mechanisms within a team that you find useful to improve that skill set continuously?

Ella Hilal: I love this question. I can't tell you how much I love this question. So I think the biggest and most important thing, and it's worth repeating, is that as data scientists, we need to focus on the outcome, not the output. I know the sentence is very simple, but it's so true. A lot of the time, we focus on shipping an algorithm, but we don't ship the business impact.

We don't focus on the business impact, which is the outcome. We focus on shipping the algorithm. So for us to tie ourselves to the business impact, I think one of the key tools that I recommend for everybody to use, and that I actually reference a lot, is the five whys. You need to understand why we wanna do this, and you wanna actually debate it from a human angle.

So for example, "Build me a recommendation engine." That's a sentence that the PM, the product manager, can come and say. The question is why. And then they can say, "We need to recommend themes for merchants." Themes are like what a template is to you. Then another why: "We wanna save time," da, da, da. Then as you continue the conversation, you realize that when merchants come in to start their stores, one of their highest friction points is actually choosing what theme to use for their business.

They wanna make it unique, but they wanna make it useful. They wanna make it appropriate to the product that they're selling, but they still wanna put their flair on it. So being this intelligent partner, this automated, intelligent recommendation assistant type of algorithm, tends to save them a lot of time and actually becomes a sounding board too.

That has a real impact for merchants. And when you understand that, you can actually start with, "You know what? We actually don't need the recommendation engine. Maybe let's start with a ranking and take it from there." And then once you have the ranking, maybe the next iteration will be a full recommendation engine, right?

So you can iterate over time, knowing the outcome that you're trying to drive and using your skill sets and this massive toolbox that you've been building from step one to pull the right thing, versus acting on a specific ask. And business acumen is built by being curious. There are no other hacks.

I can give you a ton of frameworks, but all of them are founded on us asking questions, asking to understand the drivers, and keeping an eye on the outcome, not just on what we're shipping. And that makes a big difference. You're also gonna see a massive engagement difference with your counterparts. So if you're working with product managers, you get to see them interact with you differently.

If you're working with engineers, or even sales reps, or anybody, at the end of the day you also have a shared language. And this is really important: regardless of technical or non-technical, business problems are common and shared by all the crafts working in a certain group. So now your language changes from a data science craft language to a common business language shared across crafts. So the bond, the connection, the alignment become much more amplified and faster.
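The "start with a ranking before a full recommendation engine" step from the theme example can be sketched as a simple popularity ranking. Everything here is an assumption made for illustration: the install log, the industry labels, the theme names, and the `rank_themes` helper are hypothetical and do not reflect Shopify's data or systems.

```python
from collections import Counter

# Hypothetical install log: (merchant_industry, theme_name). All values are made up.
installs = [
    ("apparel", "Theme A"), ("apparel", "Theme A"), ("apparel", "Theme C"),
    ("food", "Theme D"), ("food", "Theme A"), ("food", "Theme D"),
    ("apparel", "Theme B"), ("food", "Theme D"),
]


def rank_themes(install_log, industry, top_n=3):
    """Rank themes by how often merchants in the same industry installed them."""
    counts = Counter(theme for ind, theme in install_log if ind == industry)
    # Sort by count (descending), then by name, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [theme for theme, _ in ranked[:top_n]]


print(rank_themes(installs, "apparel"))
print(rank_themes(installs, "food"))
```

A ranking like this is easy to explain to a product manager and gives a baseline that a later recommendation engine would need to outperform on the outcome that matters, such as the time merchants spend choosing a theme.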

Adel Nehme: That's really great, and I love that. What's nice about the framework that you laid out is that by breaking down a business problem into its component parts, like with the five whys, for example, you're also able to break down a technical solution into its component parts and be iterative from there.

Ella Hilal: Totally. And also, you get to understand the drivers, not just the translation of them by a PM. So the PM heard something and came back. I literally was in a conversation yesterday, and somebody came and was like, "I want a neural network." And it's like, why? And then when we started talking about it, it's like, yes, he needs to do classification, but maybe a neural network's not the best choice for the data set.

Because again, any machine learning model has its underlying statistics. So maybe we're overcomplicating. It's linear data. We just need something much simpler. So it's all about this conversation to understand, and to understand also what assumptions are made when you're discussing, because you're talking about the human, the usage. For example, if we go back to the example of the recommendation engine for the themes, it also makes an assumption. As you ask these whys, you get the assumptions on when the merchant is gonna use it and at what phase of their journey. They're doing it early enough, right?

They're not used to Shopify yet. You get to understand that maybe they don't have a full theme for their business. So maybe that gives you an idea for a different ranking or recommendation, or whatever additional tool you can provide them in a separate step that can make this step easier for them, right? It can give you this sense of the merchant journey and the information around it. And accordingly, you can build these different components and get to see not just that product, but even an ecosystem of products around it.

Adel Nehme: That's really awesome. Last year we had on the podcast Syafri Bahar, VP of data science at Gojek, which is also a highly data-mature organization. And the one thing he mentioned is that it's important to embed data scientists in the different business teams, simply because it enables that common business language, and it gives data scientists skin in the game on the solutions that they're developing. Do you share that worldview, and how has that been effective for you at Shopify?

Ella Hilal: Data science at Shopify is a centralized craft, but we work with embedded teams. What does that mean? Each team is embedded within its own organization. The reason is they need to be close to the business problem. Data science cannot be behind a wall where you throw questions over and expect proper answers to come out the other side, because even basic questions carry assumptions. So for example, if I ask you, how many buyers are on our merchants' websites? Buyers are the customers buying from our merchants. Merchants are our own customers. Right? Very simple question. But what defines a buyer? Is it the one that comes to checkout? Is it the one that just goes in to browse? Is it a session that starts when somebody just pops in and leaves? What defines a buyer? So there are these discussions, and this understanding and being close to the problem space helps build, number one, a better mindset and understanding of how things work, which enables data scientists to do their jobs better. It also creates a common language between the different groups, as well as a much bigger curiosity about how the product itself works.
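To make the "what defines a buyer?" point concrete, here is a small sketch showing how the same data yields different answers under different definitions. The session records, field names, and candidate definitions are all hypothetical, chosen only to mirror the options mentioned above.

```python
# Hypothetical storefront sessions; the field names are assumptions for illustration.
sessions = [
    {"visitor": "v1", "pages_viewed": 1, "reached_checkout": False, "completed_order": False},
    {"visitor": "v2", "pages_viewed": 5, "reached_checkout": True, "completed_order": False},
    {"visitor": "v3", "pages_viewed": 8, "reached_checkout": True, "completed_order": True},
    {"visitor": "v3", "pages_viewed": 2, "reached_checkout": False, "completed_order": False},
    {"visitor": "v4", "pages_viewed": 3, "reached_checkout": False, "completed_order": False},
]

# Each candidate definition is a rule applied to a single session.
definitions = {
    "any session": lambda s: True,
    "browsed more than one page": lambda s: s["pages_viewed"] > 1,
    "reached checkout": lambda s: s["reached_checkout"],
    "completed an order": lambda s: s["completed_order"],
}


def count_buyers(rows, is_buyer):
    """Count distinct visitors with at least one session matching the rule."""
    return len({row["visitor"] for row in rows if is_buyer(row)})


buyer_counts = {name: count_buyers(sessions, rule) for name, rule in definitions.items()}
for name, count in buyer_counts.items():
    print(f"{name}: {count}")
```

The spread between the counts is the reason the definition has to be agreed with the business team before the question is answered.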

Managing Trade-offs

Adel Nehme: Great. I love that. And really, I think this marks a great segue to how data teams at Shopify are leveling up their skill sets and adopting this always-learning mindset that you talked about. Connecting back to the trade-off between short-term priorities and longer-term innovation investments, how do you approach that trade-off as a leader with your own teams, and how do you create time for your team to experiment with new skills? Maybe walk us through in a bit more detail what these programs look like.

Ella Hilal: Yeah, that's great. There are multiple different programs. So we have something, and I love this thing. I started doing this many years ago, like seven or eight years now, and I've been using it ever since for every single team I led. It's called mini sprints. The idea is similar to the idea of a hack day, where everybody comes and builds, but you don't always need to invoke massive-scale hack days. Somebody on the team has an idea that we believe in. Let's say, "You know what? I can make this 20% better. I just need a couple of days." Amazing. We can invoke a mini sprint. That person now invokes the mini sprint. And it's not that they will do it by themselves. You can collect people from different groups and say, "You four people, there's this vision. Go run a mini sprint, experiment with it, and come back."

So the investment is small. The investment is two to three days. Sometimes I take it all the way up to a week, but usually it's like a spike. The value of it is that it's cross-team; it doesn't have to be a specific team. It creates a strong bond between the different groups that are working on it, but it also creates this space for quick innovation and experimentation to prove a concept, similar to the idea of spikes. But instead of being pre-planned within the same group, it's across groups, and it is invoked by either an important business need or a question. So that allows a lot of experimenting fast, failing fast, and failing forward, right? In the worst-case scenario here, this team, these four or five people, build a bond, and we usually diversify the people.

So this way we continue building bonds and connections between the different teams. That's the worst-case scenario. In the best-case scenario, we learn something very useful, whether it's positive learning or negative learning, meaning learning about stuff that worked or stuff that didn't work. So that's a great way to do it, where it fosters experimentation and innovation.

But we also have a very specific cycle for what we call the vault projects, which is proposal, prototype, and then we go into the build. The build is where we are building for the long term, building robust, reliable engineering systems. The prototype phase is a normal cycle, a normal sprint or two, but in prototyping, you're standing something up fast to unlock the business. So what I shared with you is two techniques for experimentation, as well as a differentiation between fast experiments and long-term builds. Why am I saying that? Because having names for both, having phases for both, and intentionally calling them both out allows us to focus on the trade-offs that we make.

The problem is when you're building something fast, putting it aside, and forgetting that it's fast and hacky. This is where technical debt arises. To solve for that, you need to have words and names for it, and you need to have intentionality. You need to differentiate between the quality of the output of these two phases. Accordingly, if you have an output from a prototype, the expectation is that it's in beta if you're lucky, if it's not an alpha, whereas the output of a productionized cycle or build cycle is a fully productionized system. So it's more robust, more reliable.

So by having this intentionality when you're building your roadmap, clearly calling out what phase this is in, you create the space to ship quickly to unlock the business, but also to plan and iterate for the longer term.

Maybe one thing that I would like to touch on here, because I know that a lot of data scientists suffer from this, is ad hoc questions that tend to eat most of people's time. I think there is a big opportunity missed when we take the ad hoc questions. We hate them, and that's okay, I know they're disruptive, but then we just walk away.

But the truth is the ad hoc question came because there is a system that is missing or a system that is broken. If we pause and reflect, maybe do an RCA, a root cause analysis, sit with the group and ask, "Why do you think we're getting these? What is missing?" you might find specific reporting that is missing. You might find specific tooling that is missing. And accordingly, you can move these fast types of questions into system building with an objective to reduce them. If you do this effectively, well, I have had cases where we were very successful in reducing ad hoc questions by 70 or 80%.

Adel Nehme: Wow. That's really awesome. And I wanna unpack a lot of these different initiatives and programs you have set in place, maybe starting off with the mini sprints. How do you ensure, as a leader, that the time the team spends on mini sprints is well spent? You mentioned the worst-case scenario is that people bond, but how do you balance between the absolute objectives that we need to land this quarter and this space for the mini sprints that we need to have within the quarter? What's the barometer that you use?

Ella Hilal: I love that question. And that's also part of the mini sprint deck that I have, which is: whenever you have an idea, it's not random. You don't just have an idea and go build it. You need to bubble it up and bounce it off your leads. And if that's the case, it's an investment that the leads make, because you never run it by yourself. You run it with a couple more people. So it's done with intentionality, and because we're making it visible, it's not side-of-the-desk work. We get to know about the beginning of the mini sprint because it's announced, like, "Hey, we're starting a mini sprint," da da da da. And then at the end of it, people send a summary of the mini sprint. So accordingly, it creates a sense of ownership and accountability. People are not just randomly doing this because it's not seen. Because it's visible, people wanna do good work. And because it's communicated, people are intentional about whether it's worth it or not.

Dealing with Ad-hoc Requests

Adel Nehme: And maybe touching upon the last element of your answer, system building and ad hoc requests. I know this is something data scientists really hate. You mentioned here that ad hoc requests create that connection to understand what systems and tools we need to provide. Walk us through maybe how self-service analytics can solve a lot of these ad hoc request problems, and walk us through in more detail some examples of how you were able to bring down ad hoc requests by 70%. Cuz I know that there are a lot of data leaders listening to the show who want to learn that secret.

Ella Hilal: Totally, happy to. So the fact is ad hoc questions are not coming without a real business need. And if they are, we should actually say no. "No, thank you. We have other, more important stuff to do." But if they're coming in for a business need, let's look at what is recurring and what we can see. So for example, one thing that came up, and the Plus data team at the time was very annoyed with it, is the fact that every time we were doing some email marketing back in the day, we needed to get a list of emails. And this is PII, so it needs to go through data, and we need to do many cross-checks to make sure that we're respecting people who are opting in and opting out, and da, da, da.

And at the time, because the system was fragmented between Shopify and the Plus merchants, da, da, da, da, we had to do many, many steps manually. So this is a problem that takes a good two to three hours every time a question like this comes in. These requests also used to come in when they had already built the whole campaign, and now they need it in the next 24 hours. A "give it to me right now" type of thing. So if you look at this, this is definitely a candidate for systemization. First of all, requests need to have X business days of turnaround, unless there are exceptions. Number two, a lot of the pieces in the system were manual validations and stuff like that, which can all be automated.

So by doing this and creating the right reporting with the right alerts and the right checks, we just built a system, and now it is not as dreadful and doesn't need as much involvement from a data scientist every time we're sending an email to our merchants at massive scale. So that's a simple example. You can always be complaining, like, "Oh my goodness, these questions keep coming," but just see the pattern. Each of them doesn't come in as the same data pull. It comes like, "Oh, we're doing this campaign and we need data support. Oh, we're doing this new campaign, we need data support." Same thing with "What's happening in our funnel?" Very simple question. You can, again, answer the question every time with a very, very complex answer. But if you do it enough times, you get to see that 70% of the answer is actually systematic charts that you're looking for. So you can then go build a reporting suite.

And I use the word suite. I don't say reporting dashboard. The reason I'm saying suite is because you need to think about what type of dashboards you are building and how they interact with each other. If you think about reporting as a data product, that will set you up for success. The reason I'm saying this is because when you think about it as a product, you think about user experience, you think about navigation, you think about the uptime. You think about a lot of things. And actually, a big part of the reason dashboards get abandoned in the dark hole of dashboards is because we don't think about these things.

We create a lot of one-off dashboards because it's easy, but we don't create navigation between them. We don't make sure that they answer cohesive, comprehensive questions. Each of them just answers a random piece of the puzzle. But how do we navigate this? Now we need to pull in a data scientist to do it.

And the data scientist hates doing that, cuz it's dreadful work, not cool work. So if you step back and think about it from the data product side, it now becomes a data product with all the user experience that comes with it. It's up and running, it's easy to navigate, and it works a lot better. So again, this is how I solved for a lot of these things. I stepped back and looked at the ad hoc questions coming in. And every time we see a good collection, we try to systemize by solving for the underlying root cause.
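As a rough sketch of what systemizing the email-list request might look like, the snippet below automates two of the checks mentioned in this answer: a minimum turnaround in business days and respect for opt-outs. It is a hypothetical illustration only. The three-day policy value, the contact fields, and the function names are assumptions, not a description of the system Shopify built.

```python
from datetime import date, timedelta

MIN_TURNAROUND_BUSINESS_DAYS = 3  # assumed policy value, standing in for "X business days"


def business_days_between(start, end):
    """Count weekdays after `start`, up to and including `end`."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def build_recipient_list(contacts, requested_on, send_on):
    """Return deduplicated, opted-in emails, or reject a last-minute request."""
    if business_days_between(requested_on, send_on) < MIN_TURNAROUND_BUSINESS_DAYS:
        raise ValueError("request does not meet the minimum turnaround")
    seen = set()
    recipients = []
    for contact in contacts:
        email = contact["email"].strip().lower()
        if contact["opted_in"] and email not in seen:
            seen.add(email)
            recipients.append(email)
    return recipients


# Hypothetical contact records; real PII would stay inside a governed system.
contacts = [
    {"email": "a@example.com", "opted_in": True},
    {"email": "A@example.com ", "opted_in": True},   # duplicate after normalization
    {"email": "b@example.com", "opted_in": False},   # opted out, must be excluded
    {"email": "c@example.com", "opted_in": True},
]
recipients = build_recipient_list(contacts, date(2022, 8, 1), date(2022, 8, 4))
print(recipients)
```

Once checks like these run automatically, the recurring request no longer needs two to three hours of a data scientist's time, and the turnaround rule is enforced by the system instead of by negotiation.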

Adel Nehme: That's really great. And the keyword here that you mentioned is product, right? Data product. And I think that when you develop a dashboard reporting suite, as you mentioned, having that attention to user experience and how your dashboard is gonna get consumed is something that I think a lot of data scientists miss, because it is, at the end of the day, a digital product that people will consume. It needs to have the same type of experiences or expectations that people have out of digital products.

Ella Hilal: I do agree with that. Again, the whole idea is to think about your own experience with data. Like if you're a data scientist using, I don't know, Google Analytics, or you're using your analytics on your Twitter or any of the tools that you use, what do you wanna see and what makes sense to you? And if you start seeing the themes of experiences that you enjoy and start bringing these into the dashboards you build and into the tooling that you build, it becomes, again, easier to adopt and more enjoyable to use for business stakeholders, and accordingly less pull on your a.

Role of Data Literacy

Adel Nehme: So we definitely talked about how creating these systems for the wider organization helps out in, one, reducing the workload for data teams, but also helps out in accelerating data-driven decisions and improving business outcomes across the organization, right? And it automates a lot of different tasks. How much do data culture and organizational data literacy for non-technical stakeholders play a role in creating consumers for the data team's output?

Ella Hilal: That's a great question. And I'll tell you, it makes a huge difference. However, in most organizations, when you start, the group and its interactions are similar to any relationship, right? You don't start with everybody knowing exactly how to work with each other perfectly. Even if they're coming from a data-driven previous role or organization, or what have you, it doesn't mean that it's just gonna click.

So having high intentionality and showing value repeatedly tends to elevate the data understanding. We do have at Shopify many courses for non-data scientists to level up on data science. So, like, how do you understand charts? Or how do you write SQL, if you're interested, or any of that.

But I think the key, the real key that makes a pivotal change, is having the right level of conversation and showing value. If you're talking with complicated equations, you lose people. If you're talking with a language, and that goes back to the business acumen piece, where you go back to talk about the business problems, which is a common shared language regardless of the craft, people tend to listen more and to understand more. And it's on us as experts in our domains to be able to play this translator role where we talk from a business perspective. It doesn't mean that we take it down or don't talk fancy, but it means that we talk about what really matters, which is the business and the impact on the customers, the consumers. I don't think talking with very high precision when it comes to the data science craft serves us better when nobody understands. I think being understood is more important than being precise when you're talking.

Like, if you're talking about your F1 score and your sensitivity and your precision and your false positives, all of these things, we all use them in day-to-day life when we're talking to each other. But if you talk to a business stakeholder and you're talking about all of this, and all of that just doesn't register at all in their head, then you are both on the losing side of this conversation. But if you simplify it to what really matters, and they are able to action your learnings because they understood it, you're both on the winning side of this. So it's really important to keep that in mind.

Adel Nehme: I completely agree with the last point. And I think it's extremely detrimental for data teams, because if this happens in front of an executive, what you're gonna have is a loss of executive trust in the data team and less investment in the data team's longer-term output and work product.

Ella Hilal: A hundred percent. So I will tell you something funny. I actually did see that. So for example, a data scientist runs an experiment, and the experiment is set up as an A/B test, but of course, anything that is set up has some form of caveats. So the data scientist comes in and is sharing the insight with SLT. And this is a true story; I'm just abstracting.

The data scientist wants to be so precise in the words that they're using. So they went in, and the experiment had a positive impact. Their intention in getting into this meeting was to advocate for rolling this experiment out to everybody. And they went in and, to be precise and trying to be unbiased, they did so much listing of the caveats that what happened is people in this meeting just assumed that this experiment was useless, and they all distrusted it.

Although it was rigorous, it was done right. There was proper significance. Everything was right. It's just that, again, this data scientist got in their head so much, and they talked so much with the data science language, that what happened was the opposite outcome of their intention when they went into this meeting.

Adel Nehme: That's a great story. And they probably would've been better off if they'd said, you know, hey, I ran this experiment. This is what we should do. This is what you can do. This is the exp, like the expected. And if you wanna read the appendix, here's the appendix.

Ella Hilal: Exactly. Or even if you wanna state a caveat, it's fine, but don't list everything that...

Adel Nehme: Yeah.

Ella Hilal: ...might have happened in the whole wide world, just in case. Like, it doesn't work.

Interesting Use Cases in Data Science

Adel Nehme: That's a great example. Now, Ella, as we reach the end of our chat, I'd definitely be remiss not to talk about some of the data science use cases that you've worked on at Shopify. So what were some of the highest impact data science solutions that you've developed that you can publicly share?

Ella Hilal: Of course. Wow. There are a lot of cool ones. So I will tell you, definitely, we talked a lot about Shopify Capital, which is offering loans for merchants to scale their businesses, which is amazing. It's a very data-driven product, and it definitely does have a massive impact on merchants and their lives. We also have Shopify Balance.

We do have our product classification, as well as what we call Audiences, which is enabling merchants to market better, which is about the return on investment on ad spend for merchants so that they can actually scale. That is pretty, pretty darn cool, because think about ROAS tools. Organizations that build ROAS are usually either very data driven, so they already have large data teams, or they use third-party tools to help with that. This is actually part of the Shopify offering, which is pretty cool. Some of the ones that I'm personally very excited about and invested in are internal, like our own forecasting family of algorithms.

Within the economic environment that's happening now, forecasting G or forecasting merchant count or any of that is a pretty hard problem. So this is pretty cool. The other one is best next action, which is the recommendation engine I was telling you about, for when Shopify merchants start. Starting a business is not easy. There is a higher probability of failure because entrepreneurship is hard, and Shopify aims to make it as simple as possible and remove as many barriers as possible. And because of that, we do have this recommendation engine, best next action, which becomes your partner in your early journey to get you to a successful start on Shopify, as well as in entrepreneurship in general. So there is a lot to be excited about and to be proud of.

Adel Nehme: I love these use cases. And what I love the most about them is that, of course, there is a lot of value for Shopify that is generated from these use cases, but they really also provide a lot of value for would-be entrepreneurs who potentially would not have become entrepreneurs without them. And that's amazing to see. So connecting back to the theme of the episode, kind of a final question from my side: what were some of the biggest learnings for you from working on these projects?

Ella Hilal: Yeah, that's a great question. So reflecting on it, I would say number one is, as I shared earlier, start simple, because when you start simple, you create a baseline and you understand what's possible with the lowest friction points. So even with something like best next action, we didn't start with the fanciest algorithm that we currently have. We just started with, okay, what about we just organize this list? Like, we're gonna do analysis and force organize them. And then maybe we start stacking them automatically. And then maybe we feed machine learning in, and then we iterate on it. So starting simple made us understand the impact.

We experimented throughout, so that we learned the value as we iterated, making sure that we gut-checked our hypotheses. So number one, start simple. Number two, experiment to learn, and iterate so as not to fall into confirmation bias, right? To make sure that you are really gut checking. Last but not least, creating a space for experimentation and mini sprints actually tends to surprise me every single time. I'm a big advocate of it. A lot of our cool internal solutions started as a mini sprint that then stood up to become a fully productionized product after. So this was very helpful, and I would definitely encourage us to continue doing that, and others too.

Adel Nehme: That's really great. And maybe, you know, on a personal note as well, what were some of the biggest learnings for you from going from an individual contributor to someone who manages data teams? Because that's a jump as well. It's not talked about a lot in data science, with its challenges and the different ballpark that you are in as a data leader.

Ella Hilal: I'll tell you honestly, every day is a learning, but back then when I did this transition, and I did it many years ago, I think the hardest thing, and I still see people who are moving from an individual contributor into a leader struggle with it, is knowing to trust, to let go, and to create this space for others to learn and fall forward.

Sometimes as an individual contributor, especially when you're at the top of your craft, and this is why you got promoted into manager, you think, oh, I can do it in 15 minutes. Yes, maybe you can do it in 15 minutes, and that other person might end up doing it in two hours, which is like eight times how long you take.

But if you let them do it in two hours today, then tomorrow they will do it in one hour. And then the day after they will do it in half an hour. And then you've scaled yourself. As a manager, don't forget that your job is to work through others and lift up those around you, because the best managers are not the smartest people at the table.

The best managers are the ones who have very strong people around them, where everybody on the team lifts each other up. So that's a key reminder. It's not just hiring great people and getting out of their way. And I know this is a very popular quote from Steve Jobs. It's hiring great people and giving them a space to learn, where they level you up and you level them up.

So it's an environment of shared learning and I always call it collaborative intelligence cuz you get together smart.

Call To Action

Adel Nehme: That's such an awesome ending. Now, finally, do you have any final call to action before we wrap up today's episode?

Ella Hilal: All I can say is, maybe my final call to action is that data science is a great field and there is a lot that we can still do to shape it. So have fun. Don't get stuck on a tool or method. Focus on the business problems. This is our superpower. We are problem solvers; data scientists are problem solvers.

So focus on that. And I think a lot of good would come after.

Adel Nehme: Thank you so much, Ella, for coming on DataFramed.

Ella Hilal: Thank you. I was excited to be here and I'm happy to have the conversation. Thank you so much for having me.
