Introducing Elettra Damaggio
Adel Nehme: Hello everyone. This is Adel, Data Science Evangelist and Educator at DataCamp. As data science becomes more and more integral to the success of organizations, now more than ever organizations of all sorts and sizes are building data science functions to make the most of the data that they generate. However, given all the DataFramed episodes we've covered thus far this year, it is definitely no easy feat to launch a data science function from scratch. So I am excited to have Elettra Damaggio on today's podcast.
Elettra is the Director of Data Science at StoneX. She has been deeply embedded in the data and digital transformation space in financial services and played a crucial role in launching the data science function at StoneX.
Throughout the episode, we talk about the main challenges associated with launching a data science function, how data leaders can prioritize their roadmap between low-hanging fruit and long-term vision, how to earn the trust of stakeholders within the organization as a data leader, use cases she's worked on, advice she has for aspiring practitioners, and much more.
If you enjoyed this episode, make sure to rate, subscribe, and comment, but only if you liked it. Now let's dive right in. Elettra, it's great to have you on the show.
Elettra Damaggio: Thanks guys for having me.
Adel Nehme: I am excited to talk to you about your work leading data science at StoneX, best practices for launching a data science function from scratch, how to manage short-term objectives and long-term priorities, and more. But before that, can you give us a bit of background about yourself?
Elettra Damaggio: Yeah, sure. So I started studying computer science a long time ago. I got my Bachelor of Science and Master of Science in Computer Science. During my Master's, I majored in AI and databases, and I graduated in 2009. So yeah, a long time ago. And at that time, I have to say, data science wasn't yet a thing.
Although I went through all the neural network, vision, and NLP types of projects that you might imagine. So I started to work in consultancy and then after a while I got bored of that so I wanted a fellowship in Paris and got my MBA. It was really interesting to get a business educational background as well.
It was actually very useful for me to learn a lot about how a company works and what's behind the product or the service that a company actually offers. After that, I went back to Italy to work in Gartner Consulting. So again, in consultancy; it was a little bit of a curse on me at the time. But then I finally moved to BNP Paribas, so on the client side, as they used to say in consultancy, in a financial institution, retail mostly. So retail banking first, BNP in digital transformation, then HSBC. And then finally I moved into GAIN, a.k.a. StoneX: it was acquired in 2020 and then rebranded as StoneX. And I actually transitioned from a more retail banking type of service to trading services. You might or might not know that StoneX owns two brands in the UK and worldwide, FOREX.com and City Index, which provide trading services to people.
The Key Ingredients for Launching a Data Science Function
Adel Nehme: That's really great. And I wanna set the stage for today's conversation. You lead the data science team at StoneX, and you led the launch of the data science function there as well.
It's always interesting to talk to someone who played a key role in launching a data team or practice within an organization, because I think there are many growing pain stories that are often missed by practitioners and data scientists who join a relatively mature data team. So what are the key ingredients of launching a data science team or function within an organization?
Elettra Damaggio: So when I started at StoneX, which at the time was still GAIN, it was 2019. I started as a principal analyst, and the hope, for my boss, was for me to start the conversation about using data science and machine learning within a company that wasn't really using anything of this type. I started with two analysts under me, and now I have nine.
So it was quite a journey. I have to say I was very happy to see that the organization was ready to use the data in a certain way. And this is one of the key points: data needs to be ready to consume. If you don't have that, you definitely cannot start data science very quickly, because, first of all, you need good data.
And I was lucky enough that the people in the other teams, in the enterprise data system teams, spent a lot of time and effort to set up a good data set, a good backend. From a data perspective that helped very much; this is definitely key. And I understand that sometimes in huge organizations you have the so-called data swamp issue, where a lot of people just dump their data in the cloud and then say, "Okay, now do something with that." That is really one of the biggest pain points of a data science practice. So that said, when you start a data science practice, the first thing that you need to understand is what your scope of action can be. And your scope of action is directly linked to the quality of data that you have within that scope.
So, in my case, I think the secret ingredient of the recipe was to understand, okay, where can I bring value based on what is ready to be used? You cannot do top-down, because if you do top-down and you say, "Oh, you know what, we should have a machine learning algorithm that does X, Y, Z," and then you fold this requirement into the tech stack, you understand when you get to the tech stack that, "Wow, to do this, we would basically need to work on all these data sets." And if all these data sets are in a huge mess, you will just spend months and months, if not years, fixing things. So in this case, you need to be smart and understand, "Okay, where can I drive the most value with what I have?" It's like you open your fridge and you have, I don't know, eggs, and maybe an avocado, and you ask, "Okay, what can I do with this?" instead of taking the recipe book and saying, "You know what? It would be great to do a carrot cake," and then realizing you don't have anything to make a carrot cake. That's basically the same thing. You just start with what you have. And I know it maybe doesn't sound very fancy from a data perspective, but a lot of the time, what brings value right away from a business perspective is good data, linked data, or an integrated data view.
Technical Challenges of Building A Data Science Function
Adel Nehme: That's really awesome. And what I really would like to ascertain from some of your answers here is the challenges related to launching a new data science function. You definitely mentioned the technical challenge of data quality. What are other categories of challenges associated with building a new data science function from scratch?
Elettra Damaggio: Definitely talent recruitment. And also, I would say, understanding the tech stack that you wanna work on. The way we did that was to incrementally find the use cases that we knew, or were fairly sure, would provide value to the business, try to deliver a pilot of those, and then get more money from the company, more investment. It wasn't, "That's all the money. Go ahead!" We had to earn all our tiny steps, and we were fine with that, because a big bang approach might not be the best: the point about machine learning, I believe, is that it's very much experiment-driven.
You need to understand everything that you can work with. You need to run all of your experiments. And the more you learn, the more you understand how many people you need and what type of tech you need. Maybe someone knows that up front, but personally, if you just join a company, you don't know anything yet about the status of the data, the status of the organization, the status of the business per se. I mean, in my case, even though I was in a financial institution, I was coming from retail banking. So trading was a new thing for me, and I had to learn a new type of service. So if it's a new type of service, a new industry, or maybe not an industry but a new area of an industry, you need to get an understanding of that as well.
So until you have data, organization, and business understanding really deep inside your head, it's very hard to say, "I need these people, I need this tech, I need this capacity." So the way you need to do that, in my opinion, is to learn and readjust. It's a lean start-up type of thinking. You start with a pilot, with your MVP, and then you work on it, evolve it, add on top, and check whether you're still on the right track, whether you're doing something that is useful for the business or not. You constantly readjust, add on top, or fine-tune. So this is definitely the way I'm doing it, and the way I would suggest someone else do it.
And the challenge in doing this is definitely finding the right people, not just in your team. It's also really key for a data science team to have a very good dev team and architecture team that can support you by suggesting the right tools for your needs, the right architecture, everything that you need, for example, to process a stream of data. There are so many aspects to delivering a data science product that it is really hard for one person to know everything about everything. So you need to make sure you have good people advising you on all the steps that you are not an expert in.
How to Build Trust as a Data Leader?
Adel Nehme: That's really great. And let's harp on that organizational challenge, including building out your own team. Can you walk me through in more detail how you earn trust as a new data leader within an organization when working with different stakeholders, such as the dev team and the business stakeholders? That's the first set of questions. The second set of questions is, how do you build out a team knowing it's still at an early juncture, and you wanna be relatively disciplined in the type and number of resources you add to a new team while still adding value, but you also wanna make sure that you have the best hires? So what is the type of profile you look for in an early data team?
Elettra Damaggio: As I said, the success stories that you can drive in your first 6-12 months are key for you to build trust. If you can deliver a success story, let's say, within your first year in the business, it could be something that makes people say, "Oh! You know what? Who's that person that delivered that?" And they can associate you with a certain type of deliverable. So start to build this type of trust by actually delivering and, as they say, leading by competence. Make sure that everyone associates your name with something that works. That is definitely step one, and then you start from that. And I would say if you can secure that, it will be all good. Everything will go a lot smoother compared to just barging in and saying, "Oh, we should do this, we should do that," and so forth.
And on the second point, what type of people should you be hiring in your early data team? The tech stack was very simple at the beginning, like very, very simple, because we were building the practice, let's say, and that is also related to the iterative approach. If you start with a very complex tech stack, you know, a full tech stack with your cloud, your machine learning ops platform, your data engineering, ETL, and all the works, so you have GCP or AWS or Azure cloud, and you have your ML on top, of course you need people that are skilled in all this tech to deliver something. So you will automatically need more people, because you cannot have someone that knows everything about all this tech.
If you start with an easier tech stack, right? We started with Python, having a server that was running our Python scripts to test in London. But, let's say, we partnered with another dev team to deliver some models in production. So we didn't do the delivery in production ourselves; we handed that over to other dev teams that had another tech stack. With that in mind, the type of people I hired in the first place were, I would say, data scientists that had a little bit of coding experience, or if not coding experience, just coding appetite. So they didn't mind setting up Python scripts that were just getting data from APIs, scraping websites, or whatever, to get the data they needed to develop their machine learning models, or just to test and experiment with the machine learning models that we had in mind.
And once we developed a couple of success stories, two I would say, we finally started to have our own development platform. We have been completely included in the DevOps process. When I started, the analytics team wasn't considered part of DevOps. It was an old-school Excel BI type of team, and most of the time it was just reporting. But of course, there was an appetite to evolve that. So we started with that, and then in 2020 they said, "You know what, guys, you are developing software. It's good that you are included in our DevOps." So we started to be included in the DevOps.
So we had some training. I already knew a little bit of Git and Bitbucket or GitLab or whatever; we switched repositories in between. But the other guys were the type of guys that were so eager to learn. They definitely need to have a solid foundation from a statistical and mathematical perspective, but they also need to have, I would say, that appetite to develop things, to not just analyze things but to really develop something that is a product. So it's more a type of attitude that you need to pair with a strong quantitative background. That was the type of person that I was hiring at the beginning.
Evolution of Hiring Practices for Data Scientists
Adel Nehme: That's really great. And how have your hiring practices or what you look for evolved as the team grew and it became more established and provided ROI?
Elettra Damaggio: So now that the team is a little bit more established, the way I set up my team is that I have guys that are more focused on the data engineering and machine learning side, actually engineering things, as we are finally setting up our MLOps tech stack.
So I don't know if this is a very mean differentiation, but the way I see this is that there are people that are driven to write what someone might call production code, and there are other people that are more driven to analyze, experiment, and see things. The way I see the data scientist at the moment in my team is very much as an R&D function.
So it's a person that needs to have business acumen, so needs to know about the business or be able to understand the business, with a strong commercial, organizational, and business understanding. And of course they have that statistical and machine learning knowledge, so they can join the dots and say, "Oh, you know what? I can use this data to solve this problem." But once the data scientist has molded the infinite space of solutions and caged it into a little bit more manageable space, that thing is passed on to the machine learning engineering and engineering function, which will industrialize it and set up the pipelines and everything that needs to be done in order to operationalize it and make of that mold a product that is reliable, sustainable, and reusable within the business.
On top of these two groups of people that I have in my team, I also have a BA who supports me, which I think is really useful, because a BA is that type of person that, first of all, has a constant relationship with the different stakeholders that are customers of the products, and can gather requirements and have a conversation with the data scientist or the machine learning engineer to say, "You know what, maybe we should do something to either change an existing product in this way, or maybe design something new that would solve this type of issue." And the BA is also the person that really helps you embed the product within the business: training business stakeholders, talking with them, maybe guiding them at the beginning on how to use and interpret the data and how to interpret the model workings. Because one of the things about developing a machine learning model is that it's very hard to explain it to non-data people.
So you need to have that person that has that constant relationship with them, so he or she can wrap that up in a way that is understandable, and so that you can have sponsors outside your team. That's key. You always need to have sponsors outside your team.
Maneuvering obstacles like “Lack of Data Culture”
Adel Nehme: I love that answer. And I love how you create the delineation within the data science team between a more research and development type, a small mini data team that hands over its outputs, and a more applied engineering team that industrializes the work a lot of data scientists do. But harping on that last note here, when it comes to the business analyst role, creating relationships with other stakeholders, and gathering requirements and feedback: oftentimes when talking to data leaders, a big obstacle they face when it comes to providing value with data science and analytics is data culture, or analytics mindset, or the lack thereof within the organization.
I'd love to understand from you, how you approach conversations with the remainder of the stakeholders within the organization that may or may not have a mindset or a data culture, or understand the value of data science and how you were able to maneuver these obstacles, whether through the use of a BA or within your own team and how you approach these conversations?
Elettra Damaggio: So, first of all, this has nothing to do with your data skills. Just put this as a disclaimer on top: this is all about your, I would say, political skills or relationship skills. As I said, it's key for you to start understanding where you can find your sponsors. So, first of all, you need to have a conversation. For example, in our case, our company is organized by commercial leaders, and we have global teams as well. And of course you have commercial leaders of the biggest regions and commercial leaders of maybe smaller regions. You need to gather an understanding of who has the most driving role within the community of executives. And I'm sure that if there is a data science team in the company, you will be able to find your sponsors from day one, the ones that are really keen to get involved in that. It might be easier or harder in some cases. So, first thing, try to understand who your easiest sponsors are, the ones that may be the keenest on sponsoring you. They might say, "I'm interested in data, I like data science," but still be on the lookout because you haven't delivered anything yet. So try to understand what their key requirements are. And I remember when I started, I was like, this is a little bit of a Jedi trick:
"You don't want that. You want this." So when you have a conversation with them and you know what you can deliver, you need to sell, in a clever way, something that is useful for them but that you can deliver in a reasonable amount of time. So you try to drive them to that type of solution. And these are your personal negotiating skills.
Once you have secured your big sponsor with that, you just work them one by one. And I know you might say, "Okay, but what happens when I deliver the model and I have to explain it to them? This is not related, right?" It's actually very related, because if they're already sponsoring you, the day you go to them to explain, they will have a different attitude listening to you. So you will have your chance to explain it to them. And I would say don't be condescending; never be the lecturer there. Always try something like, "You know what, I delivered this because the main goal was to provide this additional benefit for you. I'm using this. Do you want me to go through the details of the model? I can." Most of the time, I have to say, they're interested in knowing the performance. So whatever type of performance metrics you wanna use, keep for the business stakeholders the ones that are most understandable. All the performance KPIs that you use to understand if the model is sustainable, if the model is robust, maybe just save those for the analysts.
But at the end of the day, commercial stakeholders want to know how often this works and, if it doesn't work, what the risk is. So, for example, we had a churn prediction model that we started to share. It was our first XGBoost, random forest, actual real machine learning type of model. And we went through that, just explaining the features, and we explained the confusion matrix to the commercial leaders. And that was already too much, because it was a new thing for them. So the way we talked to them about it was: on average, the model predicts correctly 90% of the time. As for the mistakes, we worked in a way that we slightly over-predict churn.
This is why we don't have higher performance: at the end of the day, it doesn't cost us a lot to send another email or call another person that is at risk of churn. It might cost more to lose someone that we're not calling. Wrapped up in this way, it was very understandable for them, and they were really happy with that. It required multiple explanations, going through it multiple times. But after that, you build trust, and it gets easier and easier because they start trusting you. And they say, "Okay, you know, I don't have a full understanding, but if you say it's working, it's fine. We'll see, we'll review it after a couple of months that we have this running." So this is the type of, I would say, massage that you have to do at the beginning, and you need to be patient and not rush or be aggressive. Definitely not aggressive.
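The trade-off described above (accepting more false alarms because a missed churner costs more than an extra email or call) can be sketched as a cost-based choice of decision threshold. This is only a minimal illustration, not the model discussed in the episode: the scores, labels, candidate thresholds, and both cost figures are made-up assumptions, and the helper names are hypothetical.

```python
# Hypothetical illustration: choose a churn decision threshold by total cost.
# Every number here is invented for the example, not taken from the episode.


def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, fn, tn), predicting churn when score >= threshold."""
    tp = fp = fn = tn = 0
    for score, churned in zip(scores, labels):
        predicted = score >= threshold
        if predicted and churned:
            tp += 1
        elif predicted and not churned:
            fp += 1  # contacted a customer who would have stayed anyway
        elif not predicted and churned:
            fn += 1  # missed a customer who actually churned
        else:
            tn += 1
    return tp, fp, fn, tn


def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Cost of the mistakes made at a given threshold."""
    _, fp, fn, _ = confusion_counts(scores, labels, threshold)
    return fp * cost_fp + fn * cost_fn


def best_threshold(scores, labels, candidates, cost_fp, cost_fn):
    """Pick the candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )


def accuracy(scores, labels, threshold):
    tp, fp, fn, tn = confusion_counts(scores, labels, threshold)
    return (tp + tn) / (tp + fp + fn + tn)


# Made-up model scores and true outcomes (1 = churned) for ten customers.
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
candidates = [i / 10 for i in range(1, 10)]

# Assumed costs: an unnecessary email is cheap, a lost customer is expensive.
cautious = best_threshold(scores, labels, candidates, cost_fp=1.0, cost_fn=20.0)
# For comparison: treating both kinds of mistake as equally costly.
balanced = best_threshold(scores, labels, candidates, cost_fp=1.0, cost_fn=1.0)

print("cost-aware threshold:", cautious, "accuracy:", accuracy(scores, labels, cautious))
print("equal-cost threshold:", balanced, "accuracy:", accuracy(scores, labels, balanced))
```

With these invented numbers, the cost-aware threshold is lower than the equal-cost one: it flags more customers and scores a lower raw accuracy, but it misses no churners, which is the kind of deliberate over-prediction described in the answer.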
Aligning the North Star with Short term Goals
Adel Nehme: That's really awesome. And I think at the crux of a lot of the different answers that you've given so far, and a key central tenet when it comes to succeeding in launching a data team, is managing the short-term priorities and the short-term wins that you can get, as well as making sure that you're working towards a long-term vision. So there's always a north star for where we wanna be in the long term, and quarterly OKRs that guide the short-term objectives for a data team.
Can you walk me through the process of prioritization between these two objectives?
Elettra Damaggio: I have to say, it's not something that you do alone, especially if you're joining a new business. The first thing that you wanna do is have a talk with the people that have been in the business a long time, so they can share with you: "I've been 20 years in the business, or 15 years in the business, and I think one of the things that would really disrupt us would be a way to predict that, to understand that." And then it's, "Okay, wow." And never underestimate the fact that if the guy has been there for 15 years and they didn't manage to do that, it doesn't mean that because you're a data scientist you're gonna do it in one year just because you have machine learning or whatever. It's probably harder than that.
So you gather all of these thoughts and you say, "Okay, you know what, let's define a roadmap to go there." For example, one of the things that we gathered from our, I would say, key internal stakeholders is applications that we can apply to the online stream of trades and transactions. And of course, being able to apply a machine learning model to an online stream of data is something that requires a tech stack that we're building towards. But if we had started to do that from day one, we wouldn't have delivered anything valuable. It would just have been a cost for the business, and we would probably still be working on it after three years, because it requires time to do that. So that's your top-down checklist, if you wish, and it allows you to understand what the roadmap is. So what do we have now? Okay, now I have my desktop, a SQL data warehouse, and Excel, because that's how we started. And where do I need to go? Machine learning on an online stream. What do I need to set this up? And you can work that out yourself.
But I would always advise talking with other people as well, on the architecture side, and gathering their view, because I'm sure other people will have thought about it too. And you start defining your roadmap and milestones. We would need to have at least an orchestrator, like Airflow, to run our scripts in Python and all of these things, and we would need to have a DevOps process; that is step one. And then you go, "Okay, you know what? We will probably need a cloud-based approach to run our machine learning, not on our desktop but on cloud compute that is scalable, so we don't need to leave our laptop on overnight to train models." We will have something on the cloud to do that, and some platform to connect to different data sources, for example, I don't know, Databricks, or this type of platform over Azure Cloud. And then to actually get a stream of data, you will need something like Kafka, and then you start using PySpark and all of these things.
So you have this plan. And this is your vision planning, which is always easy from a certain perspective. You just plan and you say, "Okay, what do I need? I need all of these things." It's your grocery list. On the other side, you have the short term. And for the short term, as I said before, you need to start with what you have. So what do I have? What can I do with this? And what is the priority for the business? You get the priorities from your sponsors or commercial stakeholders, from the business. When I joined, I got two priorities. The first was, "We need to understand how much we're spending on acquisition marketing and how much we're getting from that spending, because at the moment we have no idea." So that was one priority.
And on the other side it was, "We don't know how we're targeting our customers. We need a way to segment our customers and define the journeys based on our segments." So very much acquisition focused. And I have to say, having an MBA, or whatever business course or marketing course you can have, really helped me there, because I know how a marketer would think about these things.
Defining personas, defining the user journey, defining all of these things: this is the knowledge that I got from both my MBA and my previous job in retail banking, because I used to work in the UX team as a quantitative BA. I was analyzing data and defining journeys with the user experience designers.
So I knew how important the acquisition of artifacts was for designers and for marketing in general. Thanks to that, I was able to capture it, but I have to say they were very vocal that they had these issues. I said, "Okay, well, can we do that?" And I had a look at our data warehouse, and as I said in the beginning, that was really key. I was very lucky to have a neat data warehouse, even if it was just our backend, on-premises data warehouse, with our onboarding system and customer activity. I was very lucky to have a very neat data set to start working with. Of course, there were some glitches in the process, but nothing too messy.
So, that was key to the first successes that we had. So that's how I started to prioritize more short-term goals.
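As a rough illustration of the first priority mentioned above (how much is spent on acquisition marketing versus how much comes back), the arithmetic can be sketched as a per-channel ratio of attributed revenue to spend. The channel names, the figures, and the `return_on_spend` helper are hypothetical assumptions made for this sketch; the episode does not describe how StoneX actually computed or attributed this.

```python
from collections import defaultdict

# Hypothetical records: (channel, marketing_spend, revenue_attributed).
# All values are invented for illustration only.
records = [
    ("paid_search", 1000.0, 1500.0),
    ("paid_search", 500.0, 450.0),
    ("affiliates", 800.0, 2000.0),
    ("social", 400.0, 200.0),
]


def return_on_spend(rows):
    """Aggregate spend and revenue per channel and return revenue / spend."""
    spend = defaultdict(float)
    revenue = defaultdict(float)
    for channel, cost, earned in rows:
        spend[channel] += cost
        revenue[channel] += earned
    # Skip channels with zero spend to avoid dividing by zero.
    return {ch: revenue[ch] / spend[ch] for ch in spend if spend[ch] > 0}


roas = return_on_spend(records)
for channel, ratio in sorted(roas.items(), key=lambda item: item[1], reverse=True):
    print(f"{channel}: {ratio:.2f} returned per unit spent")
```

A ratio above 1 means a channel returned more than it cost under whatever attribution rule produced the revenue column; the hard part in practice is that attribution rule and the quality of the underlying data, which is why a clean data warehouse mattered so much here.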
Framework for Early Wins for your Data Team
Adel Nehme: That's really great. And if you wanna abstract this out and propose a framework that can enable other data leaders to extract small wins, as well as low-hanging fruit that demonstrate early value for a data team, how would you go about that?
Elettra Damaggio: So the way to go about that, I would say, is to start with the data, but don't do that alone. Start with your business stakeholders, and ask them, "What are the data sets that you use in your day-to-day job, and how do you use them?" Because from how they use them, you can understand, "Oh, you know what? You could automate that. I could do something that will help you use that data in a more efficient way."
And by going this way, you're able to, first of all, understand right away the data sources involved in the process and check whether those data sources are usable, and second, you have your use case. And even if it's not the fanciest use case, you can start delivering something very quickly, because you have a workable data source to start with.
And by doing that, you start building your sponsors. And once you start building your sponsors, even if it's with tiny deliverables, you can start building things up. On the other side, I would say, based on how messy the data situation in the company is, you can start to involve other teams and raise awareness, if it's not there already (maybe the awareness is already there), and push for investment of time and resources in fixing the data, so that the data will enable you to produce something of higher value. This is the way I would do that. As I said, it's a very entrepreneurial, lean start-up type of approach: MVP first, and then you just build your way up to the top.
Value of Data Science in Trading
Adel Nehme: I couldn't agree more that bias to action and having that lean approach is super useful for a lot of data teams. Now, as we wrap up our episode, Elettra, I'd be remiss not to talk a bit about your work at StoneX, especially the data science use cases that provide value in financial services. And with the recent war in Ukraine, COVID, supply chain issues, and economic uncertainty, I think it's never been more important from a data scientist's perspective to understand the role data science plays in commodities trading, foreign exchange trading, and more. So I'd love to understand some of the ways data science has been providing value in the industry.
Elettra Damaggio: I have to say we haven't been requested to do much. In our case, the overall international situation doesn't affect us too much directly. Of course, we know there are some people that have been sanctioned, so accounts have been blocked, but StoneX didn't see a huge impact on this point. So we've been lucky. But as a trading company, we of course experience a lot of volatility in the market, and that made our business very active from a certain perspective.
In terms of the data team, in our case we haven't been involved too much, apart from making sure that what we were seeing in our system wasn't affecting other processes in the business. But in terms of doing anything, we haven't done anything, also because when you have this, I would say, delicate situation, it's left to human handling. If you automate things, you are prone to, I would say, embarrassing mistakes, and this is something that, of course, no company wants. And because we have a manageable volume of customers and a manageable volume of accounts, the data team wasn't really involved in doing anything specific.
Applications & Developments in Data Science in the Field
Adel Nehme: What are some of the main use cases you've been working on as a Data Leader at StoneX?
Elettra Damaggio: So we definitely have a lot of things related to marketing: segmentation, attribution modeling, churn prediction, and lifetime value prediction. Last year, we did our first NLP application to classify customer communications. At the moment, we're also working on client sentiment in trading. And definitely one of the things that we would like to work on, as I said before, is online streaming of data, but I don't yet have workable use cases to share. We need to build the ground to do that.
Adel Nehme: That's awesome. So as we close up our episode, I'd love to look at any future trends and innovations that you're particularly excited about?
Elettra Damaggio: At the moment, I feel that we are achieving a sort of... Data science has been a very wild type of area. There was a lot of bias, and not many companies managed to detangle the data science practice.
So at the moment, the focus that I have is to try to industrialize the approach and make the data science practice solid. So the type of tech that we're looking at, for example, is definitely MLOps and pipeline tech. In terms of pure innovation in machine learning, honestly, there's nothing purely innovative that we're looking for.
We have so much ground to recover and to work on before we do something more innovative, especially for the market. There is a lot of innovation in terms of combining multiple models. So, ensembling, for example, but also combining multiple models to dynamically select advertisements. This is something that is on our mind and we will definitely do that.
So using internal and external data to understand what are the trends, what are the things that are actually grasping people's minds at the moment, and dynamically selecting the content of your advertisement, serving them at the right time to the right person. That is definitely something that is becoming machine learning heavy, especially with all the cookie policies that are becoming more and more strict.
So this is definitely something that is in my mind. I don't know when I will be able to implement it. But this is definitely one of the things in my mind.
Call To Action
Adel Nehme: That's awesome. Finally, Elettra, as we close up, do you have any call to action before we wrap up today?
Elettra Damaggio: I would just say that it takes patience and hard work. So if you're not ready to have patience and put in, you know, your hours to get your success stories, do something else.
But it definitely gives you a lot of satisfaction in the end. It's worth your while, but it's a hard way to the top if you wanna rock and roll, as they say.
Adel Nehme: A hundred percent. Thank you so much Elettra for coming on the podcast.
Elettra Damaggio: Thank you for having me.