
Post-Deployment Data Science

Hakim Elakhrass talks about post-deployment data science, the real-world use cases for tools like NannyML, the potentially catastrophic effects of unmonitored models in production, the most important skills for modern data scientists to cultivate, and more.

Aug 2022

Guest
Hakim Elakhrass

Hakim Elakhrass is the Co-Founder and CEO of NannyML, an open-source Python library that allows data scientists to estimate post-deployment model performance, detect data drift, and link data drift alerts back to model performance changes. Originally, Hakim started a machine learning consultancy with his NannyML co-founders, and the need for monitoring quickly arose, leading to the development of NannyML.


Host
Adel Nehme

Adel is a Data Science educator, speaker, and Evangelist at DataCamp where he has released various courses and live training on data analysis, machine learning, and data engineering. He is passionate about spreading data skills and data literacy throughout organizations and the intersection of technology and society. He has an MSc in Data Science and Business Analytics. In his free time, you can find him hanging out with his cat Louis.

Key Quotes

Discrimination and bias in a model is unethical and the impact can be catastrophic to a business. Unfortunately, this can simply be that when you built your model that you didn't see bias in certain demographics because you didn't have enough of them in your data. Over time, more and more of a certain demographic enters your data that the model can't properly make good decisions for. That is extremely detrimental from a financial and business perspective, because if your model is discriminating against a certain segment, then you're obviously not doing the best for the company. Worst of all, it’s not fair to the people you're making predictions about.

Actually putting models into production is what will set you apart as a data scientist. It's an important skill that, unfortunately, not many data scientists have. They should also really have a grasp of the business impact of the model. A model is more than just its performance or technical metrics. Why are you building this model? What value does it add and how is that value changing over time? How is it impacting other departments? Obviously data scientists need technical skills, but they must also have a deep intuition about why they are doing what they are doing.

Key Takeaways

1

Whether or not you know what actually happens in the real world after the prediction, understanding model performance is still challenging from both an engineering perspective and a data science perspective.

2

Data scientists need to cultivate a thorough understanding of a model’s potential business impacts as well as the technical metrics of the model.

3

Making machine learning tools open source builds trust with users and enables a community-based approach for getting feedback.

Transcript

Adel Nehme: Hello everyone. This is Adel, data science educator and evangelist at DataCamp. Last week on the show, we had Serg Masis on the podcast to talk about interpretable machine learning. Throughout that episode, we discussed the risks plaguing machine learning models in production.

And as these risks grow, what are the tools at the disposal of practitioners to iterate on and understand model performance post-deployment? This is why I'm so excited to have Hakim Elakhrass on today's episode. Hakim is the co-founder and CEO of NannyML, an open-source Python library that allows data scientists to estimate post-deployment model performance, detect data drift, and link data drift alerts back to model performance changes.

Throughout the episode, we spoke about the challenges in post-deployment data science, how models can fail in production, some cautionary tales to avoid, why NannyML is open source, the future of AI, and much, much more. If you enjoyed this episode, make sure to rate and comment, but only if you enjoyed it. Now, on to today's episode.

Hakim, it's great to have you on the show.

Hakim Elakhrass: Yeah, thanks a lot for having me, Adel. I really appreciate it.

Adel Nehme: I'm excited to speak with you about post-deployment data science, your work leading NannyML, and so much more. But before we do, can you give us a bit of a background about yourself and what got you here?

Hakim Elakhrass: Yeah, sure. So I'm originally American, born and raised in New Jersey. My educational background is in biology, so actually quite far from data science. Originally, I was mostly working on evolutionary biology and population genetics. In the last year of my bachelor's, I did a course in bioinformatics, and that was in R, using a bunch of machine learning techniques for population genetics purposes and genetics in general.

And that really hooked me through the concept of what you can do using programming and machine learning. Originally I actually wanted to be a doctor, and my idea was using personalized medicine. That was what I was super passionate about. So I was like, oh man, using machine learning and genetics, you can get to a point where you can prescribe certain treatments on a personal level and really help people at scale, but personalized.

And that was something I was super passionate about. So then I was like, okay, before I go to medical school, I wanna go do a master's in bioinformatics. And I ended up moving to Belgium to do the master's at KU Leuven, which is funny because I know DataCamp is headquartered in Leuven. So it was nice.

It's a nice city. And basically, my motivation behind that was pretty funny: I wanted the highest-ranked, cheapest master's program that existed for bioinformatics, and you end up at KU Leuven. It's ranked around 29 in the world and it was 600 euros a year. But then I really got hooked on the data science and machine learning side.

And I abandoned my dreams of becoming a doctor and decided to just go full force into machine learning. Then I worked a bit as a data engineer and a data scientist, and I started a machine learning consultancy with my co-founders from NannyML. And eventually we just saw that every time we put models into production, we always got this question: okay, what happens next?

How can we trust the models? How do we know that they're performing well? And at the time there were a lot of smart teams working on what we call the MLOps part of it: the infrastructure behind it, how do we actually deploy a model, the serving and things like that. So we decided, okay, given our expertise, which is data science and the algorithms, not really the actual programming and being good software engineers, that's how we decided to work on NannyML. And we also obviously thought it was super important.

Challenges in Post-Deployment Data Science

Adel Nehme: So I am very excited to discuss your work leading NannyML with you, but before that, I'd love to anchor today's discussion in some of the problems you're trying to solve. So, you know, over the past year, we've had quite a few folks on the podcast discuss the importance of MLOps and the different challenges associated with deploying machine learning models at scale.

However, another key aspect of MLOps is post-deployment data science work, as in monitoring, evaluating, and testing machine learning models in production. I would also add improving them and understanding business impact here. I'd love to start off, in your own words: walk us through the main challenges in post-deployment data science.

Hakim Elakhrass: Yeah, sure. And maybe just a little side note on why we like using the term post-deployment data science instead of just machine learning monitoring. I don't wanna come off as someone randomly trying to create something new, but monitoring is a passive activity. And if you describe NannyML and what it does today, it's indeed a monitoring library, but we see the work as so much more than just monitoring. Yes, the first step of post-deployment data science is actually monitoring and knowing how your models are performing. But then we see that there's a whole host of other things that you have to do.

That's why we like the term post-deployment data science, because we feel like it is actually real data science and will be the responsibility of someone with more data science skills. And maybe to go into the challenges of post-deployment data science: I think the first and foremost is knowing the model performance in the first place.

It's not so trivial, whether you have ground truth or not. So whether you know what actually happens in the real world after the prediction, which is ground truth, or you don't have it, knowing your model performance is still pretty challenging from an engineering perspective and from a data science perspective.

And so just having that kind of visibility is already pretty hard. Maybe to give an example: when you have ground truth, it's more of an engineering problem, so when can we get our ground truth and compare it to what our model actually predicted? But when you don't, it's a data science and algorithm problem. So for instance, credit scoring,

where you have a machine learning model that decides if someone should get a loan or not, and the model predicts yes or no, this person should get a loan. And then when do you know whether that prediction was correct or not? Either the person pays back the loan or they don't pay back the loan, but either way it's very far in the future.

So you cannot calculate the performance of your model in the traditional sense. And then the second challenge I would say is that models fail silently. That's, you know, one of our taglines. Basically the problem with machine learning models is that if you give a model data in the right format, it'll make a prediction.

Whether that prediction is right or wrong, that's not the model's problem. Software, for the most part, can fail loudly: you have a bug or an error, it doesn't run, so you know when it's not working. But with machine learning models, you actually don't know when they're not working. And so that silent failure is really problematic.

And then a third challenge I would say is that most data drift is virtual. We didn't get into the nitty-gritty technical details yet, but basically data drift is when you have a change in the distribution of the input variables to your models. And most of it is virtual, and what that means is that it doesn't actually impact the performance of your models.

You have data drifting all the time, but your model can actually handle it. So if you were just detecting when the data changes, you're gonna get a lot of unactionable noise and you won't actually be able to do anything with that information. And then I would say finally, which is the most complicated one,

and probably the one a lot of data scientists still don't have that much experience with, is the feedback loops. So you have the relationship between the technical metrics and your models, and they might change. But in general, you have a lot of machine learning use cases where you are making predictions on a customer base, for instance, and you have a model that makes a prediction on a customer, and then

a different department does something with it. Imagine a churn model that decides who will cancel a subscription or not: you predict someone will cancel their subscription, and then the retention department sends them a discount. And then the next month, or whenever your model's running again, you make a prediction on that same customer again.

So then the model is actually impacting the business and the business is impacting the model, and you have these very interesting feedback loops where the model performance will definitely change. You can have things where, when you were building your model, you ran a bunch of experiments and you had the business metric of keeping churn below 5%, and to achieve that you needed an AUC of, for instance, 0.7. But then over time the model performance actually has to go up, maybe to a higher AUC, to achieve the same business results, because maybe in the beginning you were able to detect the people who will churn easily and you took the first steps to stop them from churning.

And then the people that are later gonna churn become harder to detect, or weird things can start to happen. So there are definitely lots of challenges still once the model's put into production.

Adel Nehme: I love the holistic list. And I definitely agree with you that this is a problem that is foundational to the industry

if we're gonna be able to really derive value from data science at scale. You mentioned the models-fail-silently component, right? Can you walk me through the different ways machine learning models fail silently?

Hakim Elakhrass: Sure. So, yeah, I already mentioned that most models fail silently. I would say there are two main ways.

Data drift-induced failure and concept drift-induced failure. So basically, data drift-induced failure is when the input data to the model has changed to the point where the model has not seen enough data in the new distribution to make good predictions. So for example, the average age of your customers was 25 and now it's 50.

So the individual feature of age has shifted in distribution, and maybe your model hasn't seen enough 50-year-old customers to make good decisions there. But again, it'll just keep predicting as if nothing happened, and that's the silent part. The second one is concept drift-induced failure.

And this is a change in the relationship between the input variables and the output. A machine learning model is basically just trying to find a function that maps inputs to outputs; you're trying to approximate, as best as possible, the real mapping function that exists somewhere in the ethereal space of reality. So concept drift can be caused by the actual behavior of the underlying system changing, most often through a variable that's not included in your model.

So for instance, your actual customer behavior has changed. Maybe now your 25-year-old customers are buying cheaper products because the economic conditions have changed, and you don't capture economic conditions in your model. Sometimes concept drift induces data drift: in this case, you would see the average price of the products your customers are buying decrease over time. But sometimes concept drift can also be silent, where it doesn't actually impact any of the data in your model.

But something did change in the real world that you're not capturing, so the fundamental behavior of the system is different, and the performance can suffer from that. And both of these can cause either catastrophic failure or gradual degradation of performance.
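To make the age example above concrete, here is a minimal sketch of univariate drift detection, assuming you have the feature values the model was trained on (a reference period) and the values it currently sees in production (an analysis period). It uses a two-sample Kolmogorov-Smirnov test from SciPy as one common drift signal; the variable names and synthetic data are purely illustrative, and NannyML itself offers several univariate methods beyond this.

```python
# Sketch: flag a shift in a single feature's distribution (the "average age 25 -> 50" example).
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
reference_age = rng.normal(loc=25, scale=5, size=10_000)  # customers seen at training time
analysis_age = rng.normal(loc=50, scale=5, size=10_000)   # customers seen in production

result = ks_2samp(reference_age, analysis_age)  # two-sample KS test
if result.pvalue < 0.01:
    print(f"Age distribution drifted (KS statistic = {result.statistic:.3f})")
else:
    print("No significant drift detected in age")
```

As Hakim notes, a detected shift like this is only a signal; whether it actually hurts model performance is a separate question.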

Consequences of Badly Monitored AI/ML Systems

Adel Nehme: I love that distinction that you make between concept drift and data drift. And harping on the catastrophic failure or gradual degradation of performance:

you know, a lot of the problems that you discuss here have consequences that can range from harmful, to say the least, to, as you said, catastrophic for an organization using machine learning and AI. Can you walk us through this range of consequences for badly monitored machine learning and AI systems?

Hakim Elakhrass: I mean, I guess the first thing can be nothing, depending on how impactful your use case is. That's the funny thing in this kind of space: a model is only as valuable as the underlying business problem at the end of the day, right? So if it's a model handling some fringe cases or something in your company that doesn't generate a lot of value, then if the performance changes, maybe nobody will care, right?

Or maybe, depending on the processes in your company, the model doesn't actually do anything by itself. Maybe it outputs data frames, and then they're imported into an Excel sheet and shared with the business, and then the business looks at the results, and there's actually no automated process in there.

It could be that it's less important to monitor, and I would say that's the lowest level. Then you have gradual degradation. But maybe just one thing first: monitoring becomes really essential when your models are mostly automated systems, right? It's less important when the business is on top of it before anything happens; it really matters in something like a churn system, where if a model predicts someone is churning, an automatic email campaign goes out to give them a discount or whatever,

however your company wants to handle it. That's when monitoring is really, really important. And so gradual degradation is: over time, the data is drifting a bit and your model just becomes less and less performant. That can be a little bit like the feedback loop thing that I mentioned before, where over time you're already identifying the people who are more likely to churn and you're stopping them from churning.

And so then the people just become harder to detect over time, and things like that, and your model just gets worse and worse. But it's not anything catastrophic; maybe it causes a 10% loss in performance. And again, this all depends on the underlying use case, right? In some use cases that might not matter much, in others it might be like 50%; it really depends on the business use case you're working on. And then you have catastrophic failure, which is a lot worse. We've seen a few of those in the data science space. I always point out Zillow, where they basically systematically overpriced around 7,000 houses by roughly 300 million dollars in total, and then it just collapsed. Their market cap dropped by 30 billion.

Yeah. And they shut down the division that was buying and selling houses and fired everybody. So that's a catastrophic failure. And maybe a non-financial catastrophic failure could be the chatbot Tay from Microsoft, I don't know if you remember that, the one that was putting out a lot of harmful content after being trained on Reddit content.

Yeah, it got racist and terrible real fast. So these are the kinds of catastrophic failure where maybe you're introducing some systematic risk that you don't realize, and then all at once it just collapses and lots of bad things happen. And then finally, I would say there's discrimination and bias.

And the main impact of that is that it's just not moral. It can be pretty bad PR; it's just not good. And it can be that when you built your model, you didn't see bias in certain demographics because you didn't have enough of them in your data. Then maybe over time, more and more of a certain demographic enters your data and the model can't make good decisions for them.

And that's obviously also bad from a financial perspective, because if you can't make good decisions for a certain segment, you're obviously not doing the best for the company. But it's also just not fair to the people that you're making predictions about. Whatever it is that impacts them, the model would be discriminating against them and not doing the best it can do.

So I would say those are the kinds of impact that can happen.

Zillow Case Study

Adel Nehme: That's great. And do you mind expanding on that Zillow case study for a bit? Because I think this is a fascinating case study for data scientists deploying machine learning in the wild, especially once data science becomes foundational to the company's business model.

So do you want to expand maybe on how that failure happened, as well as the underlying issues that led to that catastrophic failure?

Hakim Elakhrass: So that's a good question. We can only speculate, because we actually don't know, and some people claim it's not a data science problem. But you can postulate how it can happen.

The thing that's interesting with house price prediction, which is essentially what they were doing, if I understand correctly, is that they had a machine learning model that decided the price at which they should buy a house. So you essentially predict the price of that house. The problem with that is you don't have any ground truth, because the model predicts the price and you buy the house at that price.

So the prediction becomes reality, and you don't know the model's performance in the real world. Probably when they were building the model, they had a bunch of house prices and they tried to predict them, and then they measured the performance like you would in any kind of data science system. But once you put it out in the real world, there is no real price,

because the real price is whatever the model makes it. And so you can see how, if you cannot calculate that performance and you're introducing these little systematic errors over time that push the house prices higher and higher for whatever reason, eventually you can realize that you have a huge portfolio of very overpriced houses. That makes for a huge problem.

Adel Nehme: That's incredible. And it's fascinating, especially when you mention how the prediction becomes ground truth here. There's this feedback loop, as you mentioned, where the machine learning model's output becomes reality, and companies can't escape that to a certain extent without really proper modeling.

So in contrast to a field like software engineering, it strikes me as an interesting aspect of data science that post-deployment work, and MLOps in general, has to this day not been codified and has not yet matured around a set of best practices, tools, and rituals. Why do you think this is the case?

Hakim Elakhrass: I would say the main thing is that it's still early days. If you look at data science as a field, of course, I don't wanna get any statisticians mad, because if I say data science is only 15 years old, I'll get a horde of very angry statisticians who will be like, yeah, I was doing machine learning in the seventies somewhere.

Yeah, we know, we know. But data science as a field in the non-financial industry, let's just say that's relatively new. And so I think that there's a big learning curve, and it also comes with the concept of risk, because you have these financial institutions who have been doing this modeling for the past 70 years, maybe even longer.

And they developed all of these processes around handling the risk that comes with making decisions based on mathematical modeling. They have entire risk departments, model validation, this whole process; the government even regulates it, right? If you wanna do a model for credit scoring, the government has

benchmark models that they can check your results against. It's extremely regulated and well known how to handle that. And I think the problem is when you have, for instance, a grocery store or a media company who doesn't have a risk department, and doesn't have any inherent understanding of risk, starting to make decisions based on models.

And that can really impact their company strategy, right? Because if you think of a churn model, if you see a lot of customers are churning, you might say, oh man, we have to change how we're doing our marketing. You could make big decisions based on what is coming out of your machine learning systems.

But I think it mostly comes down to it being the early days and people not having a lot of experience with these things in general. To be honest, as much as we data scientists love to feel that data science is already everywhere, there are not that many models in production. Unfortunately, it's still very early.

You know, what I like about being a monitoring company, or NannyML in general, is that it's a very good litmus test for whether a company actually has models in production. Because sometimes it's like, yeah, we have models in production, we're doing data science, and oh, cool, so you use NannyML or another monitoring library? And it's like, oh no, we're not ready for that.

The models are not actually in production. Yeah, it's kinda like that.

Adel Nehme: The latest survey I found says that only 10 to 20% of the models companies develop make it to production. So I definitely agree with you.

Hakim Elakhrass: Yeah. So it's just the wild west, and I think it's normal that in the early days you see this whole mess of different practices and hundreds of tools, and that over time, as it matures, it becomes much clearer what the best practices are and how things should be.

What is NannyML?

Adel Nehme: So I think this is a great segue to discuss how NannyML aims to solve a lot of these challenges. Can you walk us through what NannyML is and how it works?

Hakim Elakhrass: Yeah, sure. So NannyML, as of today, is an open-source Python library. Data scientists can just pip install it and basically use it to detect silent model failure.

Right now, the way we see most data scientists using NannyML is in a notebook: they run an analysis on their models in production with the data they have. But you can also, of course, deploy it. It can be containerized; it can be deployed however you want, and have it monitoring in the traditional sense,

near real time or in batches, whatever you wanna call it. NannyML in general has three main components: performance estimation and monitoring, data drift detection, and intelligent alerting. And basically with performance estimation, that's the whole reason we're doing NannyML. We spent a lot of time researching to find some sort of methodology that would allow you to estimate the performance of your models in the absence of ground truth.

So instead of waiting a year for your credit scoring model, to know if someone defaulted or paid back the loan, you can, in the meantime, estimate the performance your model would have on the current data in production. That was quite hard and took lots of research, but it works pretty well now.
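For readers who want a feel for what this looks like in practice, here is a hedged usage sketch of NannyML's CBPE performance estimator, loosely following the library's 2022-era quick-start. The file paths and column names are hypothetical, and parameter names may differ in the release you install, so treat this as a sketch and check the current documentation.

```python
# Sketch: estimating performance without ground truth using NannyML's CBPE estimator.
# Paths and column names below are hypothetical; verify argument names against your version's docs.
import nannyml as nml
import pandas as pd

# reference_df: a period where you have predictions AND true labels
# analysis_df:  a recent production period with predictions but no labels yet
reference_df = pd.read_parquet("reference.parquet")
analysis_df = pd.read_parquet("analysis.parquet")

estimator = nml.CBPE(
    y_pred_proba="y_pred_proba",        # column with the model's predicted probability
    y_pred="y_pred",                    # column with the model's predicted class
    y_true="loan_repaid",               # ground-truth column (present in reference only)
    timestamp_column_name="timestamp",
    metrics=["roc_auc"],
    chunk_size=5000,
)
estimator.fit(reference_df)             # calibrate against the reference period
estimated = estimator.estimate(analysis_df)
print(estimated)                        # estimated ROC AUC per chunk, no ground truth needed
```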

I'm probably not the best person at NannyML to go into the gory details. I'm a data scientist, but I'm not the research guy. I'll try as best as possible to explain it, and then my data scientists will all yell at me and say I'm stupid. No, just kidding. But basically, my understanding of it is that, at least for classification models, your model outputs two things: a prediction

and a confidence score, basically. So you know if you have class zero or one, and how confident your model is about that score. Imagine you have a prediction of one and a confidence score of 0.9. You can say that the model is correct for 90% of the observations where it predicts one, and you can use some magic from there to reconstruct an estimated confusion matrix.

And from there, it seems that it captures changes in performance that are due to data drift. From the estimated confusion matrix, you can get an estimated ROC AUC, or an estimated F1, or any machine learning metric, and it's estimated without the ground truth. If there is a change in performance and it's due to data drift, we know that the performance has changed and we know by how much it has changed; you get the actual performance metric consistently.

And then the hard part is concept drift, of course. We cannot capture changes in performance due to concept drift quite yet, but it's something we're researching. Probably that explanation was not extremely coherent, so please go read our docs; you'll get a better explanation of performance estimation there.
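As a rough illustration of the intuition Hakim describes, and not NannyML's actual implementation (which also handles calibration, chunking, and more metrics), here is a minimal sketch of turning well-calibrated confidence scores on unlabeled data into an expected confusion matrix and estimated metrics. The function and example scores are purely hypothetical.

```python
# Sketch: "expected confusion matrix" from calibrated scores, without labels.
# If the model predicts class 1 with probability 0.9, that observation contributes
# 0.9 to the expected true positives and 0.1 to the expected false positives.
import numpy as np

def estimate_metrics(y_proba: np.ndarray, threshold: float = 0.5) -> dict:
    y_pred = (y_proba >= threshold).astype(int)
    tp = np.sum(np.where(y_pred == 1, y_proba, 0.0))         # predicted 1, truly 1 with prob p
    fp = np.sum(np.where(y_pred == 1, 1.0 - y_proba, 0.0))   # predicted 1, truly 0 with prob 1-p
    tn = np.sum(np.where(y_pred == 0, 1.0 - y_proba, 0.0))   # predicted 0, truly 0 with prob 1-p
    fn = np.sum(np.where(y_pred == 0, y_proba, 0.0))          # predicted 0, truly 1 with prob p
    return {
        "accuracy": (tp + tn) / len(y_proba),
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
    }

# Scores from a hypothetical credit-scoring model on unlabeled production data.
scores = np.array([0.95, 0.80, 0.65, 0.40, 0.10, 0.05])
print(estimate_metrics(scores))
```

The key assumption is that the scores are well calibrated; if drift breaks calibration, the estimate degrades, which is part of what the real method has to account for.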

Yeah. And then the data drift, that's more run of the mill. There are two parts to it. We have univariate drift detection: that's basically when the distribution of an individual variable changes, like age. Then you have multivariate drift detection, where NannyML gets a bit fancy again, and we developed our own algorithm for detecting multivariate drift.

So that's basically detecting drift in the data set as a whole and in the relationships between variables. There it does something based on PCA reconstruction error: you fit a PCA on a reference period, then apply it to an analysis period, and you compare the reconstruction errors. That error tells you how different the data in your analysis period is from your reference.
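In the same spirit, here is a minimal sketch of the PCA reconstruction error idea using scikit-learn. It illustrates the general technique rather than NannyML's exact algorithm, and the two-feature synthetic data is purely hypothetical: the marginal distributions stay the same, but the correlation between the features flips, which univariate checks would miss.

```python
# Sketch: multivariate drift via PCA reconstruction error.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def reconstruction_error(pca: PCA, scaler: StandardScaler, X: np.ndarray) -> float:
    X_scaled = scaler.transform(X)
    X_hat = pca.inverse_transform(pca.transform(X_scaled))  # compress, then reconstruct
    return float(np.mean((X_scaled - X_hat) ** 2))

rng = np.random.default_rng(0)
reference = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=5_000)
# Analysis period: same marginals, but the correlation between the features has flipped.
analysis = rng.multivariate_normal([0, 0], [[1.0, -0.8], [-0.8, 1.0]], size=5_000)

scaler = StandardScaler().fit(reference)
pca = PCA(n_components=1).fit(scaler.transform(reference))  # fitted on the reference period only

print("reference error:", reconstruction_error(pca, scaler, reference))
print("analysis error: ", reconstruction_error(pca, scaler, analysis))  # noticeably higher under drift
```

A jump in the analysis-period error relative to the reference signals that the joint structure of the data has changed, even when each column looks fine on its own.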

So then we also alert you when changes in performance happen, and try to point you in the direction of the data drift that has potentially caused those changes, right? So it's a bit more actionable. Basically, it's not causal, so I can't say caused; we don't do causal machine learning just yet. But it's data drift that happened at the same time that your model performance changed.

So more correlated. Yeah. And that's it.

Adel Nehme: Are there any examples of NannyML being used in production?

Hakim Elakhrass: So since we went open source, and this was our hunch all along, we've seen that performance estimation is particularly useful in credit scoring. We've identified a bunch of users in the financial industry using NannyML for credit scoring.

And when people ask us about NannyML, it's often from the financial industry and credit scoring. But that kinda goes back to the importance of the underlying use case: obviously you don't wanna give loans to people who shouldn't be getting loans, and the financial incentive there is very high.

So if we can help reduce the error there, then obviously that's useful. One of the sad things about being open source is that it's relatively hard to know who's using your software and what they're using it for. We try to identify people using it, we try to talk to who we would define as our ideal user, and see who's using it, who's not using it, and how they're using it. But in general, it's pretty hard to identify them.

Who's not using it. How are they using it? But in general, it's pretty hard to identify them. And. Basically, because we're still, obviously everyone should be doing this, but in the early days, it's super important to work closely with users as well. So we have a series of design partners where we work together to deploy an NML, iterate on the library and things like that.

And there it's really varied, from churn prediction to demand forecasting and things of that nature. So it's really all over the place.

Why is NannyML Open Source?

Adel Nehme: Yeah. So one thing I found interesting about NannyML is that you decided to make it open source. Can you walk me through the decision-making here for making NannyML open source, and what are the pros and cons of going open source as an up-and-coming machine learning package?

Hakim Elakhrass: Yeah. So, like I said, we were doing a lot of research behind our algorithms and we spent quite a long time working with design partners, so we had real-world data and things of that nature to run experiments and make sure that we could build algorithms that work well. But the feedback we were getting from our users was: okay, we're data scientists,

and if we're using a novel algorithm, we need to know how it works. We can't just trust you, put data into the system, get back some results, and assume that our performance is fine and everything is fine, without knowing exactly how it works. So that was the consistent feedback we were getting.

And from there it was like, okay, our users want to know how this works, and if they know how it works, then it might as well be open source. It also helps with widespread adoption: before we were open source, we were this no-name Belgian startup with our design partners, and getting feedback was very slow, basically because we were working with these big enterprises.

They don't have that much time, iteration is hard, and we were like, this is not nice, let's go open source. I think that's what makes the most sense for a lot of data science products. So basically we just wanted to allow as many people as possible to use it and give feedback, and ultimately to build a better product for the data science community.

And I would say the main con of being open source is that, as a startup, you essentially have to find product-market fit twice. First for your open-source solution: you have to get mass adoption of your open-source solution. And then,

So you can still exist as a startup. And then you have to find product market fit through the paid offering. so that's definitely a big challenge. 

Organization of a Modern Data Team

Adel Nehme: I couldn't agree more. And I love the fact that it's open source and that you're leveraging open source to be able to accelerate the feedback loop.

I think the crux of the challenges that NannyML is attempting to solve is the tension between what data scientists are trained to do versus what's expected of them in the real world. There's a lot of data work around pre- and post-deployment that is increasingly crossing over into the engineering realm.

And to start, I'd love to know how you feel a modern data team should be organized. Do you believe that a one-size-fits-all data scientist can do data science and deployment work? Or do you think these capabilities should be splintered within the data team?

Hakim Elakhrass: That's a very good question. I think, as with most things in data science, it depends. It depends on the size of the team, it depends on what they're working on, and it depends how advanced the team is already. In bigger companies, when you have dozens of use cases, specialization becomes necessary, and you already see that: you have data engineers, data scientists, ML engineers, data analysts, BI; there's this whole slew of roles for the data team.

And I think that makes a lot of sense. As companies become more advanced and more and more models go into production, you're actually going to have a split between pre- and post-deployment data scientists. Right now you see a lot that monitoring falls under the role of the ML engineer.

But I think that ML engineers are gonna move towards mostly the infrastructure and the ops work, and you're gonna have this post-deployment data scientist who will take over the models in production from a data science perspective. So again, it's: how are the models performing? Are they providing business impact?

What are the feedback loops? Really doing analysis and working to improve those models and increase their business impact. And that's a whole set of skills on its own. So you can imagine, if you have, I dunno, 10-plus use cases in production, you can have an individual who's just in charge of all of those use cases

once they've been in production. At least that's how I think it'll go. It's the future, so you never know; I could absolutely be totally wrong about this, but yeah, I think that's what's gonna happen.

Best Skills for Modern Data Scientists

Adel Nehme: What do you think are the best skills the modern data scientist should have?

Hakim Elakhrass: I think putting models into production will really set you apart.

Or if you have any experience with trying to build a model, actually put it out into the real world, and see what's happening. I just think it's one of those important skills that not many data scientists have. And also really having a feel for the business impact of a model, right?

A model is not just its performance metrics, its technical metrics. It's also: why are we building this model? What value does this model add? How is this value changing over time? How is it impacting other departments and the people around me? And I think it's not just a technical skill.

It's also deep intuition about why you're doing what you're doing. I find that important. And just being able to detect these changes in performance, understanding these new concepts like data drift detection and concept drift, really knowing them well, and being able to use tools that allow you to do that.

Of course, I'm biased in all of this, because NannyML is what I'm working on, but yeah.

Adel Nehme: Now as we're closing out, Hakim, what are the top trends that you're looking at within this space that you're particularly excited about? Both in post-deployment data science, but also in data science in general.

Hakim Elakhrass: In post-deployment data science, it's hard to say; it's really, really early days. Maybe more in MLOps-type things: I'm really happy to see all of these frameworks coming out that make the engineering side of putting models into production much easier. Shout out to our friends at ZenML.

That's an up-and-coming framework for MLOps, and they're really great. I see a lot of the work they're doing, and people ask us if we integrate with them. So I really like these kinds of tools, and they're also open source, so plus one for that. It makes me really excited that people are getting models into production and that there are people working on these problems.

So that's really cool. I think in data science in general, I really like anything that's generative. Like DALL-E that came out recently, and you see all these insane images, or GPT-3 in general, just the text that it can produce and things like that. I don't know, it's super exciting.

Also, I don't buy all the hype about AI. There was this Google engineer recently that said that Google's AI is sentient, and I'm just like, no, it's not.

Adel Nehme: Yeah, no, it's not. I don't buy it either.

Hakim Elakhrass: yeah. Yeah. I, I don't buy that, but I do find it cool.

Adel Nehme: The AI community is rarely united over things, but on this one it very much was: the AI community is super united that this is not sentient.

Hakim Elakhrass: Yeah, indeed. It's like, dude, it's just a bunch of functions put together, please. You trained the thing on an insane amount of data; of course it will tell you that it doesn't want to die. It's kinda funny, because it's not weird that AI or machine learning models trained on text or language produced by humans would then exhibit the same behavior as humans when you ask them those questions.

It's very interesting. But yeah, all that generative stuff, I find it super fun, cool, and useful.

Call To Action

Adel Nehme: Likewise. I'm super excited about what's ahead with GPT-3 and DALL-E. Now Hakim, before we wrap up, do you have any final call to action?

Hakim Elakhrass: Yeah, I think the first one is a bit cliche, but data science work is very impactful.

And I think a lot of people don't realize how much it can impact people and society in general. It's actually a pretty powerful tool. So it's the cliche: don't be evil. Be conscious of your power as a data scientist, of what your models are doing, and of how they might impact people and society.

Because I think it's sometimes taken very lightly. It's very funny, I was having a conversation recently with a software engineer. He asked, oh yeah, how come there's not a machine learning system that just analyzes all the cars on the road and detects if someone is a drunk driver? I was just laughing, like, don't do that. And he was just like, yeah,

as a software engineer, I never have to think about that.

Adel Nehme: Yeah, indeed. That's why I support having AI ethics courses in different programs.

Hakim Elakhrass: Yeah, indeed. So I think that's it: just be conscious of it, don't be evil. And then of course, if you have models in production and you're interested in monitoring their performance, check out NannyML. We're on GitHub, totally open source, and we'll be open source forever. Our core algorithms will always be open source, our research will be open source. So yeah, it's great to have anybody in the community that finds this interesting.

Adel Nehme: That's awesome. Thank you so much, Hakim, for coming on DataFramed.

Hakim Elakhrass: No problem. Thanks, Adel. Thanks a lot for having me.
