Post-Deployment Data Science

August 7, 2022

Hakim Elakhrass talks about post-deployment data science, the real-world use cases for tools like NannyML, the potentially catastrophic effects of unmonitored models in production, the most important skills for modern data scientists to cultivate, and more.

Key Quotes

Discrimination and bias in a model is unethical and the impact can be catastrophic to a business. Unfortunately, this can simply be that when you built your model that you didn't see bias in certain demographics because you didn't have enough of them in your data. Over time, more and more of a certain demographic enters your data that the model can't properly make good decisions for. That is extremely detrimental from a financial and business perspective, because if your model is discriminating against a certain segment, then you're obviously not doing the best for the company. Worst of all, it’s not fair to the people you're making predictions about.

Actually putting models into production is what will set you apart as a data scientist. It's an important skill that, unfortunately, not many data scientists have. They should also really have a grasp of the business impact of the model. A model is more than just its performance or technical metrics. Why are you building this model? What value does it add and how is that value changing over time? How is it impacting other departments? Obviously data scientists need technical skills, but they must also have a deep intuition about why they are doing what they are doing.

Key Takeaways

1. Whether or not you know what actually happens in the real world after the prediction, understanding model performance is still challenging from both an engineering perspective and a data science perspective.

2. Data scientists need to cultivate a thorough understanding of a model’s potential business impacts as well as the technical metrics of the model.

3. Making machine learning tools open source builds trust with users and enables a community-based approach for getting feedback.

About Hakim Elakhrass

Hakim Elakhrass is the Co-Founder and CEO of NannyML, an open-source Python library that allows data scientists to estimate post-deployment model performance, detect data drift, and link data drift alerts back to model performance changes. Originally, Hakim started a machine learning consultancy with his NannyML co-founders, and the need for monitoring quickly arose, leading to the development of NannyML.

Meet our host
Adel Nehme

Adel is a Data Science educator, speaker, and Evangelist at DataCamp where he has released various courses and live training on data analysis, machine learning, and data engineering. He is passionate about spreading data skills and data literacy throughout organizations and the intersection of technology and society. He has an MSc in Data Science and Business Analytics. In his free time, you can find him hanging out with his cat Louis.
