Brett Lantz, 7 August 2017

Brett Lantz is a data scientist at the University of Michigan and the author of Machine Learning with R. After training as a sociologist, Brett has applied his endless thirst for data to projects that involve understanding and predicting human behavior.


Nina Zumel & John Mount, 20 June 2017

Richie chats to Nina and John about their favorite types of regression, statistics vs. machine learning, running Win-Vector, interacting with data scientists vs. interacting with managers, business constraints on models, the vtreat R package, bhangra dancing, and life in San Francisco.


Hank Roark, 11 February 2017

Hank Roark answers Nick's questions about his path into data science, job automation, the importance of communicating your results as a data scientist in both directions, and much more!


Andreas Müller, 27 January 2017

Andy is a lecturer at the Data Science Institute at Columbia University and author of the O'Reilly book "Introduction to Machine Learning with Python", which describes a practical approach to machine learning with Python and scikit-learn. He is one of the core developers of the scikit-learn machine learning library and has been co-maintaining it for several years. He is also a Software Carpentry instructor. In the past, he worked at the NYU Center for Data Science on open source and open science, and as a Machine Learning Scientist at Amazon. His mission is to create open tools that lower the barrier to entry for machine learning applications, promote reproducible science, and democratize access to high-quality machine learning algorithms. Here, Andy answers questions about his work at Columbia, gives advice to people starting out in data science, and explains what the most difficult part of his job is.


Peter Bull, 25 January 2017

Peter is a co-founder of DrivenData. He earned his master's in Computational Science and Engineering from Harvard’s School of Engineering and Applied Sciences. His work lies at the intersection of statistics and computer science, and he wants to help bring powerful new modeling techniques to the organizations that need them most. He previously worked as a software engineer at Microsoft and earned a BA in philosophy from Yale University. Here, Peter and Hugo discuss why to use Python for data science, the business case for data, a DrivenData competition on using Yelp data to predict restaurant sanitary ratings, and much more.


Ben Wilson, 20 December 2016

Ben is a machine learning specialist and director of research. He is passionate about learning and has worked as a data scientist in real-time bidding, e-commerce, and recommendation. Ben holds a PhD in mathematics and a degree in computer science.


Max Kuhn, 9 September 2016

In this episode of DataChats, Nick talks with Max Kuhn, a frequent speaker at many of the main data science conferences and well known as the creator of the caret package for R, an essential tool in every R user’s machine learning toolbox. In this 30-minute conversation, Max talks about how he originally wanted to become a journalist, why he defines himself as a statistician rather than a data scientist, his thoughts on deep learning, strategies for breaking into the field, and much more.

