Richie chats to Nina and John about their favorite types of regression, statistics vs. machine learning, running Win-Vector, interacting with data scientists vs. interacting with managers, business constraints on models, the vtreat R package, bhangra dancing, and life in San Francisco.
Andy is a lecturer at the Data Science Institute at Columbia University and author of the O'Reilly book "Introduction to Machine Learning with Python", which describes a practical approach to machine learning with Python and scikit-learn. He is one of the core developers of the scikit-learn machine learning library and has been co-maintaining it for several years. He is also a Software Carpentry instructor. In the past, he worked at the NYU Center for Data Science on open source and open science, and as a Machine Learning Scientist at Amazon. His mission is to create open tools that lower the barrier to entry for machine learning applications, promote reproducible science, and democratize access to high-quality machine learning algorithms. Here, Andy answers questions about his work at Columbia, gives advice to people starting out in data science, and explains what the most difficult part of his job is.
Peter is a co-founder of DrivenData. He earned his master's in Computational Science and Engineering from Harvard’s School of Engineering and Applied Sciences. His work lies at the intersection of statistics and computer science, and he wants to help bring powerful new modeling techniques to the organizations that need them most. He previously worked as a software engineer at Microsoft and earned a BA in philosophy from Yale University. Here, Peter and Hugo discuss why to use Python for data science, the business case for data, DrivenData competitions on using Yelp data to predict restaurant sanitary ratings, and much more.
Ben is a machine learning specialist and the director of research at lateral.io. He is passionate about learning and has worked as a data scientist in real-time bidding, e-commerce, and recommendation. Ben holds a PhD in mathematics and a degree in computer science.
In this episode of DataChats, Nick talks with Max Kuhn, the creator of the caret package for R, an essential tool in every R user’s machine learning toolbox. Max is a frequent speaker at many of the main data science conferences. In this 30-minute conversation, Max talks about how he originally wanted to become a journalist, why he defines himself as a statistician rather than a data scientist, his thoughts on deep learning, strategies for breaking into the field, and much more.