Give Life: Predict Blood Donations
Build a binary classifier to predict if a blood donor is likely to donate again.
11 Tasks, 1,500 XP
Project Description
"Blood is the most precious gift that anyone can give to another person — the gift of life." ~ World Health Organization
Forecasting blood supply is a serious and recurrent problem for blood collection managers: in January 2019, "Nationwide, the Red Cross saw 27,000 fewer blood donations over the holidays than they see at other times of the year." Machine learning can learn patterns in donation data to help predict future blood donations and, in turn, save more lives.
In this Project, you will work with data collected from the donor database of the Blood Transfusion Service Center in Hsin-Chu City, Taiwan. About every three months, the center sends its blood transfusion service bus to a university in Hsin-Chu City to collect donated blood. The dataset, obtained from the UCI Machine Learning Repository, consists of a random sample of 748 donors. Your task is to predict whether a blood donor will donate within a given time window. You will walk through the full model-building process, from inspecting the dataset to using the tpot library to automate your Machine Learning pipeline.
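As a rough illustration of the first steps (loading the data, creating a target column, and checking target incidence), here is a minimal sketch. It assumes the file is comma-separated with the header used by the UCI dataset. The rows in `SAMPLE_CSV` are made up for illustration and are not taken from the real data, and the helper name `load_transfusion` is hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for transfusion.data: the header follows the UCI file,
# but the rows are invented for illustration only.
SAMPLE_CSV = """Recency (months),Frequency (times),Monetary (c.c. blood),Time (months),whether he/she donated blood in March 2007
2,10,2500,30,1
0,5,1250,12,1
1,8,2000,25,1
3,6,1500,20,1
1,4,1000,40,0
4,2,500,4,0
2,7,1750,14,1
9,1,250,9,0
"""


def load_transfusion(source):
    """Read the donations table and rename the last column to 'target'."""
    df = pd.read_csv(source)
    return df.rename(
        columns={"whether he/she donated blood in March 2007": "target"}
    )


transfusion = load_transfusion(io.StringIO(SAMPLE_CSV))

# Target incidence: the share of rows in each class (0 = did not donate, 1 = donated).
incidence = transfusion["target"].value_counts(normalize=True)
print(incidence)
```

With the real file, passing its path to `load_transfusion` in place of the `io.StringIO` object should behave the same way, provided the header matches.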
Project Tasks
- 1. Inspecting transfusion.data file
- 2. Loading the blood donations data
- 3. Inspecting transfusion DataFrame
- 4. Creating target column
- 5. Checking target incidence
- 6. Splitting transfusion into train and test datasets
- 7. Selecting model using TPOT
- 8. Checking the variance
- 9. Log normalization
- 10. Training the logistic regression model
- 11. Conclusion
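The later tasks (splitting, checking the variance, log normalization, and training a logistic regression model) can be sketched roughly as follows. This is not the Project's own solution: the data is synthetic, the short column names are assumptions, and the TPOT model-selection step is omitted. The choices of a 25% test split, the `liblinear` solver, and AUC as the metric are also assumptions made for this sketch.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the transfusion DataFrame (not real donor data).
rng = np.random.default_rng(42)
n = 400
frequency = rng.integers(1, 40, size=n)
recency = rng.integers(0, 40, size=n)
df = pd.DataFrame({
    "recency": recency,
    "frequency": frequency,
    "monetary": frequency * 250,  # blood volume scales with donation count
    "time": rng.integers(2, 100, size=n),
})
# Made-up relationship: frequent, recent donors are more likely to donate again.
prob = 1 / (1 + np.exp(-(0.08 * frequency - 0.1 * recency - 0.5)))
df["target"] = (rng.random(n) < prob).astype(int)

# Split, stratifying on the target so both sets keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="target"),
    df["target"],
    test_size=0.25,
    random_state=42,
    stratify=df["target"],
)

# Check the variance: a feature on a much larger scale can dominate a linear model.
variances = X_train.var()
col_to_normalize = variances.idxmax()

# Log normalization: replace the high-variance column with its natural log.
X_train_norm, X_test_norm = X_train.copy(), X_test.copy()
for frame in (X_train_norm, X_test_norm):
    frame["log_" + col_to_normalize] = np.log(frame[col_to_normalize])
    frame.drop(columns=col_to_normalize, inplace=True)

# Train logistic regression and score it on the held-out set.
model = LogisticRegression(solver="liblinear", random_state=42)
model.fit(X_train_norm, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test_norm)[:, 1])
print(f"High-variance column: {col_to_normalize}, test AUC: {auc:.3f}")
```

The log transform is applied to both the training and the test features so that the model sees the same representation at training and prediction time.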
Technologies
Python
Dimitri Denisjonok
Python Backend Developer at Futrli
After graduating with a degree in Mathematics from the University of Nottingham (UK), Dimitri decided to pursue software engineering in Python. He has worked on various data-related projects, from data visualizations to building data pipelines and machine learning models. His main interest is in Applied Data Science and how businesses and individuals can become more data-driven in their decision-making.