Can you predict the strength of concrete?
Background
You work in the civil engineering department of a major university. You are part of a project testing the strength of concrete samples.
Concrete is the most widely used building material in the world. It is a mix of cement and water with gravel and sand. It can also include other materials like fly ash, blast furnace slag, and additives.
The compressive strength of concrete is a function of components and age, so your team is testing different combinations of ingredients at different time intervals.
The project leader asked you to find a simple way to estimate strength so that students can predict how a particular sample is expected to perform.
The data
The team has already tested more than a thousand samples (source cited in the acknowledgments below):
Compressive strength data:
- "cement" - Portland cement in kg/m³
- "slag" - Blast furnace slag in kg/m³
- "fly_ash" - Fly ash in kg/m³
- "water" - Water in liters/m³
- "superplasticizer" - Superplasticizer additive in kg/m³
- "coarse_aggregate" - Coarse aggregate (gravel) in kg/m³
- "fine_aggregate" - Fine aggregate (sand) in kg/m³
- "age" - Age of the sample in days
- "strength" - Concrete compressive strength in megapascals (MPa)
Acknowledgments: I-Cheng Yeh, "Modeling of strength of high-performance concrete using artificial neural networks," Cement and Concrete Research, Vol. 28, No. 12, pp. 1797-1808 (1998).
suppressPackageStartupMessages(library(tidyverse))
df <- readr::read_csv('data/concrete_data.csv', show_col_types = FALSE)
head(df)
Challenge
Provide your project leader with a formula that estimates the compressive strength. Include:
- The average strength of the concrete samples at 1, 7, 14, and 28 days of age.
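- The coefficients $\beta_0, \beta_1, \ldots, \beta_8$ to use in the following formula:
$$\text{strength} = \beta_0 + \beta_1 \cdot \text{cement} + \beta_2 \cdot \text{slag} + \beta_3 \cdot \text{fly\_ash} + \beta_4 \cdot \text{water} + \beta_5 \cdot \text{superplasticizer} + \beta_6 \cdot \text{coarse\_aggregate} + \beta_7 \cdot \text{fine\_aggregate} + \beta_8 \cdot \text{age}$$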
Judging criteria
This is a community-based competition. The top 5 most upvoted entries will win.
The winners will receive DataCamp merchandise.
Checklist before publishing
- Rename your workspace to make it descriptive of your work. N.B. you should leave the notebook name as notebook.ipynb.
- Remove redundant cells like the judging criteria, so the workbook is focused on your work.
- Check that all the cells run without error.
Time is ticking. Good luck!
# Fit a multiple linear regression of strength on all eight predictors.
mlr_strength <- lm(strength ~ cement + slag + fly_ash + water + superplasticizer + coarse_aggregate + fine_aggregate + age, data = df)
mlr_strength
# Fitted formula, using the coefficients printed above:
# strength = -23.16376 + 0.11979*cement + 0.10385*slag + 0.08794*fly_ash - 0.15030*water + 0.29069*superplasticizer + 0.01803*coarse_aggregate + 0.02015*fine_aggregate + 0.11423*age
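To show how a student might use the fitted formula by hand, here is a minimal sketch that applies the coefficients from the comment above to a single mix. The helper `estimate_strength` and the values in `example_mix` are illustrative assumptions, not part of the original analysis or the dataset.

```r
# Coefficients copied from the fitted-formula comment above (rounded as printed).
coefs <- c(
  intercept = -23.16376,
  cement = 0.11979,
  slag = 0.10385,
  fly_ash = 0.08794,
  water = -0.15030,
  superplasticizer = 0.29069,
  coarse_aggregate = 0.01803,
  fine_aggregate = 0.02015,
  age = 0.11423
)

# Hypothetical helper: intercept plus the sum of coefficient * amount.
estimate_strength <- function(mix, coefs) {
  predictors <- setdiff(names(coefs), "intercept")
  unname(coefs["intercept"] + sum(coefs[predictors] * mix[predictors]))
}

# Made-up example mix (illustrative values only, not taken from the dataset).
example_mix <- c(cement = 300, slag = 0, fly_ash = 0, water = 180,
                 superplasticizer = 0, coarse_aggregate = 1000,
                 fine_aggregate = 800, age = 28)

estimate_strength(example_mix, coefs)  # about 23.07 MPa
```

Because the coefficients are rounded, the result can differ slightly from what `predict(mlr_strength, ...)` returns for the same mix.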
# Mean amount of each ingredient across all samples; together these define an "average" mix.
mean_cement <- mean(df$cement)
mean_slag <- mean(df$slag)
mean_fly_ash <- mean(df$fly_ash)
mean_water <- mean(df$water)
mean_superplasticizer <- mean(df$superplasticizer)
mean_coarse_aggregate <- mean(df$coarse_aggregate)
mean_fine_aggregate <- mean(df$fine_aggregate)
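# Model-predicted strength (MPa) at 1, 7, 14, and 28 days for the average mix:
# every ingredient is held at its dataset mean and only age varies.
aver_age1 <- data.frame(age=1, cement=mean_cement, slag=mean_slag, fly_ash=mean_fly_ash, water=mean_water, superplasticizer=mean_superplasticizer, coarse_aggregate=mean_coarse_aggregate, fine_aggregate=mean_fine_aggregate)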
pred_strength1 <- predict(mlr_strength, aver_age1)
pred_strength1
aver_age7 <- data.frame(age=7, cement=mean_cement, slag=mean_slag, fly_ash=mean_fly_ash, water=mean_water, superplasticizer=mean_superplasticizer, coarse_aggregate=mean_coarse_aggregate, fine_aggregate=mean_fine_aggregate)
pred_strength7 <- predict(mlr_strength, aver_age7)
pred_strength7
aver_age14 <- data.frame(age=14, cement=mean_cement, slag=mean_slag, fly_ash=mean_fly_ash, water=mean_water, superplasticizer=mean_superplasticizer, coarse_aggregate=mean_coarse_aggregate, fine_aggregate=mean_fine_aggregate)
pred_strength14 <- predict(mlr_strength, aver_age14)
pred_strength14
aver_age28 <- data.frame(age=28, cement=mean_cement, slag=mean_slag, fly_ash=mean_fly_ash, water=mean_water, superplasticizer=mean_superplasticizer, coarse_aggregate=mean_coarse_aggregate, fine_aggregate=mean_fine_aggregate)
pred_strength28 <- predict(mlr_strength, aver_age28)
pred_strength28
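The values above are model predictions for an average mix, which is not the same thing as the average strength actually observed at each age. As a cross-check, the observed averages could be computed directly from the data. The sketch below assumes the `age` and `strength` columns described earlier. `mean_strength_by_age` is a hypothetical helper name, not something from the original notebook.

```r
library(dplyr)

# Hypothetical helper: mean observed strength (MPa) for samples tested
# at exactly the requested ages, with the sample count per age.
mean_strength_by_age <- function(data, ages = c(1, 7, 14, 28)) {
  data %>%
    filter(age %in% ages) %>%
    group_by(age) %>%
    summarise(
      n_samples = n(),
      mean_strength = mean(strength),
      .groups = "drop"
    )
}
```

In this notebook it could be called as `mean_strength_by_age(df)`. Ages with no matching samples are simply absent from the result, and the `n_samples` column shows how many tests each average rests on.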