
Can you predict the strength of concrete?

📖 Background

You work in the civil engineering department of a major university. You are part of a project testing the strength of concrete samples.

Concrete is the most widely used building material in the world. It is a mix of cement and water with gravel and sand. It can also include other materials like fly ash, blast furnace slag, and additives.

The compressive strength of concrete is a function of components and age, so your team is testing different combinations of ingredients at different time intervals.

The project leader asked you to find a simple way to estimate strength so that students can predict how a particular sample is expected to perform.

💾 The data

The team has already tested more than a thousand samples (source):

Compressive strength data:
  • "cement" - Portland cement in kg/m3
  • "slag" - Blast furnace slag in kg/m3
  • "fly_ash" - Fly ash in kg/m3
  • "water" - Water in liters/m3
  • "superplasticizer" - Superplasticizer additive in kg/m3
  • "coarse_aggregate" - Coarse aggregate (gravel) in kg/m3
  • "fine_aggregate" - Fine aggregate (sand) in kg/m3
  • "age" - Age of the sample in days
  • "strength" - Concrete compressive strength in megapascals (MPa)

Acknowledgments: I-Cheng Yeh, "Modeling of strength of high-performance concrete using artificial neural networks," Cement and Concrete Research, Vol. 28, No. 12, pp. 1797-1808 (1998).

import pandas as pd
df = pd.read_csv('data/concrete_data.csv')
df.head()

💪 Challenge

Provide your project leader with a formula that estimates the compressive strength. Include:

  1. The average strength of the concrete samples at 1, 7, 14, and 28 days of age.
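  2. The coefficients $\beta_{0}$, $\beta_{1}$, ..., $\beta_{8}$ to use in the following formula:

$$\text{strength} = \beta_{0} + \beta_{1}\,\text{cement} + \beta_{2}\,\text{slag} + \beta_{3}\,\text{fly ash} + \beta_{4}\,\text{water} + \beta_{5}\,\text{superplasticizer} + \beta_{6}\,\text{coarse aggregate} + \beta_{7}\,\text{fine aggregate} + \beta_{8}\,\text{age}$$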
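A minimal sketch of how both deliverables could be computed, assuming the column names listed in the data section. The small DataFrame is synthetic (generated from made-up coefficients, not the project's measurements), and `mean_strength_by_age` and `fit_linear_formula` are hypothetical helper names. Ordinary least squares via `numpy.linalg.lstsq` is one reasonable way to get the coefficients of a linear formula, not necessarily the intended one.

```python
import numpy as np
import pandas as pd

FEATURES = ["cement", "slag", "fly_ash", "water", "superplasticizer",
            "coarse_aggregate", "fine_aggregate", "age"]


def mean_strength_by_age(data, ages=(1, 7, 14, 28)):
    """Average strength for each requested age (NaN if an age has no samples)."""
    return data.groupby("age")["strength"].mean().reindex(list(ages))


def fit_linear_formula(data, features=FEATURES, target="strength"):
    """Ordinary least squares: returns the intercept and one coefficient per feature."""
    X = np.column_stack([np.ones(len(data)), data[features].to_numpy(dtype=float)])
    y = data[target].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return pd.Series(beta, index=["intercept"] + list(features))


# Synthetic demo data: value ranges and coefficients are invented for illustration.
rng = np.random.default_rng(0)
n = 200
ranges = {
    "cement": (100, 500), "slag": (0, 300), "fly_ash": (0, 200),
    "water": (120, 250), "superplasticizer": (0, 30),
    "coarse_aggregate": (800, 1100), "fine_aggregate": (600, 1000),
}
demo = pd.DataFrame({name: rng.uniform(lo, hi, n) for name, (lo, hi) in ranges.items()})
demo["age"] = rng.choice([1, 7, 14, 28], size=n)

# Noise-free target built from known coefficients, so the fit should recover them.
true_beta = np.array([-20.0, 0.12, 0.10, 0.09, -0.15, 0.30, 0.02, 0.02, 0.11])
demo["strength"] = true_beta[0] + demo[FEATURES].to_numpy(dtype=float) @ true_beta[1:]

averages = mean_strength_by_age(demo)
coefficients = fit_linear_formula(demo)
print(averages)
print(coefficients)
```

On the project data, the same calls would be `mean_strength_by_age(df)` and `fit_linear_formula(df)`. Real measurements are noisy, so the fitted coefficients there are estimates and not exact values.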

🧑‍⚖️ Judging criteria

This is a community-based competition. The top 5 most upvoted entries will win.

The winners will receive DataCamp merchandise.

✅ Checklist before publishing

  • Rename your workspace to make it descriptive of your work. N.B. you should leave the notebook name as notebook.ipynb.
  • Remove redundant cells like the judging criteria, so the workbook is focused on your work.
  • Check that all the cells run without error.

⌛️ Time is ticking. Good luck!

import pandas as pd
import plotly.express as px
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.graph_objects as go
df.info()
  • 1,030 entries
  • No null values
  • All columns are integers or floats
  • The output (target) variable is strength
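The observations above are read off `df.info()` by eye. As a hedged sketch, the same facts could be checked programmatically. `summarize_frame` is a hypothetical helper, and the two rows below are made-up values for illustration only.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def summarize_frame(data):
    """Collect the facts usually read off df.info(): size, nulls, non-numeric columns."""
    return {
        "rows": len(data),
        "null_values": int(data.isna().sum().sum()),
        "non_numeric_columns": [c for c in data.columns if not is_numeric_dtype(data[c])],
    }


# Made-up rows, not taken from the concrete dataset.
sample = pd.DataFrame({
    "cement": [300.0, 250.5],
    "age": [7, 28],
    "strength": [30.1, 42.7],
})
summary = summarize_frame(sample)
print(summary)
```

Calling `summarize_frame(df)` on the loaded data would be expected to report zero nulls and an empty list of non-numeric columns if the observations hold.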
# Count fully duplicated rows versus unique rows.
n_duplicated = df.duplicated().sum()
m = {'duplicated': ['Yes', 'No'], 'count': [n_duplicated, len(df) - n_duplicated]}
df_miss = pd.DataFrame(data=m)
df_miss
# Donut chart showing the share of duplicated rows.
fig = px.pie(df_miss, values='count', names='duplicated', hole=0.3,
             color_discrete_sequence=px.colors.sequential.Blackbody,
             width=500, height=500)
fig.show()
df.drop_duplicates(inplace=True)
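Dropping duplicates in place keeps the original row labels, so the index ends up with gaps. A small sketch of a non-mutating alternative that also renumbers the rows, using made-up data (the second row deliberately repeats the first):

```python
import pandas as pd

# Made-up rows, not taken from the concrete dataset.
sample = pd.DataFrame({
    "cement": [300.0, 300.0, 250.5],
    "age": [7, 7, 28],
    "strength": [30.1, 30.1, 42.7],
})

# duplicated() flags rows identical to an earlier row (the first copy is kept).
n_duplicates = int(sample.duplicated().sum())

# Return a new frame without duplicates and with a fresh 0..n-1 index.
deduped = sample.drop_duplicates().reset_index(drop=True)
print(n_duplicates, len(deduped))
```

Whether to renumber the index is a design choice; it mainly matters if later steps rely on positional row labels.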
df.columns
# Box plot of every column to compare value ranges and spot outliers.
fig = px.box(df, x=df.columns)
fig.update_xaxes(title_text='')
fig.update_yaxes(title_text='')
fig.show()