
Predicting Industrial Machine Downtime: Level 2

📖 Background

You work for a manufacturer of high-precision metal components used in aerospace, automotive, and medical device applications. Your company operates three different machines on its shop floor that produce different sized components, so minimizing the downtime of these machines is vital for meeting production deadlines.

Your team wants to use a data-driven approach to predict machine downtime, so that maintenance can be planned proactively rather than in reaction to machine failure. To support this, your company has been collecting operational data for over a year, along with whether each machine was down at the time of each reading.

In this second level, you're going to visualize and examine the data in more detail. This level is aimed towards intermediate learners. If you want to challenge yourself a bit more, check out level three!

Report: Machine Downtime Analysis

1. Correlation Matrix Summary

Upon examining the correlation matrix, I identified several operational factors that correlate with the encoded Downtime column. Because the analysis code encodes Machine_Failure as 0 and No_Machine_Failure as 1, a positive coefficient means that higher values go with fewer failures, and a negative coefficient means that higher values go with more failures. These factors are:

  • Hydraulic Pressure: I found a moderate positive correlation of 0.56. Under this encoding, higher hydraulic pressure is associated with fewer failures, which matches the lower average pressure observed for failed machines in section 3.
  • Torque: The correlation value of 0.41 indicates a moderate relationship between torque and the encoded label. Higher torque values are linked with fewer failures.
  • Spindle Speed: A weak negative correlation of -0.28, suggesting that higher spindle speeds may be somewhat associated with more failures.
  • Cutting Force: The moderate negative correlation of -0.45 implies that higher cutting forces are associated with more machine failures.

Other parameters such as Spindle Bearing Temperature, Tool Vibration, and Voltage exhibited very weak correlations with downtime, with values close to zero. Coolant Pressure and Air System Pressure also showed weak negative correlations with downtime.
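
The sign of each coefficient depends on how the Downtime label is turned into a number. The following is a minimal sketch on made-up values, not the real dataset, showing that swapping the 0/1 encoding flips the sign of the correlation while leaving its magnitude unchanged. The column name `Hydraulic_Pressure` and all numbers are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data: the failure rows have lower pressure.
toy = pd.DataFrame({
    "Hydraulic_Pressure": [60.0, 70.0, 85.0, 110.0, 120.0, 135.0],
    "Downtime": ["Machine_Failure"] * 3 + ["No_Machine_Failure"] * 3,
})

# Encoding A: failure = 1, so a positive r means "more failures".
failure_is_1 = toy["Downtime"].map({"Machine_Failure": 1, "No_Machine_Failure": 0})
# Encoding B: failure = 0, so a positive r means "fewer failures".
failure_is_0 = toy["Downtime"].map({"Machine_Failure": 0, "No_Machine_Failure": 1})

r_a = toy["Hydraulic_Pressure"].corr(failure_is_1)
r_b = toy["Hydraulic_Pressure"].corr(failure_is_0)
print("failure = 1:", round(r_a, 3))
print("failure = 0:", round(r_b, 3))
```

Whichever encoding is used, stating it next to the reported coefficients keeps the direction of each relationship unambiguous.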


2. Trend Summary of Downtime Over Time

When I analyzed the trend of downtime over time, I observed the following key statistics:

  • The average daily proportion of downtime (Machine_Failure) across the dataset is approximately 49.43%, indicating that on a typical day nearly half of the data points correspond to machine failures.
  • The standard deviation of 0.25 (25 percentage points) shows that the daily failure proportion varies considerably from day to day.
  • The minimum and maximum values indicate that daily downtime ranged from 0% to 100% in the dataset, confirming that some days had no failures, while on other days every recorded observation was a failure.

This suggests that downtime is somewhat evenly distributed over time, though there are fluctuations that could be influenced by operational conditions.


3. Summary of Hydraulic Pressure and Coolant Temperature Stats by Downtime Category
  • Hydraulic Pressure:

    • For machines experiencing failure, the average hydraulic pressure is 84.76 bar, with a standard deviation of 26.34 bar. In contrast, for machines with no failure, the average hydraulic pressure is considerably higher at 118.57 bar, with a standard deviation of 23.83 bar.
    • The range of hydraulic pressure values is wider for failed machines, spanning from 50.14 bar to 191 bar.
    • These differences in pressure suggest that lower hydraulic pressures may be linked to higher failure rates, although other operational factors might also be contributing.
  • Coolant Temperature:

    • The average coolant temperature when the machine fails is 19.98°C, while it is slightly lower at 17.09°C when the machine does not fail. The difference of about 2.9°C points to higher temperatures being associated with machine failure, although a formal significance test would be needed to confirm it.
    • The temperature range for failed machines is from 4.1°C to 36.5°C, suggesting that extreme temperature variations might contribute to failure.
    • My analysis indicates that higher coolant temperatures are linked to an increased risk of machine failure.
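
Per-category statistics like the ones above can be produced with a grouped aggregation. This is a minimal sketch on hypothetical rows; the column names `Hydraulic_Pressure` and `Coolant_Temperature` are assumptions and may differ from the headers in the actual CSV.

```python
import pandas as pd

# Hypothetical rows for illustration only.
toy = pd.DataFrame({
    "Downtime": ["Machine_Failure", "Machine_Failure",
                 "No_Machine_Failure", "No_Machine_Failure"],
    "Hydraulic_Pressure": [80.0, 90.0, 115.0, 125.0],
    "Coolant_Temperature": [21.0, 19.0, 16.0, 18.0],
})

# One row per downtime category, one column group per feature.
stats_by_downtime = (
    toy.groupby("Downtime")[["Hydraulic_Pressure", "Coolant_Temperature"]]
    .agg(["mean", "std", "min", "max"])
)
print(stats_by_downtime)
```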

4. Monthly Downtime Summary

When I examined the proportion of downtime (Machine_Failure) by month, I noticed the following:

  • Month 7 (July) and Month 11 (November) had the highest downtime, with 100% of the days in these months corresponding to machine failures.
  • The lowest downtime occurred in Month 5 (May), where only 43.1% of the days had failures.

This suggests that some months, particularly July and November, experienced extreme levels of downtime, which could be due to various factors, including seasonal variations or maintenance schedules.


5. Summary of Monthly Failure Proportions

The average monthly failure rate is 61.08%, with failure proportions ranging from 43.1% to 100% across the months. The standard deviation of 22.28% indicates significant fluctuations in failure rates across different months.

The months with the highest failure rates (July and November) warrant further investigation, as they experienced complete machine failure during those periods.
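
The monthly figures in sections 4 and 5 can be derived by grouping on the month of the Date column. The sketch below uses a few hypothetical records rather than the real data. It also reports the number of records per month, which is an addition of mine and not part of the original analysis.

```python
import pandas as pd

# Hypothetical records for illustration only.
toy = pd.DataFrame({
    "Date": pd.to_datetime([
        "2022-01-05", "2022-01-20",
        "2022-02-03", "2022-02-17", "2022-02-25",
    ]),
    "Downtime": [
        "Machine_Failure", "No_Machine_Failure",
        "Machine_Failure", "Machine_Failure", "No_Machine_Failure",
    ],
})

# True where the record is a failure; the mean of a boolean is a proportion.
is_failure = toy["Downtime"].eq("Machine_Failure")

monthly = (
    is_failure.groupby(toy["Date"].dt.month)
    .agg(["mean", "size"])
    .rename(columns={"mean": "failure_rate", "size": "n_records"})
)
print(monthly)

# Summary of the monthly failure rates (mean, std, min, max, ...).
print(monthly["failure_rate"].describe())
```

Showing the record count next to each rate helps judge whether a 100% month rests on many observations or only a few.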


Conclusion

In this analysis, I observed several important connections between machine operational parameters and downtime. Key operational factors such as hydraulic pressure, torque, and cutting force show moderate associations with machine failure. The comparison of coolant temperature further suggests that higher temperatures are associated with machine downtime. Additionally, the month-to-month variation in downtime indicates that certain months (e.g., July and November) are more prone to machine failures and warrant closer inspection.

These insights can help inform maintenance and operational strategies to improve machine uptime and reduce the risk of failure.

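import pandas as pd

# Show every column when displaying DataFrames
pd.set_option('display.max_columns', None)
downtime = pd.read_csv('data/machine_downtime.csv')
print(downtime.head())  # preview the first five rows
downtime.info()         # column dtypes and non-null counts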

1. Explore correlations between the various operational data in the dataset.

import seaborn as sns
import matplotlib.pyplot as plt

# Basic shape and column overview
print("Dataset Shape:", downtime.shape)
print("Columns:", list(downtime.columns))

# Count missing values per column
missing_values = downtime.isnull().sum()
print("\nMissing Values:\n", missing_values)

Convert Date column to datetime data type.

downtime['Date'] = pd.to_datetime(downtime['Date'])

downtime['Date'].head()

Let's explore correlation between the various operational data in the dataset.

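# Keep only the numeric operational columns for the correlation matrix
corr_cols = downtime.drop(columns=['Date', 'Machine_ID', 'Assembly_Line_No', 'Downtime'])
# Note the encoding: failure = 0, no failure = 1, so a positive correlation
# with 'Downtime' means higher values go with NO failure
corr_cols['Downtime'] = downtime['Downtime'].map({'Machine_Failure': 0, 'No_Machine_Failure': 1})
correlation_matrix = corr_cols.corr()

print(correlation_matrix)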

plt.figure(figsize=(12, 8))
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm', cbar=True)
plt.title('Correlation Matrix of Machine Operational Data', fontsize=16)
plt.xticks(rotation=45)
plt.yticks(rotation=0)
plt.tight_layout()
plt.show()

The correlation matrix highlights relationships between the operational features. Here's a summary of my key findings from the data:

Key Observations:

  1. Positive Relationships (weak):

    • Hydraulic Pressure vs Torque: A weak positive correlation (0.162734) suggests that higher hydraulic pressure is slightly associated with greater torque.
    • Spindle Speed vs Cutting Force: A weak positive correlation (0.230839) implies that increased spindle speed is loosely linked with higher cutting force.
  2. Negative Relationships (weak):

    • Hydraulic Pressure vs Cutting Force: A weak negative correlation (-0.222217) indicates that higher hydraulic pressure tends to coincide with lower cutting force.
    • Torque vs Cutting Force: A weak negative correlation (-0.180184) suggests that torque and cutting force tend to move in opposite directions.
  3. Low or No Significant Correlation:

    • Many features, such as tool vibration, show weak correlations with other metrics, which might indicate low direct influence or noise in the data.
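
To read feature-to-feature relationships off a correlation matrix without scanning the heatmap, the pairs can be listed once each and sorted by absolute strength. This is a minimal sketch on hypothetical values; the column names and numbers are assumptions, not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical feature values for illustration only.
toy = pd.DataFrame({
    "Hydraulic_Pressure": [70.0, 85.0, 100.0, 115.0, 130.0],
    "Torque": [20.0, 24.0, 23.0, 29.0, 31.0],
    "Cutting_Force": [3.5, 3.1, 3.3, 2.6, 2.4],
})

corr = toy.corr()

# Keep only the upper triangle (above the diagonal) so each pair appears once.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(upper).stack().dropna()

# Strongest relationships first, regardless of sign.
ranked = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
print(ranked)
```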

2. Do you see a pattern in machine downtime over time?

# Daily share of records labelled 'Machine_Failure' (0 when a day has no failures)
downtime_by_date = (
    downtime.groupby('Date')['Downtime']
    .value_counts(normalize=True)
    .unstack()
    .fillna(0)['Machine_Failure']
    .reset_index()
)

plt.figure(figsize=(10, 6))
sns.lineplot(data=downtime_by_date, x='Date', y='Machine_Failure', color='red')
plt.title('Proportion of Machine Failure Over Time')
plt.xlabel('Date')
plt.ylabel('Proportion of Machine Failure')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

print("\nProportion of Machine Failure Over Time:")
print(downtime_by_date[['Date', 'Machine_Failure']].head())

trend_summary = downtime_by_date['Machine_Failure'].describe()
print("\nSummary of Downtime Trend Over Time:")
print(trend_summary)

3. Which factors (visually) seem to be connected to machine downtime?
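
One way to approach this question is to draw side-by-side box plots of each feature, split by downtime category. The sketch below is only an illustration on hypothetical rows. The helper `boxplots_by_downtime` and the feature column names are my own assumptions, not part of the original notebook.

```python
import matplotlib.pyplot as plt
import pandas as pd


def boxplots_by_downtime(df, features, label_col="Downtime"):
    """Draw one box plot per feature, with one box per downtime category."""
    fig, axes = plt.subplots(
        1, len(features), figsize=(5 * len(features), 4), squeeze=False
    )
    for ax, feature in zip(axes[0], features):
        groups = df.groupby(label_col)[feature]
        labels = [name for name, _ in groups]
        ax.boxplot([values.dropna().to_numpy() for _, values in groups])
        ax.set_xticks(range(1, len(labels) + 1))
        ax.set_xticklabels(labels)
        ax.set_title(feature)
    fig.tight_layout()
    return fig


# Hypothetical rows for illustration only.
toy = pd.DataFrame({
    "Downtime": ["Machine_Failure"] * 4 + ["No_Machine_Failure"] * 4,
    "Hydraulic_Pressure": [60.0, 75.0, 85.0, 95.0, 105.0, 115.0, 120.0, 135.0],
    "Coolant_Temperature": [22.0, 19.0, 21.0, 18.0, 16.0, 18.0, 15.0, 19.0],
})

fig = boxplots_by_downtime(toy, ["Hydraulic_Pressure", "Coolant_Temperature"])
```

Features whose boxes barely overlap between the two categories are the ones that visually appear connected to downtime. On the real data, the call would pass `downtime` and its actual column names instead of `toy`.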
