
Predicting Industrial Machine Downtime: Level 2

📖 Background

You work for a manufacturer of high-precision metal components used in aerospace, automotive, and medical device applications. Your company operates three different machines on its shop floor that produce different-sized components, so minimizing the downtime of these machines is vital for meeting production deadlines.

Your team wants to use a data-driven approach to predicting machine downtime, so that maintenance can be planned proactively rather than in reaction to machine failure. To support this, your company has collected operational data for over a year, along with a record of whether each machine was down at the time of each reading.

In this second level, you're going to visualize and examine the data in more detail. This level is aimed at intermediate learners. If you want to challenge yourself a bit more, check out level three!

💾 The data

The company has stored the machine operating data in a single table, available in 'data/machine_downtime.csv'.

Each row in the table represents the operational data for a single machine on a given day:
  • "Date" - the date the reading was taken on.
  • "Machine_ID" - the unique identifier of the machine being read.
  • "Assembly_Line_No" - the unique identifier of the assembly line the machine is located on.
  • "Hydraulic_Pressure(bar)", "Coolant_Pressure(bar)", and "Air_System_Pressure(bar)" - pressure measurements at different points in the machine.
  • "Coolant_Temperature", "Hydraulic_Oil_Temperature", and "Spindle_Bearing_Temperature" - temperature measurements (in Celsius) at different points in the machine.
  • "Spindle_Vibration", "Tool_Vibration", and "Spindle_Speed(RPM)" - vibration (measured in micrometers) and rotational speed measurements for the spindle and tool.
  • "Voltage(volts)" - the voltage supplied to the machine.
  • "Torque(Nm)" - the torque being generated by the machine.
  • "Cutting(KN)" - the cutting force of the tool.
  • "Downtime" - an indicator of whether the machine was down or not on the given day.

1. Introduction

In high-precision manufacturing, the reliability of production machinery is paramount. Machine downtime can lead to significant delays, increased costs, and compromised product quality. This report uses over a year of operational data from three different machines to analyze and predict downtime occurrences. By identifying patterns and key operational indicators associated with machine failures, the company can implement proactive maintenance measures to improve operational efficiency and reduce unplanned downtime.

2. Data Preparation and Cleaning

2.1. Data Loading and Initial Inspection

We begin by loading the dataset and performing an initial inspection to understand its structure and identify any immediate data quality issues.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
downtime = pd.read_csv('data/machine_downtime.csv')

# Display the first few rows to verify successful loading
print(downtime.head())

# Check for missing values to ensure data integrity
print(downtime.isnull().sum())

Initial Observations:

  • The dataset contains 16 columns representing various operational metrics.
  • There are missing values in several numerical columns, particularly in Torque(Nm) and Cutting(kN).
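
The missing-value check above can be condensed into a per-column summary. The following is a minimal sketch on a toy frame; the values and the helper name `missing_summary` are invented for illustration and are not part of the original analysis.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real table; the readings are invented for illustration.
sample = pd.DataFrame({
    "Machine_ID": ["M1", "M2", "M1", "M3"],
    "Torque(Nm)": [24.1, np.nan, 25.3, np.nan],
    "Cutting(kN)": [3.5, 2.9, np.nan, 3.1],
})

def missing_summary(df):
    """Return the count and percentage of missing values per column, worst first."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    # Keep only columns that actually have gaps.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)

print(sample.shape)             # (rows, columns)
print(missing_summary(sample))  # columns with gaps, largest gap first
```

Reporting the percentage alongside the raw count makes it easier to judge whether imputation is reasonable for a given column or whether too much of it is missing.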

2.2. Date Parsing

To facilitate time-based analyses, we convert the 'Date' column to datetime format.

# Convert 'Date' column to datetime format
downtime['Date'] = pd.to_datetime(downtime['Date'], format='%d-%m-%Y')
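
As a small illustration of what the explicit format string does, the sketch below parses a few invented day-first date strings. The `errors="coerce"` argument is an optional addition, not something the original code uses; it turns unparseable entries into `NaT` so they can be counted instead of raising an error.

```python
import pandas as pd

# Invented sample strings in day-month-year order, plus one bad entry.
raw = pd.Series(["31-12-2021", "01-02-2022", "not a date"])

# format='%d-%m-%Y' reads "01-02-2022" as 1 February 2022, not 2 January.
# errors="coerce" replaces unparseable strings with NaT instead of raising.
parsed = pd.to_datetime(raw, format="%d-%m-%Y", errors="coerce")

print(parsed)
print("Unparseable rows:", parsed.isna().sum())
```

Stating the format explicitly avoids ambiguity between day-first and month-first dates, which automatic inference can get wrong for days 1 through 12.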

2.3. Handling Missing Values

Missing values can skew analysis and model predictions. We impute missing numerical values with the mean of their respective columns.

# Define the numerical columns to impute; the raw data spells the cutting-force column 'Cutting(kN)'
numerical_cols = [
    'Hydraulic_Pressure(bar)',
    'Coolant_Pressure(bar)',
    'Air_System_Pressure(bar)',
    'Coolant_Temperature',
    'Hydraulic_Oil_Temperature',
    'Spindle_Bearing_Temperature',
    'Spindle_Vibration',
    'Tool_Vibration',
    'Spindle_Speed(RPM)',
    'Voltage(volts)',
    'Torque(Nm)',
    'Cutting(kN)'  # Name as it appears in the raw data
]

# Optional: verify column names to ensure correctness
print("\nDataFrame Columns:")
print(downtime.columns.tolist())

# Impute numerical columns with mean
downtime[numerical_cols] = downtime[numerical_cols].fillna(downtime[numerical_cols].mean())

# Verify that there are no more missing values
print("\nAfter Imputation:")
print(downtime.isnull().sum())
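
To make the effect of mean imputation concrete, here is a minimal sketch on invented toy data. It also shows a per-machine variant, which is an assumption of this example rather than the method used above: because the three machines produce different-sized components, a machine-specific mean may be a closer fill value than the overall mean.

```python
import numpy as np
import pandas as pd

# Invented readings for two hypothetical machines, each with one gap.
toy = pd.DataFrame({
    "Machine_ID": ["A", "A", "A", "B", "B", "B"],
    "Torque(Nm)": [10.0, np.nan, 14.0, 30.0, 34.0, np.nan],
})

# Global mean: every gap gets the same value, (10 + 14 + 30 + 34) / 4 = 22.
global_fill = toy["Torque(Nm)"].fillna(toy["Torque(Nm)"].mean())

# Per-machine mean: each gap gets the mean of its own machine's readings.
group_mean = toy.groupby("Machine_ID")["Torque(Nm)"].transform("mean")
group_fill = toy["Torque(Nm)"].fillna(group_mean)

print(pd.DataFrame({"global": global_fill, "per_machine": group_fill}))
```

In this toy case the global mean of 22 sits between the two machines and matches neither, while the per-machine fill stays within each machine's own range. Whether that difference matters for the real data depends on how much the machines' readings actually differ.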

2.4. Column Renaming

The raw data spells the cutting-force column 'Cutting(kN)', while the data description lists it as 'Cutting(KN)'. We rename the column so the two agree.

# Rename 'Cutting(kN)' to 'Cutting(KN)' for consistency
downtime.rename(columns={'Cutting(kN)': 'Cutting(KN)'}, inplace=True)

# Verify the renaming was successful
print("\nUpdated Columns After Renaming:")
print(downtime.columns.tolist())
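
One thing to watch after a rename is that any previously defined column list, such as numerical_cols from section 2.3, still holds the old name. A minimal sketch of keeping the two in sync, using a toy frame with invented values and a hypothetical `rename_map` dictionary:

```python
import pandas as pd

# Toy frame and column list; values are invented for illustration.
toy = pd.DataFrame({"Torque(Nm)": [24.0, 25.5], "Cutting(kN)": [3.1, 2.8]})
numerical_cols = ["Torque(Nm)", "Cutting(kN)"]

# Single source of truth for the rename.
rename_map = {"Cutting(kN)": "Cutting(KN)"}

# Apply the same mapping to the frame and to the column list.
toy = toy.rename(columns=rename_map)
numerical_cols = [rename_map.get(col, col) for col in numerical_cols]

# Every listed column should now exist in the frame.
missing = [col for col in numerical_cols if col not in toy.columns]
print("Columns missing from the frame:", missing)
```

Driving both updates from one dictionary means later selections like `toy[numerical_cols]` do not fail with a `KeyError` on the old name.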

3. Exploratory Data Analysis (EDA)

EDA is pivotal in uncovering patterns, correlations, and insights that inform predictive modeling.
