The manufacturing process for any product is like putting together a puzzle. Products are pieced together step by step, and keeping a close eye on each step is important.
For this project, we're supporting a team that wants to improve how they monitor and control a manufacturing process. The goal is to implement a more methodical approach known as statistical process control (SPC). SPC is an established strategy that uses data to determine whether a process is working well. The process is adjusted only when measurements fall outside an acceptable range.
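This acceptable range is defined by an upper control limit (UCL) and a lower control limit (LCL), the formulas for which are:
ucl = avg_height + 3 * stddev_height / sqrt(5)
lcl = avg_height - 3 * stddev_height / sqrt(5)
Here avg_height and stddev_height are the mean and standard deviation of the height over a window of 5 consecutive parts.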
The UCL defines the highest acceptable height for the parts, while the LCL defines the lowest acceptable height for the parts. Ideally, parts should fall between the two limits.
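As a rough illustration of how the two limits come out for a single window, here is a minimal Python sketch. The function name `control_limits` and the five height values are hypothetical and not taken from the dataset, and the sketch assumes the standard deviation is the sample standard deviation (n - 1 denominator).

```python
from math import sqrt
from statistics import mean, stdev


def control_limits(heights):
    """Return (lcl, ucl) for one window of height measurements."""
    n = len(heights)
    center = mean(heights)
    # Assumption: sample standard deviation (n - 1 denominator).
    margin = 3 * stdev(heights) / sqrt(n)
    return center - margin, center + margin


# Hypothetical heights for five consecutive parts (illustrative only).
window = [20.1, 19.8, 20.3, 20.0, 19.9]
lcl, ucl = control_limits(window)
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```

A part measured in this window would be flagged only if its height fell at or beyond one of the two printed limits.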
Using SQL window functions and nested queries, we'll analyze historical manufacturing data to define this acceptable range and identify any points in the process that fall outside it and therefore require adjustment. This will help ensure a smooth-running manufacturing process that consistently makes high-quality products.
The data
The data is available in the manufacturing_parts table, which has the following fields:
- item_no: the item number
- length: the length of the item made
- width: the width of the item made
- height: the height of the item made
- operator: the operating machine
Objectives
- Create an alert that flags whether the height of a product is within the control limits for each operator, using the formulas provided in the workbook.
- The final query should return an alert column as a boolean flag that indicates whether the height is within or outside the boundaries set by the upper and lower control limits.
- The average and standard deviation should be calculated with a window function of length 5 that considers rows up to and including the current row; rows with incomplete windows should be excluded from the final query output.
-- Query that creates the boolean alert column using a CASE expression
SELECT t1.*,
       CASE WHEN t1.height < t1.ucl AND t1.height > t1.lcl THEN FALSE
            ELSE TRUE
       END AS alert
-- Subquery that calculates the upper and lower control limits using the formulas described above
FROM (SELECT t0.*,
             t0.avg_height + 3 * (t0.stddev_height / SQRT(5)) AS ucl,
             t0.avg_height - 3 * (t0.stddev_height / SQRT(5)) AS lcl
      -- Subquery that selects the columns of interest from the original dataset
      FROM (SELECT operator,
                   ROW_NUMBER() OVER w AS row_number,
                   height,
                   AVG(height) OVER w AS avg_height,
                   STDDEV(height) OVER w AS stddev_height
            FROM manufacturing_parts
            -- Window used to calculate the row number, the average height, and the height's standard deviation
            -- as a running calculation over the previous four observations plus the current row, partitioned by operator
            WINDOW w AS (PARTITION BY operator ORDER BY item_no
                         ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)) AS t0) AS t1
-- Keep only the rows whose running calculation covered a full window of 5 rows
WHERE t1.row_number > 4;
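To sanity-check the query's logic outside the database, the same rolling calculation can be sketched in Python. This is only an illustrative cross-check under stated assumptions: the function name `flag_alerts`, the `(item_no, operator, height)` tuple layout, and the sample rows are all hypothetical, and the standard deviation is assumed to be the sample standard deviation.

```python
from collections import defaultdict, deque
from math import sqrt
from statistics import mean, stdev

WINDOW = 5  # mirrors ROWS BETWEEN 4 PRECEDING AND CURRENT ROW


def flag_alerts(rows):
    """rows: iterable of (item_no, operator, height) tuples.

    Returns (item_no, operator, height, lcl, ucl, alert) tuples for every
    row whose window holds WINDOW measurements from the same operator.
    """
    recent = defaultdict(lambda: deque(maxlen=WINDOW))  # PARTITION BY operator
    flagged = []
    for item_no, operator, height in sorted(rows):  # ORDER BY item_no
        window = recent[operator]
        window.append(height)
        if len(window) < WINDOW:
            continue  # incomplete window, like WHERE row_number > 4
        center = mean(window)
        # Assumption: sample standard deviation (n - 1 denominator).
        margin = 3 * stdev(window) / sqrt(WINDOW)
        lcl, ucl = center - margin, center + margin
        alert = not (lcl < height < ucl)  # same strict comparison as the CASE
        flagged.append((item_no, operator, height, lcl, ucl, alert))
    return flagged


# Hypothetical sample rows (not taken from manufacturing_parts).
sample_rows = [
    (1, "Op-1", 20.0), (2, "Op-1", 20.1), (3, "Op-1", 19.9),
    (4, "Op-1", 20.0), (5, "Op-1", 20.1), (6, "Op-1", 21.5),
    (7, "Op-2", 20.0),
]
results = flag_alerts(sample_rows)
for item_no, operator, height, lcl, ucl, alert in results:
    print(item_no, operator, height, round(lcl, 3), round(ucl, 3), alert)
```

In this sample, the first four rows for Op-1 and the single row for Op-2 are dropped because their windows are incomplete, which is the role the row_number filter plays in the query.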
Conclusions
- Out of 420 records, only 57 had an alert value of TRUE, meaning their height fell outside the control limits.
- Operator Op-5 had the highest number of items outside the height limits, with 6 cases.
- Every operator had at least one item outside the height limits.
- Operators 6, 11, 12, and 18 had the fewest items outside the height limits, with only 1 case each.