The manufacturing process for any product is like putting together a puzzle. Products are pieced together step by step, and keeping a close eye on the process is important.
For this project, you're supporting a team that wants to improve how they monitor and control a manufacturing process. The goal is to implement a more methodical approach known as statistical process control (SPC). SPC is an established strategy that uses data to determine whether the process works well. Processes are only adjusted if measurements fall outside of an acceptable range.
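This acceptable range is defined by an upper control limit (UCL) and a lower control limit (LCL), the formulas for which are:
$ucl = avg\_height + 3 \times \frac{stddev\_height}{\sqrt{5}}$
$lcl = avg\_height - 3 \times \frac{stddev\_height}{\sqrt{5}}$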
The UCL defines the highest acceptable height for the parts, while the LCL defines the lowest acceptable height for the parts. Ideally, parts should fall between the two limits.
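As a rough illustration of how the two limits are computed, the sketch below applies the rule "mean plus or minus 3 times the standard deviation divided by the square root of the sample size" to a single window of five heights. The heights and the function name `control_limits` are made up for illustration, and the sample standard deviation (n - 1 denominator) is assumed, which is what `STDDEV` returns in PostgreSQL.

```python
import math
import statistics


def control_limits(heights):
    """Return (lcl, ucl) for one window of height measurements."""
    n = len(heights)
    mean = statistics.mean(heights)
    sd = statistics.stdev(heights)  # sample standard deviation (assumption)
    margin = 3 * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical heights for five consecutive parts
window = [20.0, 21.0, 19.0, 20.5, 19.5]
lcl, ucl = control_limits(window)
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```

For this window the mean is 20.0 and the limits land roughly 1.06 either side of it, so a part measuring 21.5 would fall outside the range.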
Using SQL window functions and nested queries, you'll analyze historical manufacturing data to define this acceptable range and identify any points in the process that fall outside of the range and therefore require adjustments. This will ensure a smooth-running manufacturing process that consistently makes high-quality products.
The data
The data is available in the manufacturing_parts table, which has the following fields:
item_no: the item number
length: the length of the item made
width: the width of the item made
height: the height of the item made
operator: the operating machine
-- This SQL query is designed to monitor the height of manufacturing parts and generate an alert if the height is outside control limits.
-- The query is structured in a nested manner with multiple subqueries.
-- The outermost query selects all columns from the subquery 'b' and adds an 'alert' column.
-- The 'alert' column is a boolean that is TRUE if the height is outside the control limits (ucl and lcl), and FALSE otherwise.
SELECT
    b.*,
    CASE
        WHEN b.height NOT BETWEEN b.lcl AND b.ucl THEN TRUE
        ELSE FALSE
    END AS alert
FROM (
    -- The middle subquery 'b' calculates the upper control limit (ucl) and lower control limit (lcl) for each row.
    -- These limits are based on the average height and standard deviation of the height over a rolling window of 5 rows.
    SELECT
        a.*,
        a.avg_height + 3 * a.stddev_height / SQRT(5) AS ucl,
        a.avg_height - 3 * a.stddev_height / SQRT(5) AS lcl
    FROM (
        -- The innermost subquery 'a' calculates the average height and standard deviation of the height for each operator over a rolling window of 5 rows.
        -- It also assigns a row number to each row within the partition of each operator.
        SELECT
            operator,
            ROW_NUMBER() OVER w AS row_number,
            height,
            AVG(height) OVER w AS avg_height,
            STDDEV(height) OVER w AS stddev_height
        FROM manufacturing_parts
        -- The window specification 'w' partitions the data by operator and orders it by item_no.
        -- It defines a rolling window of 5 rows (4 preceding rows and the current row).
        WINDOW w AS (
            PARTITION BY operator
            ORDER BY item_no
            ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
        )
    ) AS a
    -- The WHERE clause keeps only rows with a row number of 5 or greater.
    -- The first 4 rows of each operator have fewer than 5 rows in their window, so their statistics would not match the SQRT(5) used in the limits.
    WHERE a.row_number >= 5
) AS b;
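To sanity-check the query's logic outside the database, the rolling window can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: the function name `flag_heights`, the tuple layout `(item_no, operator, height)`, and the sample rows are hypothetical, and the sample standard deviation is assumed to match `STDDEV`.

```python
import math
import statistics
from collections import defaultdict, deque

WINDOW = 5  # same window size as the query (4 preceding rows + current row)


def flag_heights(rows):
    """Mirror the query for rows given as (item_no, operator, height) tuples.

    Returns one dict per part that has a full window of WINDOW parts
    for its operator, including lcl, ucl and the alert flag.
    """
    recent = defaultdict(lambda: deque(maxlen=WINDOW))  # last heights per operator
    results = []
    for item_no, operator, height in sorted(rows, key=lambda r: r[0]):
        window = recent[operator]
        window.append(height)
        if len(window) < WINDOW:
            continue  # mirrors WHERE a.row_number >= 5
        mean = statistics.mean(window)
        margin = 3 * statistics.stdev(window) / math.sqrt(WINDOW)
        lcl, ucl = mean - margin, mean + margin
        results.append({
            "item_no": item_no,
            "operator": operator,
            "height": height,
            "lcl": lcl,
            "ucl": ucl,
            "alert": not (lcl <= height <= ucl),
        })
    return results


# Hypothetical data: one operator with six parts, another with too few parts
sample = [(i, "Op-1", h) for i, h in enumerate([20.0, 20.0, 20.0, 20.0, 25.0, 20.0], start=1)]
sample += [(10, "Op-2", 19.0), (11, "Op-2", 21.0)]
for row in flag_heights(sample):
    print(row["item_no"], row["operator"], row["height"], row["alert"])
```

Because the current part is included in its own window, as in the query, an unusually tall part pulls the mean and the limits towards itself. In the sample, part 5 (height 25.0) is still flagged, while part 6 is not, and "Op-2" produces no rows because it never reaches a full window.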