Manufacturing any product is like putting together a puzzle. Products are pieced together step by step, and keeping a close eye on the process is important.
For this project, you're supporting a team that wants to improve how they monitor and control a manufacturing process. The goal is to implement a more methodical approach known as statistical process control (SPC). SPC is an established strategy that uses data to determine whether the process works well. Processes are only adjusted if measurements fall outside of an acceptable range.
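This acceptable range is defined by an upper control limit (UCL) and a lower control limit (LCL), the formulas for which are:
$$ucl = avg\_height + 3 \times \frac{stddev\_height}{\sqrt{5}}$$
$$lcl = avg\_height - 3 \times \frac{stddev\_height}{\sqrt{5}}$$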
The UCL defines the highest acceptable height for the parts, while the LCL defines the lowest acceptable height for the parts. Ideally, parts should fall between the two limits.
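As a rough cross-check of the control-limit arithmetic, the short Python sketch below computes both limits for a single window of five parts. The heights are made-up illustration values, not rows from manufacturing_parts. It uses statistics.stdev because that returns the sample standard deviation, which is what STDDEV computes in PostgreSQL (assuming that is the database in use).

```python
import math
import statistics

WINDOW = 5  # number of parts per window, matching SQRT(5) in the queries

# Hypothetical heights for one window of five consecutive parts.
heights = [20.0, 21.0, 19.0, 20.0, 20.0]

avg_height = statistics.mean(heights)
stddev_height = statistics.stdev(heights)  # sample standard deviation (n - 1)

# Three standard errors on either side of the window mean.
margin = 3 * stddev_height / math.sqrt(WINDOW)
ucl = avg_height + margin
lcl = avg_height - margin

print(f"avg={avg_height:.3f} stddev={stddev_height:.3f} ucl={ucl:.3f} lcl={lcl:.3f}")
```

With these values the mean is 20 and the limits come out at roughly 19.05 and 20.95, so a part measuring 21.0 would sit just above the upper limit.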
Using SQL window functions and nested queries, you'll analyze historical manufacturing data to define this acceptable range and identify any points in the process that fall outside of the range and therefore require adjustments. This will ensure a smooth-running manufacturing process that consistently makes high-quality products.
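The rolling calculation can be pictured outside the database as well. The following Python sketch is only an approximation of what a window partitioned by operator, ordered by item_no, and framed over the current row plus the 4 preceding rows would do. The function name flag_alerts, the operator label, and the sample heights are all hypothetical, and the sketch assumes no NULL heights.

```python
import math
import statistics
from collections import defaultdict

WINDOW = 5  # current row plus the 4 preceding rows


def flag_alerts(rows):
    """rows: iterable of (operator, item_no, height) tuples.

    Returns one dict per row that has a full window of WINDOW parts.
    """
    by_operator = defaultdict(list)  # plays the role of PARTITION BY operator
    for operator, item_no, height in rows:
        by_operator[operator].append((item_no, height))

    results = []
    for operator, parts in by_operator.items():
        parts.sort()  # plays the role of ORDER BY item_no
        # Start at the first row that has 4 predecessors (row_number >= 5).
        for i in range(WINDOW - 1, len(parts)):
            window = [h for _, h in parts[i - WINDOW + 1 : i + 1]]
            avg = statistics.mean(window)
            margin = 3 * statistics.stdev(window) / math.sqrt(WINDOW)
            height = parts[i][1]
            results.append({
                "operator": operator,
                "row_number": i + 1,
                "height": height,
                "ucl": avg + margin,
                "lcl": avg - margin,
                # TRUE when the current height is outside the limits
                "alert": not (avg - margin <= height <= avg + margin),
            })
    return results


# Hypothetical data: one operator, six parts, the last one noticeably taller.
sample_heights = [20.0, 20.1, 19.9, 20.0, 20.1, 23.0]
rows = [("Op-1", n, h) for n, h in enumerate(sample_heights, start=1)]
results = flag_alerts(rows)
for r in results:
    print(r["row_number"], r["height"], round(r["lcl"], 3), round(r["ucl"], 3), r["alert"])
```

Only rows 5 and 6 are reported because earlier rows lack a full window. In this made-up data, the sixth part is flagged while the fifth is not.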
The data
The data is available in the manufacturing_parts table, which has the following fields:
item_no: the item number
length: the length of the item made
width: the width of the item made
height: the height of the item made
operator: the operating machine
MY SOLUTION
-- This query analyses manufacturing_parts data to identify control limits for product height by operator, using a rolling window of 5 rows (the current row and the 4 preceding rows)
SELECT operator, row_number, height, avg_height, stddev_height, ucl, lcl, 
       CASE WHEN height NOT BETWEEN lcl AND ucl 
	   THEN TRUE
	   ELSE FALSE
	   END AS alert
FROM (
    SELECT operator,
           height,
           avg_height, 
	       row_number,
           stddev_height,
           avg_height + (3 * (stddev_height/SQRT(5))) AS ucl,
           avg_height - (3 * (stddev_height/SQRT(5))) AS lcl
    FROM (
        SELECT operator, 
               item_no, 
               height, 
               ROW_NUMBER() OVER EVALUATION_PROCESS AS row_number,
		       AVG(height) OVER EVALUATION_PROCESS AS avg_height,
		       STDDEV(height) OVER EVALUATION_PROCESS AS stddev_height
        FROM manufacturing_parts
		WINDOW EVALUATION_PROCESS AS (
			PARTITION BY operator
			ORDER BY item_no
			ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
			)
    ) AS inner_subquery
    WHERE avg_height IS NOT NULL AND stddev_height IS NOT NULL
	AND row_number >=5
) AS outer_subquery;
DATACAMP PROGRAMMED SOLUTION
-- Flag whether the height of a product is within the control limits
SELECT
	b.*,
	CASE
		WHEN 
			b.height NOT BETWEEN b.lcl AND b.ucl
		THEN TRUE
		ELSE FALSE
	END as alert
FROM (
	SELECT
		a.*, 
		a.avg_height + 3*a.stddev_height/SQRT(5) AS ucl, 
		a.avg_height - 3*a.stddev_height/SQRT(5) AS lcl  
	FROM (
		SELECT 
			operator,
			ROW_NUMBER() OVER w AS row_number, 
			height, 
			AVG(height) OVER w AS avg_height, 
			STDDEV(height) OVER w AS stddev_height
		FROM manufacturing_parts 
		WINDOW w AS (
			PARTITION BY operator 
			ORDER BY item_no 
			ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
		)
	) AS a
	WHERE a.row_number >= 5
) AS b;
DROP TABLE IF EXISTS alerting;
CREATE TEMP TABLE alerting AS 
  -- This query analyses manufacturing_parts data to identify control limits for product height by operator, using a rolling window of 5 rows (the current row and the 4 preceding rows)
SELECT operator, row_number, height, avg_height, stddev_height, ucl, lcl, 
       CASE WHEN height NOT BETWEEN lcl AND ucl 
	   THEN TRUE
	   ELSE FALSE
	   END AS alert
FROM (
    SELECT operator,
           height,
           avg_height, 
	       row_number,
           stddev_height,
           avg_height + (3 * (stddev_height/SQRT(5))) AS ucl,
           avg_height - (3 * (stddev_height/SQRT(5))) AS lcl
    FROM (
        SELECT operator, 
               item_no, 
               height, 
               ROW_NUMBER() OVER EVALUATION_PROCESS AS row_number,
		       AVG(height) OVER EVALUATION_PROCESS AS avg_height,
		       STDDEV(height) OVER EVALUATION_PROCESS AS stddev_height
        FROM manufacturing_parts
		WINDOW EVALUATION_PROCESS AS (
			PARTITION BY operator
			ORDER BY item_no
			ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
			)
    ) AS inner_subquery
    WHERE avg_height IS NOT NULL AND stddev_height IS NOT NULL
	AND row_number >=5
) AS outer_subquery;
-- Count how many rows fall inside (FALSE) and outside (TRUE) the control limits
SELECT alert, COUNT(*) AS row_count
FROM alerting
GROUP BY alert;
-- This query analyzes manufacturing_parts data to identify control limits
-- for product height based on operator, including the current row (effective window size of 5).
SELECT operator,
       row_number,
       height,
       avg_height,
       stddev_height,
       avg_height + (3 * (stddev_height / SQRT(5))) AS ucl,
       avg_height - (3 * (stddev_height / SQRT(5))) AS lcl,
       -- alert is TRUE when the height falls outside the control limits
       CASE WHEN height < (avg_height - (3 * (stddev_height / SQRT(5))))
              OR height > (avg_height + (3 * (stddev_height / SQRT(5))))
            THEN TRUE
            ELSE FALSE
       END AS alert
FROM (
    SELECT operator,  -- Select relevant columns
           ROW_NUMBER() OVER (PARTITION BY operator ORDER BY item_no) AS row_number,  -- numbered per operator, like the aggregates below
           height,
           -- Rolling average and standard deviation over the current row and the 4 preceding rows
           AVG(height) OVER (  
               PARTITION BY operator
               ORDER BY item_no
               ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
           ) AS avg_height,
           STDDEV(height) OVER (  
               PARTITION BY operator
               ORDER BY item_no
               ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
           ) AS stddev_height
    FROM manufacturing_parts
) AS inner_subquery
WHERE avg_height IS NOT NULL AND stddev_height IS NOT NULL
AND row_number >= 5 -- Filter out incomplete windows
ORDER BY operator, row_number;