A manufacturing process for any product is like putting together a puzzle: products are pieced together step by step, and it's important to keep a close eye on every step.

For this project, you're supporting a team that wants to improve the way they're monitoring and controlling a manufacturing process. The goal is to implement a more methodical approach known as statistical process control (SPC). SPC is an established strategy that uses data to determine whether the process is working well. Processes are only adjusted if measurements fall outside of an acceptable range.

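This acceptable range is defined by an upper control limit (UCL) and a lower control limit (LCL), the formulas for which are:

$$\text{ucl} = \text{avg\_height} + 3 \times \frac{\text{stddev\_height}}{\sqrt{5}}$$

$$\text{lcl} = \text{avg\_height} - 3 \times \frac{\text{stddev\_height}}{\sqrt{5}}$$

Here avg_height and stddev_height are the average and standard deviation of height over a window of five consecutive items (the current item and the four before it), which is where the 5 under the square root comes from.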
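
As a worked example of the control-limit arithmetic, the sketch below computes both limits for a single window of five heights. It assumes the limits are the window average plus or minus three times the window standard deviation divided by the square root of the window size, which is how the query in this project computes them. The function name `control_limits` and the sample heights are made up for illustration, and the sample (n - 1) standard deviation is an assumption about what `STDDEV` returns in your SQL dialect.

```python
from math import sqrt
from statistics import mean, stdev


def control_limits(window):
    """Return (lcl, ucl) for one window of measurements.

    Assumes the sample (n - 1) standard deviation, which may differ
    from what STDDEV computes in some SQL dialects.
    """
    n = len(window)
    center = mean(window)
    margin = 3 * stdev(window) / sqrt(n)
    return center - margin, center + margin


# Hypothetical heights for five consecutive items from one operator
heights = [20.0, 21.0, 19.0, 20.5, 19.5]
lcl, ucl = control_limits(heights)
print(f"{lcl:.3f} {ucl:.3f}")  # prints "18.939 21.061"
```

A height measured in this window would be flagged only if it fell below the first value or above the second.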

Using SQL window functions, you'll analyze historical manufacturing data to define this acceptable range and identify any points in the process that fall outside it and therefore require adjustment. This will help keep the manufacturing process running smoothly and consistently producing high-quality products.

The data

The data is available in the manufacturing_parts table, which has the following fields:

  • item_no: the item number
  • length: the length of the item made
  • width: the width of the item made
  • height: the height of the item made
  • operator: the operating machine
-- Flag whether the height of a product is within the control limits
SELECT
	b.*,
	CASE
		WHEN 
			b.height NOT BETWEEN b.lcl AND b.ucl
		THEN TRUE
		ELSE FALSE
	END as alert
-- Subquery b: add the control limits to every row returned by subquery a
FROM (
	SELECT
		a.*, 
		-- Control limits: moving average +/- 3 * moving standard deviation / SQRT(window size of 5)
		a.avg_height + 3*a.stddev_height/SQRT(5) AS ucl, 
		a.avg_height - 3*a.stddev_height/SQRT(5) AS lcl  
	FROM (
		SELECT 
			operator,
			-- Row number within each operator, ordered by item_no (a ranking function needs no frame clause)
			ROW_NUMBER() OVER (PARTITION BY operator ORDER BY item_no) AS row_number, 
			height, 
			-- Moving average and moving standard deviation over the current item and the 4 preceding items
			AVG(height) OVER (PARTITION BY operator ORDER BY item_no ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS avg_height, 
			STDDEV(height) OVER (PARTITION BY operator ORDER BY item_no ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS stddev_height
		FROM manufacturing_parts 
	) AS a
	-- Keep only rows whose window holds a full 5 items
	WHERE a.row_number >= 5
) AS b
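
To check your understanding of what the query returns, here is a minimal pure-Python sketch that mirrors its logic on a tiny made-up sample. It groups rows by operator, orders them by item number, slides a five-item window, and flags the current height when it falls outside the limits. The function `flag_alerts`, the operator names, and the heights are hypothetical, and the sketch assumes `STDDEV` is the sample standard deviation.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

WINDOW = 5  # current item plus the 4 preceding items


def flag_alerts(rows):
    """rows: iterable of (operator, item_no, height) tuples."""
    by_operator = defaultdict(list)  # PARTITION BY operator
    for operator, item_no, height in rows:
        by_operator[operator].append((item_no, height))

    flagged = []
    for operator, items in by_operator.items():
        items.sort()  # ORDER BY item_no
        # Start at the 5th item, like WHERE row_number >= 5
        for i in range(WINDOW - 1, len(items)):
            window = [h for _, h in items[i - WINDOW + 1 : i + 1]]
            avg_height = mean(window)
            margin = 3 * stdev(window) / sqrt(WINDOW)
            lcl, ucl = avg_height - margin, avg_height + margin
            item_no, height = items[i]
            flagged.append({
                "operator": operator,
                "item_no": item_no,
                "height": height,
                "alert": not (lcl <= height <= ucl),  # NOT BETWEEN lcl AND ucl
            })
    return flagged


# Hypothetical sample: Op-1 produces one unusually tall item (item 5)
sample = [("Op-1", i, h) for i, h in enumerate([20.0, 20.0, 20.0, 20.0, 25.0, 20.0], start=1)]
sample += [("Op-2", i, h) for i, h in enumerate([10.0, 10.5, 9.5, 10.0, 10.0], start=1)]

results = flag_alerts(sample)
for row in results:
    print(row["operator"], row["item_no"], row["alert"])
```

In this sample only item 5 from Op-1 is flagged: its window has limits of roughly 18 and 24, and its height of 25 lies above them. The first four items of each operator produce no row at all, because their windows are incomplete.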