The manufacturing process for any product is like putting together a puzzle. Products are pieced together step by step, so it is important to keep a close eye on each stage of the process.

For this project, you're supporting a team that wants to improve how they monitor and control a manufacturing process. The goal is to implement a more methodical approach known as statistical process control (SPC). SPC is an established strategy that uses data to determine whether the process works well. Processes are only adjusted if measurements fall outside of an acceptable range.

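This acceptable range is defined by an upper control limit (UCL) and a lower control limit (LCL). Their formulas, as implemented in the queries in this project, are:

  ucl = avg_height + 3 * stddev_height / sqrt(5)
  lcl = avg_height - 3 * stddev_height / sqrt(5)

Here avg_height and stddev_height are the mean and standard deviation of height over a window of five consecutive parts.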

The UCL defines the highest acceptable height for the parts, while the LCL defines the lowest acceptable height for the parts. Ideally, parts should fall between the two limits.
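
As a worked example of the two limits, the sketch below computes them for a single window of five parts. It assumes the formula used by the SQL in this project (mean plus or minus three sample standard deviations divided by the square root of the window size). The function name `control_limits` and the height values are hypothetical and made up for illustration.

```python
import math
import statistics


def control_limits(heights):
    """Return (lcl, ucl) for one window of height measurements.

    Assumes limits of mean +/- 3 * sample stddev / sqrt(n),
    which is what the SQL in this project computes with n = 5.
    """
    n = len(heights)
    avg = statistics.mean(heights)
    # Sample standard deviation (n - 1 denominator), like PostgreSQL's STDDEV.
    sd = statistics.stdev(heights)
    margin = 3 * sd / math.sqrt(n)
    return avg - margin, avg + margin


# Hypothetical heights, not taken from manufacturing_parts.
window = [20.0, 20.5, 19.5, 20.2, 19.8]
lcl, ucl = control_limits(window)
print(f"lcl={lcl:.3f}, ucl={ucl:.3f}")
```

With these made-up values the window mean is 20.0, so the limits sit symmetrically around it. A part measured at 21.0 would fall above the UCL.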

Using SQL window functions and nested queries, you'll analyze historical manufacturing data to define this acceptable range and identify any points in the process that fall outside it and therefore require adjustments. This helps keep the manufacturing process running smoothly and consistently producing high-quality products.

The data

The data is available in the manufacturing_parts table, which has the following fields:

  • item_no: the item number
  • length: the length of the item made
  • width: the width of the item made
  • height: the height of the item made
  • operator: the operating machine

-- Cell output saved as DataFrame: alerts
-- Write your query here
SELECT
	ucl_lcl.*,
	CASE 
		WHEN 
			ucl_lcl.height BETWEEN ucl_lcl.lcl AND ucl_lcl.ucl
		THEN FALSE
		ELSE TRUE
	END AS alert
	
FROM ( -- FROM STEP 2

	SELECT
		op.*,
		op.avg_height + 3*op.stddev_height/SQRT(5) AS ucl,
		op.avg_height - 3*op.stddev_height/SQRT(5) AS lcl

	FROM ( -- FROM STEP 1
		SELECT 
			man.operator, 
			ROW_NUMBER() OVER w AS row_number, -- explicit alias instead of the default column name
			man.height,
			AVG(height) OVER w AS avg_height,
			STDDEV(height) OVER w AS stddev_height

		FROM public.manufacturing_parts AS man

		WINDOW w AS (
			PARTITION BY man.operator
			ORDER BY man.item_no
			ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
		)
	) AS op
	
	WHERE op.row_number >= 5
	
) AS ucl_lcl;

-- Cell output saved as DataFrame: df
-- STEP 1. Calculating the moving average and moving standard deviation
-- 1. Creating a window


SELECT 
	man.operator, 
	ROW_NUMBER() OVER w AS row_number, -- explicit alias instead of the default column name
	man.height,
	AVG(height) OVER w AS avg_height,
	STDDEV(height) OVER w AS stddev_height
	
FROM public.manufacturing_parts AS man

WINDOW w AS (
	PARTITION BY man.operator
	ORDER BY man.item_no
	ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
);
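
To make the window definition concrete, here is a minimal Python sketch of what the Step 1 query computes: for each operator, rows are ordered by item number and each row sees itself plus up to four preceding rows. The function name `rolling_stats` and the sample rows are hypothetical. The `None` for a single-row frame mirrors PostgreSQL's STDDEV, which returns NULL when it has only one input value.

```python
import statistics
from collections import defaultdict, deque


def rolling_stats(parts, window=5):
    """Yield (operator, row_number, height, avg_height, stddev_height).

    parts: iterable of (item_no, operator, height) tuples.
    Emulates PARTITION BY operator ORDER BY item_no
    ROWS BETWEEN 4 PRECEDING AND CURRENT ROW.
    """
    # Partition by operator, keeping item_no order inside each partition.
    by_operator = defaultdict(list)
    for item_no, operator, height in sorted(parts):
        by_operator[operator].append(height)

    for operator, heights in by_operator.items():
        frame = deque(maxlen=window)  # current row plus up to 4 preceding rows
        for row_number, height in enumerate(heights, start=1):
            frame.append(height)
            values = list(frame)
            avg = statistics.mean(values)
            # Sample stddev is undefined for a single value (NULL in SQL).
            sd = statistics.stdev(values) if len(values) > 1 else None
            yield operator, row_number, height, avg, sd


# Hypothetical rows, not taken from manufacturing_parts.
parts = [
    (1, "Op-1", 20.0), (2, "Op-1", 20.5), (3, "Op-2", 19.0),
    (4, "Op-1", 19.5), (5, "Op-1", 20.2), (6, "Op-1", 19.8),
    (7, "Op-1", 21.0),
]
for row in rolling_stats(parts):
    print(row)
```

Rows numbered 1 to 4 within an operator are computed over fewer than five parts, which is why the later queries keep only rows where row_number is at least 5.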

-- Cell output saved as DataFrame: df1
-- STEP 2. Calculating upper and lower control limits

-- 1. Create subquery
-- 2. Do Math


SELECT
	op.*,
	op.avg_height + 3*op.stddev_height/SQRT(5) AS ucl,
	op.avg_height - 3*op.stddev_height/SQRT(5) AS lcl
	
FROM ( -- FROM STEP 1
	SELECT 
		man.operator, 
		ROW_NUMBER() OVER w AS row_number, -- explicit alias instead of the default column name
		man.height,
		AVG(height) OVER w AS avg_height,
		STDDEV(height) OVER w AS stddev_height

	FROM public.manufacturing_parts AS man

	WINDOW w AS (
		PARTITION BY man.operator
		ORDER BY man.item_no
		ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
	)
) AS op;

-- Cell output saved as DataFrame: df2
-- STEP 3. Creating an alert to evaluate the manufacturing process

-- 1. nested queries
-- 2. Creating a boolean field
-- 3. filtering with where
-- 4. Saving the result

SELECT
	ucl_lcl.*,
	CASE 
		WHEN 
			ucl_lcl.height BETWEEN ucl_lcl.lcl AND ucl_lcl.ucl
		THEN FALSE -- within the control limits: no alert
		ELSE TRUE  -- outside the control limits: raise an alert
	END AS alert
	
FROM ( -- FROM STEP 2

	SELECT
		op.*,
		op.avg_height + 3*op.stddev_height/SQRT(5) AS ucl,
		op.avg_height - 3*op.stddev_height/SQRT(5) AS lcl

	FROM ( -- FROM STEP 1
		SELECT 
			man.operator, 
			ROW_NUMBER() OVER w AS row_number, -- explicit alias instead of the default column name
			man.height,
			AVG(height) OVER w AS avg_height,
			STDDEV(height) OVER w AS stddev_height

		FROM public.manufacturing_parts AS man

		WINDOW w AS (
			PARTITION BY man.operator
			ORDER BY man.item_no
			ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
		)
	) AS op
	
	WHERE op.row_number >= 5 -- keep only rows with a full five-part window
	
) AS ucl_lcl;
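
Since the goal stated in the introduction is to find parts that fall outside the acceptable range, the flag logic can be sketched in Python as follows. This is an illustration under assumptions, not the project's reference solution: the function name `add_alerts`, the dictionary keys, and the sample rows are hypothetical. The bounds are treated as inclusive, as with SQL BETWEEN.

```python
def add_alerts(rows, window=5):
    """Return rows that have a full window, each with an 'alert' flag.

    rows: iterable of dicts with keys row_number, height, lcl, ucl.
    alert is True when height falls outside [lcl, ucl].
    """
    flagged = []
    for row in rows:
        if row["row_number"] < window:
            continue  # same filter as WHERE row_number >= 5
        in_control = row["lcl"] <= row["height"] <= row["ucl"]
        flagged.append({**row, "alert": not in_control})
    return flagged


# Hypothetical rows, not taken from manufacturing_parts.
rows = [
    {"row_number": 4, "height": 25.0, "lcl": 19.5, "ucl": 20.5},  # dropped: window too short
    {"row_number": 5, "height": 20.1, "lcl": 19.5, "ucl": 20.5},  # within the limits
    {"row_number": 6, "height": 21.0, "lcl": 19.5, "ucl": 20.5},  # above the UCL
    {"row_number": 7, "height": 19.5, "lcl": 19.5, "ucl": 20.5},  # exactly on the LCL
]
for row in add_alerts(rows):
    print(row["row_number"], row["alert"])
```

Only the row above the UCL is flagged. The row sitting exactly on the LCL is not, because the comparison is inclusive.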