Skip to content

Manufacturing processes for any product is like putting together a puzzle. Products are pieced together step by step and it's important to keep a close eye on the process.

For this project, you're supporting a team that wants to improve the way they're monitoring and controlling a manufacturing process. The goal is to implement a more methodical approach known as statistical process control (SPC). SPC is an established strategy that uses data to determine whether the process is working well. Processes are only adjusted if measurements fall outside of an acceptable range.

This acceptable range is defined by an upper control limit (UCL) and a lower control limit (LCL), the formulas for which are:

Using SQL window functions, you'll analyze historical manufacturing data to define this acceptable range and identify any points in the process that fall outside of the range and therefore require adjustments. This will ensure a smooth running manufacturing process consistently making high-quality products.

The data

The data is available in the manufacturing_parts table which has the following fields:

  • item_no: the item number
  • length: the length of the item made
  • width: the width of the item made
  • height: the height of the item made
  • operator: the operating machine
Spinner
DataFrameas
df
variable
SELECT *
FROM manufacturing_parts
Spinner
DataFrameas
df1
variable
--Calculate moving average and moving standard deviation
SELECT
	operator,
	ROW_NUMBER() OVER w AS row_number,
	height,
	AVG(height) OVER w AS avg_height,
	STDDEV(height) OVER w AS stddev_height
FROM manufacturing_parts
WINDOW w AS (
	PARTITION BY operator
	ORDER BY item_no
	ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
)
Spinner
DataFrameas
df2
variable
--Create a Common Table Expression (CTE) and filter the table
WITH m AS (
    SELECT
        operator,
        ROW_NUMBER() OVER w AS row_number,
        height,
        AVG(height) OVER w AS avg_height,
        STDDEV(height) OVER w AS stddev_height
    FROM manufacturing_parts
    WINDOW w AS (
        PARTITION BY operator
        ORDER BY item_no
        ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
    )
)

SELECT *
FROM m
WHERE row_number > 4
Spinner
DataFrameas
df3
variable
--Calculate upper and lower control limits
WITH m AS (
	SELECT
		operator,
		ROW_NUMBER() OVER w AS row_number,
		height,
		AVG(height) OVER w AS avg_height,
		STDDEV(height) OVER w AS stddev_height
	FROM manufacturing_parts
	WINDOW w AS (
		PARTITION BY operator
		ORDER BY item_no
		ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
	)
)

SELECT
	*,
	avg_height + 3 * stddev_height / SQRT(5) AS ucl,
	avg_height - 3 * stddev_height / SQRT(5) AS lcl
FROM m
WHERE row_number > 4
Spinner
DataFrameas
alerts
variable
--Creating an alert to evaluate the manufacturing process
WITH m AS (
	SELECT
		operator,
		ROW_NUMBER() OVER w AS row_number,
		height,
		AVG(height) OVER w AS avg_height,
		STDDEV(height) OVER w AS stddev_height
	FROM manufacturing_parts
	WINDOW w AS (
		PARTITION BY operator
		ORDER BY item_no
		ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
	)
)

SELECT
	a.*,
	CASE WHEN a.height NOT BETWEEN a.lcl AND a.ucl
		 THEN TRUE
		 ELSE FALSE
	END AS alert
FROM (
	SELECT
		*,
		avg_height + 3 * stddev_height / SQRT(5) AS ucl,
		avg_height - 3 * stddev_height / SQRT(5) AS lcl
	FROM m
	WHERE row_number > 4
) AS a